Is operating on [{1}, {2}, ...] meaningfully less performant than [{:a, :b, ...}, ...]

overcomeoj · January 17, 2023, 12:42pm

Knowing that Elixir lists are linked lists, and that updating them has some performance cost for traversal, etc, does the data in those lists impact performance?

Eg, given a “simple” list of [{1}, {2}, ...] where the elements are “light” in byte count and [%Struct{b: "abcd"}, %Struct{}, ...] where the elements are “heavy”, is the performance between these lists when traversing, reversing, inserting, etc different enough to matter (over n elements where n is “big”)?

Does elixir copy the values for everything (so more bytes == more work), or is it all pointer shuffling and its just the size of the list that matters?

g-andrade · January 17, 2023, 6:32pm

Pretty much this. Terms within the VM are represented as tagged memory words (say, 64-bit unsigned integers), with some of the bits assigned to a tag and the remaining assigned to content. The tag contains type-related information. There are two kinds of terms:

“immediate” terms where both the tag and value can fit within the memory word: small integers, atoms, that kind of thing
“indirect” terms, where the content of the memory word is a pointer to where the actual value resides: larger integers, floats, binaries, composite terms other than empty lists

Composite terms like lists, maps and tuples are made up of other terms: the terms they contain are handled according to their nature as tagged memory words, which are all of the same size - therefore, you’re shuffling either small values or pointers, and thus it doesn’t make a difference.

(I may have simplified a bit / butchered some of the terminology)

g-andrade · January 17, 2023, 6:41pm

I should mention, however, that copying terms to other processes, ETS tables, etc usually implies deep copying (with rare exceptions), and in that case it will absolutely make a difference.