How does fetching a keyword from a keyword list work internally?

nsuchy · February 20, 2018, 2:21pm

So I am curious in how Elixir fetches an item from a keyword list internally.

Let’s say we have the following list: [one: 'blue', two: 'red', three: 'yellow'] it will be represented internally as [{:one, 'blue'}, {:two, 'red'}, ...] a list with tuples that said how does the process to fetch these values work, what works well, what doesn’t? Could someone point me to the function that does this and describe the process in more detail?

I looked at https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/keyword.ex#L360 and am still not sure about this, the functionality isn’t a part of erlang itself (ref: https://github.com/erlang/otp/blob/master/lib/stdlib/src/lists.erl#L49).

Thoughts?

idi527 · February 20, 2018, 3:18pm

Let’s say we have the following list: [one: ‘blue’, two: ‘red’, three: ‘yellow’] it will be represented internally as [{:one, “blue”}, {:two, " red"}, …]

No, it won’t … It will be more like [{:one, 'blue'}, {:two, 'red'}, ...].

"blue" != 'blue'

chrismcg · February 20, 2018, 3:20pm

Looks like it’s implemented as a bif here https://github.com/erlang/otp/blob/master/erts/emulator/beam/erl_bif_lists.c#L388

nsuchy · February 20, 2018, 3:55pm

I’ve updated the quotes for you. That being said could you answer the question how the process Elixir takes to search through a keyword list?

idi527 · February 20, 2018, 3:56pm

I think @chrismcg has already answered that.

madlep · February 21, 2018, 7:23am

It’s a linear search

The logic is to iterate through the keyword list, checking if the key of each tuple matches the specified key, and returning the value if it does.

The bigger the list, the longer it takes. But that’s fine for config options and function args etc where they’re usually used.

For larger data structures, maps are more appropriate as they have better access by key (log I think… need to double check that)

cmkarlsson · February 21, 2018, 8:26am

As @chrismcg pointed out elixir’s keyword uses erlang’s list keyfind which in turn is a built-in C function.

The classic way to do this (and what erlang’s proplists module do) is to do a recursive list function.

The implementation would be something like this:

def get_value(_key, []), do: nil % Or something
def get_value(key, [{key, value} | _rest]), do: value
def get_value(key, [_ | rest]), do: get_value(rest)

As others have pointed out this is not suitable for large lists but keyword lists are usually smallish.

I use the above construct a lot for a number of list based “data structures”. I don’t know if the choice of lists:keyfind was intentional over the naive recursive implementation done in erlang’s proplists but if I have to speculate (which I feel forced to) I would say it is

michalmuskala · February 21, 2018, 8:50am

The use of :lists.keyfind is an optimisation. The function is significantly faster, since it is implemented as a BIF in C. But yes, if we were to implement it in Elixir, it would look very similar to what @cmkarlsson presented.