What is the benefit of allowing repeated keys in keyword lists?

I can see it allowing one to automatically generate a smaller list out of the keyword value pairs that had themselves first been stored in the list automatically, but is there any reason to do this manually? Are they cheaper than maps?

Yes, they are cheaper, as adding a new value in a KW list is just a simple cons without any copying, while adding a key to a map might be a lot more involved and in worst case requires the map to be copied as a whole.

2 Likes

It depends on how you access values. Since a keyword list is a linked list, accessing an entry has linear complexity, while accessing a value in a map would have logarithmic complexity (bounded by a small constant for practical maps). Inserting one entry (without removing duplicates) would instead have constant complexity in a keyword list, and logarithmic complexity in a map.

In case of small collections though, this does not matter at all. Unless one is dealing with large collections and a tight loop where one repeatedly accesses values and needs to squeeze out every nanosecond, it’s usually better to use the collection that is semantically right, rather than the one that is faster.

The fact that keyword lists allow for duplicate values is primarily a semantic difference:

  • Keyword lists have a defined order, can have duplicate keys, only allow atoms as keys. Their literal syntax is slightly leaner, but pattern matching for a specific entry is tricky. Some use cases naturally yield to them: imagine parsing command line options or query string parameters in a URL (where entries can be repeated, and the order might matter).

  • Maps don’t have a defined order, cannot have duplicate keys, and allow any term as keys. They are the canonical generic key/value collection, and are sometimes easier to pattern match on, but slightly more verbose in their literal syntax.

In other words, keyword lists and maps only partially overlap, and usually one of them is semantically more appropriate than the other for each use case.

3 Likes

Thanks to Noobz and lucaong for the clarification.

1 Like

Keyword lists are mainly useful if you want to use them as options where the last supplied duplicate option wins. Imagine merging defaults with user-supplied options: f.ex. [:postgresql_port: 5432] is the default but if you supply another port via any configuration mechanism, then doing Keyword.merge(defaults, user_parsed_config) will have your custom option “win” and be used instead of the default.

1 Like

And how does one access the duplicated key’s value? Accessing by the keyword atom returns only the first value it seems.

Keyword - get_values/2 will return all matching values.

3 Likes

Ecto’s keyword query DSL makes good use of it allowing to stack multiple :wheres either in the same query or when composing them.

4 Likes

This is the main virtue I see; Keyword is a conveniently composable way to pass configuration around without needing to check for conflicts or overrides along the way; they can be accreted and the ultimate consumer can decide how to handle them.

String.myers_difference also makes good use of repeated keys.

From the docs:

string1 = "fox hops over the dog"
iex> string2 = "fox jumps over the lazy cat"
iex> String.myers_difference(string1, string2)
[eq: "fox ", del: "ho", ins: "jum", eq: "ps over the ", del: "dog", ins: "lazy cat"]
3 Likes