First, I meant efficiently (sorry for the misunderstanding), although I was wrong, kind of - more on that in a bit.
Second, thank you I’ve learned something really cool. It’s probably featured prominently in the guards documentation. When a guard raises, the next clause is evaluated, really cool! (for people, who, like me were unaware that means that even if :
test = [{:a, :b}, {:a,:b,:c}]
elem(hd(test), 2)
** (ArgumentError) errors were found at the given arguments:
* 1st argument: out of range
:erlang.element(3, {:a, :b})
Jose’s function works :
test = [{:a, :b}, {:a,:b,:c}]
Experiments.keyfind(test, :c, 2)
>>> {:a, :b, :c}
Third, regarding performances, again on OTP 24, with jit
Here’s my amended module :
defmodule Experiments do
def get_elem(str, [{str, _} = row | _tl]) , do: row
def get_elem(_str, []), do: :error_not_found
def get_elem(str, [_ | tl]), do: get_elem(str, tl)
def get_elem_pos_10(str, [{_, _, _, _, _, _, _, _, _, str} = row | _tl]) , do: row
def get_elem_pos_10(_str, []), do: :error_not_found
def get_elem_pos_10(str, [_ | tl]), do: get_elem(str, tl)
def keyfind([head | _tail], val, pos) when elem(head, pos) === val, do: head
def keyfind([_ | tail], val, pos), do: keyfind(tail, val, pos)
def keyfind([], _val, _pos), do: nil
end
Here’s my first benchmark :
list = Enum.to_list(1..10_000_000)
tuples = Enum.map(list, fn el -> {"#{el}", "#{el}" } end)
IO.inspect("######### first_elem")
Benchee.run(%{
"get elem" => fn -> Experiments.get_elem("9999999", tuples) end,
"elixir keyfind" => fn -> Experiments.keyfind(tuples, "9999999", 0) end,
"keyfind" => fn -> List.keyfind(tuples, "9999999", 0) end,
}
)
Results :
"######### first_elem"
Operating System: macOS
CPU Information: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.12.3
Erlang 24.1.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 21 s
Benchmarking elixir keyfind...
Benchmarking get elem...
Benchmarking keyfind...
Name ips average deviation median 99th %
get elem 7.46 133.99 ms ±2.23% 133.04 ms 143.36 ms
elixir keyfind 6.41 156.09 ms ±5.26% 153.76 ms 196.42 ms
keyfind 5.73 174.38 ms ±1.24% 173.72 ms 183.61 ms
Comparison:
get elem 7.46
elixir keyfind 6.41 - 1.16x slower +22.10 ms
keyfind 5.73 - 1.30x slower +40.39 ms
Trying to find at the 10th position or around, by matching directly is clearly a loser, however the results are still surprising with your implementation:
list = Enum.to_list(1..10_000_000)
tuples10 = Enum.map(list, fn el -> {"#{el}", "#{el}","#{el}", "#{el}","#{el}",
"#{el}","#{el}", "#{el}", "#{el}", "#{el}" } end)
IO.inspect("########### eigth_elem")
Benchee.run(%{
"elixir keyfind 8" => fn -> Experiments.keyfind(tuples10, "9999999", 7) end,
"keyfind 8" => fn -> List.keyfind(tuples10, "9999999", 7) end,
},
time: 20
)
IO.inspect("########### tenth_elem")
Benchee.run(%{
#"get elem 10" => fn -> Experiments.get_elem_pos_10("9999999", tuples10) end,
"elixir keyfind 10" => fn -> Experiments.keyfind(tuples10, "9999999", 9) end,
"keyfind 10" => fn -> List.keyfind(tuples10, "9999999", 9) end,
},
time: 20
)
Results:
"########### eigth_elem"
Operating System: macOS
CPU Information: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.12.3
Erlang 24.1.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 20 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 44 s
Benchmarking elixir keyfind 8...
Benchmarking keyfind 8...
Name ips average deviation median 99th %
elixir keyfind 8 4.16 240.32 ms ±4.66% 237.33 ms 323.00 ms
keyfind 8 4.02 248.68 ms ±3.69% 246.13 ms 304.43 ms
Comparison:
elixir keyfind 8 4.16
keyfind 8 4.02 - 1.03x slower +8.36 ms
"########### tenth_elem"
Operating System: macOS
CPU Information: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.12.3
Erlang 24.1.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 20 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 44 s
Benchmarking elixir keyfind 10...
Benchmarking keyfind 10...
Name ips average deviation median 99th %
keyfind 10 3.46 288.84 ms ±5.17% 284.73 ms 356.50 ms
elixir keyfind 10 3.42 292.25 ms ±4.59% 289.32 ms 349.21 ms
Comparison:
keyfind 10 3.46
elixir keyfind 10 3.42 - 1.01x slower +3.41 ms
However, looking for the 20th element seems breaks the elixir implementation :
list = Enum.to_list(1..10_000_000)
tuples20 = Enum.map(list, fn el -> {"#{el}", "#{el}","#{el}", "#{el}","#{el}",
"#{el}","#{el}", "#{el}", "#{el}", "#{el}", "#{el}",
"#{el}","#{el}", "#{el}","#{el}", "#{el}","#{el}", "#{el}", "#{el}", "#{el}" } end)
IO.inspect("########### twentieth_elem")
Benchee.run(%{
"elixir keyfind 20" => fn -> Experiments.keyfind(tuples20, "9999999", 19) end,
"keyfind 20" => fn -> List.keyfind(tuples20, "9999999", 19) end,
}
)
Results:
"########### twentieth_elem"
Operating System: macOS
CPU Information: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.12.3
Erlang 24.1.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s
Benchmarking elixir keyfind 20...
Benchmarking keyfind 20...
Name ips average deviation median 99th %
keyfind 20 0.33 2.99 s ±0.00% 2.99 s 2.99 s
elixir keyfind 20 0.25 3.94 s ±0.00% 3.94 s 3.94 s
Comparison:
keyfind 20 0.33
elixir keyfind 20 0.25 - 1.32x slower +0.95 s
Now, I have more questions than answers:
- Would the results be the same on OTP 23 (sadly I can’t test this right now)
- Would they hold for any key type (sadly I can’t test this right now - but glancing at the bif implementation, it would seem all keys are not created equal)
- If they were, wouldn’t it be advantageous to switch to the pure elixir implementation, unless, maybe, many users use crazy long tuples, and reach for the n-th key where n >= 10?
Anyways, thanks for the interesting and stimulating answer.