Hey all, I am comparing two ways of merging two lists. The best option would probably be list1 ++ list2, but let's set that aside for this question. The two cases are:
# for merging 2 lists -> list1 & list2

# case 1: use Enum.reduce
Enum.reduce(list1, list2, &[&1 | &2])

# case 2: merge(list1, list2)
def merge([], list2) do
  list2
end

def merge(list1, list2) do
  [head | remaining_list1] = list1
  list2 = [head | list2]
  merge(remaining_list1, list2)
end
Is there any difference between these two methods?
For merging 4 lists of 200 elements each, Benchee reports these stats:
Benchmarking Enum.reduce ...
Benchmarking merge ...
Name ips average deviation median 99th %
merge 30.72 K 32.55 μs ±16.74% 32 μs 37 μs
Enum.reduce 14.41 K 69.39 μs ±61.47% 68 μs 76 μs
Comparison:
merge 30.72 K
Enum.reduce 14.41 K - 2.13x slower +36.84 μs
can someone please explain what is the difference in implementation of these 2 cases?
Enum.reduce(list1, list2, &[&1, &2])
does not do what you want it to do.
iex> l1 = [1, 2, 3]
iex> l2 = [4, 5, 6]
iex> Enum.reduce(l1, l2, &[&1, &2])
[3, [2, [1, [4, 5, 6]]]]
I think what you’re looking for is Enum.reduce(list1, list2, &[&1 | &2])
. Not sure how that will benchmark against your merge
function, my guess is they’ll be pretty similar
Yeah, sorry, that was a typo; I'll correct it.
Hey @msimonborg, I have corrected the code and updated the post, but there still seems to be a difference in the stats.
Thanks!
I benchmarked these as well. I got a slower run with reduce
the first time around; on subsequent passes they are pretty much the same.
First run:
Name ips average deviation median 99th %
merge 794.05 K 1.26 μs ±4467.93% 0.85 μs 1.47 μs
reduce 406.70 K 2.46 μs ±7602.35% 1.06 μs 3.25 μs
Comparison:
merge 794.05 K
reduce 406.70 K - 1.95x slower +1.20 μs
Second run:
Name ips average deviation median 99th %
reduce 712.89 K 1.40 μs ±3434.51% 0.89 μs 2.69 μs
merge 710.97 K 1.41 μs ±2934.43% 0.97 μs 1.70 μs
Comparison:
reduce 712.89 K
merge 710.97 K - 1.00x slower +0.00378 μs
Try configuring your benchmarks with variations in the :time
and :warmup
options, and run the benchmarks multiple times back to back:
reduce = fn -> Enum.reduce(l1, l2, &[&1 | &2]) end
merge = fn -> Merge.merge(l1, l2) end
Benchee.run(
%{
"reduce" => reduce,
"merge" => merge
},
warmup: 5,
time: 10
)
Try running it for merging 4 lists:
list1 = Enum.to_list(1..300)
list2 = Enum.to_list(301..600)
list3 = Enum.to_list(601..900)
list4 = Enum.to_list(901..1200)
reduce = fn -> list1 |> Enum.reduce(list2, &[&1 | &2]) |> Enum.reduce(list3, &[&1 | &2]) |> Enum.reduce(list4, &[&1 | &2]) end
merge = fn -> Draft.append(list1, list2) |> Draft.append(list3) |> Draft.append(list4) end
Benchee.run(
%{
"reduce" => reduce,
"merge" => merge
},
warmup: 5,
time: 10
)
The output:
❯ mix run benchmark.exs
Name ips average deviation median 99th %
merge 30.64 K 32.64 μs ±18.56% 32 μs 38 μs
Enum.reduce 14.50 K 68.95 μs ±6.11% 68 μs 75 μs
Comparison:
merge 30.64 K
Enum.reduce 14.50 K - 2.11x slower +36.31 μs
❯ mix run benchmark.exs
Name ips average deviation median 99th %
merge 30.84 K 32.43 μs ±31.50% 32 μs 37 μs
reduce 14.52 K 68.88 μs ±8.55% 68 μs 75 μs
Comparison:
merge 30.84 K
reduce 14.52 K - 2.12x slower +36.45 μs
❯ mix run benchmark.exs
Name ips average deviation median 99th %
merge 30.59 K 32.69 μs ±31.70% 32 μs 42 μs
reduce 14.48 K 69.07 μs ±8.80% 69 μs 74 μs
Comparison:
merge 30.59 K
reduce 14.48 K - 2.11x slower +36.38 μs
❯ mix run benchmark.exs
Name ips average deviation median 99th %
merge 30.70 K 32.58 μs ±30.74% 32 μs 39 μs
reduce 14.23 K 70.26 μs ±18.30% 68 μs 97 μs
Comparison:
merge 30.70 K
reduce 14.23 K - 2.16x slower +37.68 μs
nightfury17200:
# case 2: merge(list1, list2)
def merge([], list2) do
  list2
end

def merge(list1, list2) do
  [head | remaining_list1] = list1
  list2 = [head | list2]
  merge(remaining_list1, list2)
end
If you want optimization, this is what I would do:
def merge([head | remaining_list1], list2) do
  merge(remaining_list1, [head | list2])
end

def merge([], list2) do
  list2
end
Avoid the unnecessary variable assignments, make use of function-head pattern matching, and put your base case last, since it will only be matched once.
On the other hand, they benchmark the same because Enum.reduce/3
uses :lists.foldl/3,
which is implemented in a similar way: otp/lib/stdlib/src/lists.erl at c578f61c17d836d7b4943767f6a08400c33cb3cf · erlang/otp · GitHub
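To illustrate the delegation (my own sketch, not from the thread): for proper lists, Enum.reduce/3 hands off to :lists.foldl/3, so the two calls below build the same result via the same cons-based fold.

```elixir
# For lists, Enum.reduce/3 is a thin wrapper around :lists.foldl/3
# (note the different argument order: :lists.foldl takes fun, acc, list).
l1 = [1, 2, 3]
l2 = [4, 5, 6]

via_enum = Enum.reduce(l1, l2, &[&1 | &2])
via_foldl = :lists.foldl(fn head, acc -> [head | acc] end, l2, l1)

via_enum == via_foldl
# true
```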
I think your best option here is to use List.foldl/3
to avoid any indirection (actually, I wonder why List.foldl/3
is not inlined by the compiler).
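For reference, a minimal sketch (mine) of the List.foldl/3 call being suggested; it takes its arguments in the Elixir convention (list, accumulator, function), unlike :lists.foldl/3.

```elixir
# List.foldl/3 with the same cons reducer produces the same
# reversed-prepend result as the Enum.reduce version.
merged = List.foldl([1, 2, 3], [4, 5, 6], &[&1 | &2])
# [3, 2, 1, 4, 5, 6]
```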
Nitpick: I don't know what the compiler optimization passes do with them, but Enum.reduce
on a list should naively be very slightly faster, because its guard clause has one term instead of two.
@spec foldl([elem], acc, (elem, acc -> acc)) :: acc when elem: var, acc: var
def foldl(list, acc, fun) when is_list(list) and is_function(fun) do
  :lists.foldl(fun, acc, list)
end

def reduce(enumerable, acc, fun) when is_list(enumerable) do
  :lists.foldl(fun, acc, enumerable)
end
You are right, I didn't check the code. So I guess the most optimized way is to call :lists.foldl/3
directly.
Note that none of these implementations maintains the order of the list, nor are they merging the lists recursively.
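If order matters, one way to preserve it (my own sketch, not from the thread) is to reverse the first list before folding; of course, this is essentially what ++ already does, which the original post set aside.

```elixir
# Reversing list1 first means the cons-based fold re-reverses it,
# so list1's elements end up in their original order before list2.
order_preserving = fn list1, list2 ->
  list1 |> Enum.reverse() |> Enum.reduce(list2, &[&1 | &2])
end

order_preserving.([1, 2, 3], [4, 5, 6])
# [1, 2, 3, 4, 5, 6]
```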