How to check if any item of a list is in another list?
for example:
[:a, :b, :c] in [:d, :e, :f] # false
[:a, :b, :c] in [:d, :e, :c] # true
How to check if any item of a list is in another list?
for example:
[:a, :b, :c] in [:d, :e, :f] # false
[:a, :b, :c] in [:d, :e, :c] # true
Use MapSet and then check for intersection being non empty?
iex> MapSet.intersection(MapSet.new([:a, :b, :c]), MapSet.new([:d, :e, :f]))
#MapSet<[]>
iex> MapSet.intersection(MapSet.new([:a, :b, :c]), MapSet.new([:d, :e, :c]))
#MapSet<[:c]>
Oh and I totally forgot about MapSet.disjoint?
Edit: iPhone keyboard hates `
oh, good one! TIL
You can also brute force it (which may en up being faster):
Enum.any?([:a, :b, :c], &Enum.member?([:c, :d, :e], &1))
it got me curious, not a real benchmark, but:
$ mix profile.eprof -e "MapSet.intersection(MapSet.new(0..1000), MapSet.new(0..1000))"
Warmup...
Profile results of #PID<0.142.0>
# CALLS % TIME µS/CALL
Total 7032 100.0 1603 0.23
Map.take/2 1 0.00 0 0.00
Map.new/1 2 0.00 0 0.00
anonymous fn/0 in :elixir_compiler_1.__FILE__/1 1 0.00 0 0.00
Enum.reverse/1 2 0.00 0 0.00
:lists.reverse/1 2 0.06 1 0.50
Enum.to_list/1 2 0.06 1 0.50
MapSet.intersection/2 1 0.06 1 1.00
:erlang.apply/2 1 0.12 2 2.00
MapSet.new/1 2 0.12 2 1.00
:lists.reverse/2 2 1.06 17 8.50
:maps.keys/1 1 3.12 50 50.00
Map.take/3 1002 7.86 126 0.13
anonymous fn/2 in Enum.reverse/1 2002 17.47 280 0.14
MapSet.new_from_list/2 2004 18.96 304 0.15
:maps.from_list/1 3 23.77 381 127.00
Enum.reduce_range/5 2004 27.32 438 0.22
Profile done over 16 matching functions
$ mix profile.eprof -e "MapSet.intersection(MapSet.new(0..1000), MapSet.new(1001..1999))"
Warmup...
Profile results of #PID<0.142.0>
# CALLS % TIME µS/CALL
Total 7024 100.0 1540 0.22
Map.take/2 1 0.00 0 0.00
Map.new/1 2 0.00 0 0.00
:erlang.apply/2 1 0.06 1 1.00
anonymous fn/0 in :elixir_compiler_1.__FILE__/1 1 0.06 1 1.00
:lists.reverse/1 2 0.06 1 0.50
Enum.to_list/1 2 0.06 1 0.50
Enum.reverse/1 2 0.06 1 0.50
MapSet.intersection/2 1 0.06 1 1.00
MapSet.new/1 2 0.13 2 1.00
:lists.reverse/2 2 1.69 26 13.00
:maps.keys/1 1 3.05 47 47.00
Map.take/3 1000 6.88 106 0.11
anonymous fn/2 in Enum.reverse/1 2000 17.53 270 0.14
:maps.from_list/1 3 19.74 304 101.33
MapSet.new_from_list/2 2002 20.06 309 0.15
Enum.reduce_range/5 2002 30.52 470 0.23
Profile done over 16 matching functions
$ mix profile.eprof -e "Enum.any?(0..1000, &Enum.member?(0..1000, &1))"
Warmup...
Profile results of #PID<0.142.0>
# CALLS % TIME µS/CALL
Total 17 100.0 2 0.12
Enumerable.impl_for/1 2 0.00 0 0.00
Enumerable.reduce/3 1 0.00 0 0.00
Enumerable.member?/2 1 0.00 0 0.00
Enumerable.impl_for!/1 2 0.00 0 0.00
anonymous fn/1 in :elixir_compiler_1.__FILE__/1 1 0.00 0 0.00
anonymous fn/0 in :elixir_compiler_1.__FILE__/1 1 0.00 0 0.00
Enumerable.Range.reduce/5 2 0.00 0 0.00
Enumerable.Range.reduce/3 1 0.00 0 0.00
Enumerable.Range.member?/2 1 0.00 0 0.00
anonymous fn/3 in Enum.any?/2 1 0.00 0 0.00
Enum.member?/2 1 0.00 0 0.00
Range.size/1 1 0.00 0 0.00
:erlang.apply/2 1 50.00 1 1.00
Enum.any?/2 1 50.00 1 1.00
Profile done over 14 matching functions
mix profile.eprof -e "Enum.any?(0..1000, &Enum.member?(1001..1999, &1))"
Warmup...
Profile results of #PID<0.142.0>
# CALLS % TIME µS/CALL
Total 9017 100.0 1894 0.21
Enumerable.reduce/3 1 0.00 0 0.00
anonymous fn/0 in :elixir_compiler_1.__FILE__/1 1 0.00 0 0.00
Enumerable.Range.reduce/3 1 0.00 0 0.00
Enum.any?/2 1 0.00 0 0.00
:erlang.apply/2 1 0.11 2 2.00
anonymous fn/1 in :elixir_compiler_1.__FILE__/1 1001 6.39 121 0.12
Enumerable.impl_for/1 1002 6.65 126 0.13
Range.size/1 1001 7.60 144 0.14
anonymous fn/3 in Enum.any?/2 1001 12.14 230 0.23
Enumerable.impl_for!/1 1002 12.41 235 0.23
Enumerable.Range.member?/2 1001 13.62 258 0.26
Enumerable.member?/2 1001 13.67 259 0.26
Enumerable.Range.reduce/5 1002 13.67 259 0.26
Enum.member?/2 1001 13.73 260 0.26
Profile done over 14 matching functions
is it fair to say, that for bigger, especially unsorted lists, MapSet might be cheaper, and for short, especially sorted, Enum.any?
/ Enum.member?
combo will do a better job?
i guess for sorted lists, Enum.any?
/ Enum.member?
will return just after first comparison.
Me too lol.
It’s been months since I’ve wanted to try Benchee,
Here’s the code
# SMALL = 5
# MEDIUM = 30
# LARGE = 1000
Benchee.run(
%{
"mapset" => fn({list_a, list_b}) -> MapSet.intersection(MapSet.new(list_a), MapSet.new(list_b)) end,
"brute_force" => fn({list_a, list_b}) -> Enum.any?(list_a, &Enum.member?(list_b, &1)) end
},
formatters: [
Benchee.Formatters.HTML,
Benchee.Formatters.Console
],
time: 10,
memory_time: 2,
inputs: %{
# Small
"Small, ordered & match" => {Enum.to_list(1..5), Enum.to_list(1..5)},
"Small, ordered & no-match" => {Enum.to_list(1..5), Enum.to_list(5 + 1 .. 2 * 5)},
"Small, random & match" => {Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(1..5) |> Enum.shuffle},
"Small, random & no-match" => {Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(5 + 1 .. 2 * 5) |> Enum.shuffle},
# Medium
"Medium, ordered & match" => {Enum.to_list(1..30), Enum.to_list(1..30)},
"Medium, ordered & no-match" => {Enum.to_list(1..30), Enum.to_list(30 + 1 .. 2 * 30)},
"Medium, random & match" => {Enum.to_list(1..30) |> Enum.shuffle, Enum.to_list(1..30) |> Enum.shuffle},
"Medium, random & no-match" => {Enum.to_list(1..30) |> Enum.shuffle, Enum.to_list(30 + 1 .. 2 * 30) |> Enum.shuffle},
# Large
"Large, ordered & match" => {Enum.to_list(1..1000), Enum.to_list(1..1000)},
"Large, ordered & no-match" => {Enum.to_list(1..1000), Enum.to_list(1000 + 1 .. 2 * 1000)},
"Large, random & match" => {Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1..1000) |> Enum.shuffle},
"Large, random & no-match" => {Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1000 + 1 .. 2 * 1000) |> Enum.shuffle},
}
)
And here are the results: Benchee Report
A few key take aways:
As you said, MapSets are way faster for large number of items, but way slower for fewer items, but in every case MapSets uses a LOT more ram.
For ~30 they are about the same in speed but again, big ram costs.
EDIT:
Inspecting more carefully the data, brute force wins in both fronts for large, random & match.
IMO there are some issues with that benchmark:
the work done in the mapset
case is considerably larger, since MapSet.intersection
creates a full result while Enum.any?
can stop checking as soon as it finds one match.
the big win when doing these sorts of checks is to maintain the bigger list (usually list_a
) as a MapSet
and then do many lookups in it. Doing MapSet.new
inside the benchmark means that initial cost is paid for every loop of the benchmark.
I’ve added additional cases to the benchmark to address those issues:
# SMALL = 5
# MEDIUM = 30
# LARGE = 1000
Mix.install([{:benchee, ">= 0.0.0"}])
wrap_mapset = fn {l1, l2} -> {l1, l2, MapSet.new(l1), MapSet.new(l2)} end
Benchee.run(
%{
"mapset_intersection" => fn({list_a, list_b, _, _}) -> MapSet.intersection(MapSet.new(list_a), MapSet.new(list_b)) end,
"mapset_disjoint" => fn({list_a, list_b, _, _}) -> !MapSet.disjoint?(MapSet.new(list_a), MapSet.new(list_b)) end,
"mapset_intersection_pre1" => fn({_, list_b, mapset_a, _}) -> MapSet.intersection(mapset_a, MapSet.new(list_b)) end,
"mapset_disjoint_pre1" => fn({_, list_b, mapset_a, _}) -> !MapSet.disjoint?(mapset_a, MapSet.new(list_b)) end,
"mapset_intersection_pre2" => fn({_, _, mapset_a, mapset_b}) -> MapSet.intersection(mapset_a, mapset_b) end,
"mapset_disjoint_pre2" => fn({_, _, mapset_a, mapset_b}) -> !MapSet.disjoint?(mapset_a, mapset_b) end,
"brute_force" => fn({list_a, list_b, _, _}) -> Enum.any?(list_a, &Enum.member?(list_b, &1)) end
},
formatters: [
Benchee.Formatters.Console
],
time: 10,
memory_time: 2,
inputs: %{
# Small
"Small, ordered & match" => wrap_mapset.({Enum.to_list(1..5), Enum.to_list(1..5)}),
"Small, ordered & no-match" => wrap_mapset.({Enum.to_list(1..5), Enum.to_list(5 + 1 .. 2 * 5)}),
"Small, random & match" => wrap_mapset.({Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(1..5) |> Enum.shuffle}),
"Small, random & no-match" => wrap_mapset.({Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(5 + 1 .. 2 * 5) |> Enum.shuffle}),
# Medium
"Medium, ordered & match" => wrap_mapset.({Enum.to_list(1..30), Enum.to_list(1..30)}),
"Medium, ordered & no-match" => wrap_mapset.({Enum.to_list(1..30), Enum.to_list(30 + 1 .. 2 * 30)}),
"Medium, random & match" => wrap_mapset.({Enum.to_list(1..30) |> Enum.shuffle, Enum.to_list(1..30) |> Enum.shuffle}),
"Medium, random & no-match" => wrap_mapset.({Enum.to_list(1..30) |> Enum.shuffle, Enum.to_list(30 + 1 .. 2 * 30) |> Enum.shuffle}),
# Large
"Large, ordered & match" => wrap_mapset.({Enum.to_list(1..1000), Enum.to_list(1..1000)}),
"Large, ordered & no-match" => wrap_mapset.({Enum.to_list(1..1000), Enum.to_list(1000 + 1 .. 2 * 1000)}),
"Large, random & match" => wrap_mapset.({Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1..1000) |> Enum.shuffle}),
"Large, random & no-match" => wrap_mapset.({Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1000 + 1 .. 2 * 1000) |> Enum.shuffle}),
}
)
which produces on my machine (1.14 / OTP 25):
##### With input Large, ordered & match #####
Name ips average deviation median 99th %
brute_force 2847.78 K 0.35 μs ±28004.15% 0.176 μs 0.34 μs
mapset_disjoint_pre2 1388.47 K 0.72 μs ±7505.66% 0.50 μs 3.07 μs
mapset_intersection_pre2 6.76 K 147.83 μs ±15.34% 142.76 μs 214.91 μs
mapset_disjoint_pre1 6.64 K 150.51 μs ±17.07% 142.71 μs 226.43 μs
mapset_disjoint 3.31 K 302.29 μs ±15.82% 287.49 μs 430.78 μs
mapset_intersection_pre1 3.17 K 315.54 μs ±14.86% 304.50 μs 452.83 μs
mapset_intersection 2.10 K 476.69 μs ±10.97% 466.41 μs 648.73 μs
Comparison:
brute_force 2847.78 K
mapset_disjoint_pre2 1388.47 K - 2.05x slower +0.37 μs
mapset_intersection_pre2 6.76 K - 420.98x slower +147.48 μs
mapset_disjoint_pre1 6.64 K - 428.62x slower +150.16 μs
mapset_disjoint 3.31 K - 860.84x slower +301.94 μs
mapset_intersection_pre1 3.17 K - 898.58x slower +315.19 μs
mapset_intersection 2.10 K - 1357.51x slower +476.34 μs
Memory usage statistics:
Name Memory usage
brute_force 0.0469 KB
mapset_disjoint_pre2 1.05 KB - 22.50x memory usage +1.01 KB
mapset_intersection_pre2 83.74 KB - 1786.50x memory usage +83.70 KB
mapset_disjoint_pre1 30.12 KB - 642.50x memory usage +30.07 KB
mapset_disjoint 49.95 KB - 1065.50x memory usage +49.90 KB
mapset_intersection_pre1 112.78 KB - 2406.00x memory usage +112.73 KB
mapset_intersection 132.63 KB - 2829.50x memory usage +132.59 KB
**All measurements for memory usage were the same**
##### With input Large, ordered & no-match #####
Name ips average deviation median 99th %
mapset_intersection_pre2 37.17 K 26.90 μs ±27.51% 25.76 μs 56.63 μs
mapset_disjoint_pre2 25.80 K 38.76 μs ±23.63% 36.61 μs 67.08 μs
mapset_intersection_pre1 5.41 K 184.70 μs ±12.42% 175.12 μs 273.28 μs
mapset_disjoint_pre1 5.08 K 196.84 μs ±12.90% 186.64 μs 293.61 μs
mapset_intersection 2.94 K 340.10 μs ±11.61% 323.11 μs 474.96 μs
mapset_disjoint 2.88 K 347.06 μs ±11.30% 332.89 μs 499.21 μs
brute_force 0.23 K 4306.95 μs ±12.21% 4084.56 μs 5740.68 μs
Comparison:
mapset_intersection_pre2 37.17 K
mapset_disjoint_pre2 25.80 K - 1.44x slower +11.86 μs
mapset_intersection_pre1 5.41 K - 6.86x slower +157.80 μs
mapset_disjoint_pre1 5.08 K - 7.32x slower +169.93 μs
mapset_intersection 2.94 K - 12.64x slower +313.20 μs
mapset_disjoint 2.88 K - 12.90x slower +320.16 μs
brute_force 0.23 K - 160.08x slower +4280.04 μs
Memory usage statistics:
Name average deviation median 99th %
mapset_intersection_pre2 15.72 KB ±0.00% 15.72 KB 15.72 KB
mapset_disjoint_pre2 31.32 KB ±0.00% 31.32 KB 31.32 KB
mapset_intersection_pre1 44.84 KB ±0.00% 44.84 KB 44.84 KB
mapset_disjoint_pre1 36.18 KB ±0.35% 36.16 KB 36.55 KB
mapset_intersection 64.55 KB ±0.00% 64.55 KB 64.55 KB
mapset_disjoint 80.15 KB ±0.00% 80.15 KB 80.15 KB
brute_force 0.0469 KB ±0.00% 0.0469 KB 0.0469 KB
Comparison:
mapset_intersection_pre2 15.72 KB
mapset_disjoint_pre2 31.32 KB - 1.99x memory usage +15.60 KB
mapset_intersection_pre1 44.84 KB - 2.85x memory usage +29.13 KB
mapset_disjoint_pre1 36.18 KB - 2.30x memory usage +20.46 KB
mapset_intersection 64.55 KB - 4.11x memory usage +48.83 KB
mapset_disjoint 80.15 KB - 5.10x memory usage +64.43 KB
brute_force 0.0469 KB - 0.00x memory usage -15.67188 KB
##### With input Large, random & match #####
Name ips average deviation median 99th %
mapset_disjoint_pre2 1392.32 K 0.72 μs ±7655.04% 0.50 μs 3.07 μs
brute_force 404.44 K 2.47 μs ±1072.32% 2.24 μs 3.18 μs
mapset_disjoint_pre1 6.71 K 149.08 μs ±42.15% 141.30 μs 232.94 μs
mapset_intersection_pre2 6.62 K 151.13 μs ±12.42% 144.97 μs 223.85 μs
mapset_disjoint 3.30 K 302.85 μs ±12.29% 287.76 μs 439.65 μs
mapset_intersection_pre1 3.15 K 317.89 μs ±86.85% 302.77 μs 483.75 μs
mapset_intersection 2.11 K 473.18 μs ±10.49% 463.65 μs 652.93 μs
Comparison:
mapset_disjoint_pre2 1392.32 K
brute_force 404.44 K - 3.44x slower +1.75 μs
mapset_disjoint_pre1 6.71 K - 207.56x slower +148.36 μs
mapset_intersection_pre2 6.62 K - 210.42x slower +150.41 μs
mapset_disjoint 3.30 K - 421.67x slower +302.13 μs
mapset_intersection_pre1 3.15 K - 442.61x slower +317.18 μs
mapset_intersection 2.11 K - 658.82x slower +472.46 μs
Memory usage statistics:
Name Memory usage
mapset_disjoint_pre2 1.05 KB
brute_force 0.0469 KB - 0.04x memory usage -1.00781 KB
mapset_disjoint_pre1 30.12 KB - 28.56x memory usage +29.06 KB
mapset_intersection_pre2 83.74 KB - 79.40x memory usage +82.69 KB
mapset_disjoint 49.95 KB - 47.36x memory usage +48.89 KB
mapset_intersection_pre1 112.78 KB - 106.93x memory usage +111.73 KB
mapset_intersection 132.63 KB - 125.76x memory usage +131.58 KB
**All measurements for memory usage were the same**
##### With input Large, random & no-match #####
Name ips average deviation median 99th %
mapset_intersection_pre2 37.17 K 26.90 μs ±25.88% 25.81 μs 56.44 μs
mapset_disjoint_pre2 25.87 K 38.65 μs ±29.75% 36.54 μs 69.16 μs
mapset_intersection_pre1 5.53 K 180.83 μs ±11.82% 171.86 μs 263.88 μs
mapset_disjoint_pre1 5.09 K 196.45 μs ±12.23% 186.31 μs 287.30 μs
mapset_intersection 2.93 K 341.75 μs ±11.96% 324.39 μs 487.32 μs
mapset_disjoint 2.85 K 350.99 μs ±11.14% 336.21 μs 491.05 μs
brute_force 0.24 K 4250.21 μs ±11.08% 4029.24 μs 5646.10 μs
Comparison:
mapset_intersection_pre2 37.17 K
mapset_disjoint_pre2 25.87 K - 1.44x slower +11.74 μs
mapset_intersection_pre1 5.53 K - 6.72x slower +153.93 μs
mapset_disjoint_pre1 5.09 K - 7.30x slower +169.55 μs
mapset_intersection 2.93 K - 12.70x slower +314.85 μs
mapset_disjoint 2.85 K - 13.05x slower +324.08 μs
brute_force 0.24 K - 157.98x slower +4223.30 μs
Memory usage statistics:
Name average deviation median 99th %
mapset_intersection_pre2 15.72 KB ±0.00% 15.72 KB 15.72 KB
mapset_disjoint_pre2 31.32 KB ±0.00% 31.32 KB 31.32 KB
mapset_intersection_pre1 44.84 KB ±0.00% 44.84 KB 44.84 KB
mapset_disjoint_pre1 36.18 KB ±0.34% 36.16 KB 36.54 KB
mapset_intersection 64.55 KB ±0.00% 64.55 KB 64.55 KB
mapset_disjoint 80.15 KB ±0.00% 80.15 KB 80.15 KB
brute_force 0.0469 KB ±0.00% 0.0469 KB 0.0469 KB
Comparison:
mapset_intersection_pre2 15.72 KB
mapset_disjoint_pre2 31.32 KB - 1.99x memory usage +15.60 KB
mapset_intersection_pre1 44.84 KB - 2.85x memory usage +29.13 KB
mapset_disjoint_pre1 36.18 KB - 2.30x memory usage +20.46 KB
mapset_intersection 64.55 KB - 4.11x memory usage +48.83 KB
mapset_disjoint 80.15 KB - 5.10x memory usage +64.43 KB
brute_force 0.0469 KB - 0.00x memory usage -15.67188 KB
##### With input Medium, ordered & match #####
Name ips average deviation median 99th %
brute_force 3.51 M 284.95 ns ±24465.15% 173 ns 251 ns
mapset_disjoint_pre2 1.70 M 588.35 ns ±9653.26% 249 ns 2998 ns
mapset_disjoint_pre1 1.22 M 817.22 ns ±6922.87% 547 ns 3247 ns
mapset_disjoint 1.11 M 901.48 ns ±4319.02% 729 ns 1294 ns
mapset_intersection_pre2 0.38 M 2613.75 ns ±1580.08% 2365 ns 3605 ns
mapset_intersection_pre1 0.34 M 2903.53 ns ±803.65% 2600 ns 4392 ns
mapset_intersection 0.31 M 3192.43 ns ±780.39% 2866 ns 5175 ns
Comparison:
brute_force 3.51 M
mapset_disjoint_pre2 1.70 M - 2.06x slower +303.40 ns
mapset_disjoint_pre1 1.22 M - 2.87x slower +532.27 ns
mapset_disjoint 1.11 M - 3.16x slower +616.53 ns
mapset_intersection_pre2 0.38 M - 9.17x slower +2328.80 ns
mapset_intersection_pre1 0.34 M - 10.19x slower +2618.59 ns
mapset_intersection 0.31 M - 11.20x slower +2907.48 ns
Memory usage statistics:
Name Memory usage
brute_force 0.0469 KB
mapset_disjoint_pre2 0.0391 KB - 0.83x memory usage -0.00781 KB
mapset_disjoint_pre1 1.55 KB - 33.17x memory usage +1.51 KB
mapset_disjoint 2.13 KB - 45.50x memory usage +2.09 KB
mapset_intersection_pre2 2.21 KB - 47.17x memory usage +2.16 KB
mapset_intersection_pre1 2.79 KB - 59.50x memory usage +2.74 KB
mapset_intersection 3.37 KB - 71.83x memory usage +3.32 KB
**All measurements for memory usage were the same**
##### With input Medium, ordered & no-match #####
Name ips average deviation median 99th %
mapset_intersection_pre2 1220.56 K 0.82 μs ±3928.60% 0.70 μs 1.09 μs
mapset_intersection_pre1 942.86 K 1.06 μs ±4223.21% 0.92 μs 1.51 μs
mapset_intersection 751.87 K 1.33 μs ±2354.74% 1.21 μs 2.01 μs
mapset_disjoint_pre2 725.23 K 1.38 μs ±2776.98% 1.23 μs 1.87 μs
mapset_disjoint_pre1 598.81 K 1.67 μs ±2577.99% 1.47 μs 2.45 μs
mapset_disjoint 537.13 K 1.86 μs ±1549.84% 1.64 μs 2.77 μs
brute_force 492.79 K 2.03 μs ±1735.52% 1.82 μs 2.54 μs
Comparison:
mapset_intersection_pre2 1220.56 K
mapset_intersection_pre1 942.86 K - 1.29x slower +0.24 μs
mapset_intersection 751.87 K - 1.62x slower +0.51 μs
mapset_disjoint_pre2 725.23 K - 1.68x slower +0.56 μs
mapset_disjoint_pre1 598.81 K - 2.04x slower +0.85 μs
mapset_disjoint 537.13 K - 2.27x slower +1.04 μs
brute_force 492.79 K - 2.48x slower +1.21 μs
Memory usage statistics:
Name Memory usage
mapset_intersection_pre2 0.56 KB
mapset_intersection_pre1 1.14 KB - 2.03x memory usage +0.58 KB
mapset_intersection 1.72 KB - 3.06x memory usage +1.16 KB
mapset_disjoint_pre2 0.0391 KB - 0.07x memory usage -0.52344 KB
mapset_disjoint_pre1 1.55 KB - 2.76x memory usage +0.99 KB
mapset_disjoint 2.13 KB - 3.79x memory usage +1.57 KB
brute_force 0.0469 KB - 0.08x memory usage -0.51563 KB
**All measurements for memory usage were the same**
##### With input Medium, random & match #####
Name ips average deviation median 99th %
brute_force 3255.81 K 0.31 μs ±22401.87% 0.196 μs 0.29 μs
mapset_disjoint_pre2 1708.52 K 0.59 μs ±9133.31% 0.25 μs 3.01 μs
mapset_disjoint_pre1 605.86 K 1.65 μs ±2248.63% 1.50 μs 2.45 μs
mapset_intersection_pre2 380.21 K 2.63 μs ±1541.61% 2.37 μs 3.65 μs
mapset_disjoint 347.02 K 2.88 μs ±1389.02% 2.53 μs 5.01 μs
mapset_intersection_pre1 263.65 K 3.79 μs ±607.07% 3.49 μs 5.54 μs
mapset_intersection 196.51 K 5.09 μs ±526.02% 4.66 μs 8.28 μs
Comparison:
brute_force 3255.81 K
mapset_disjoint_pre2 1708.52 K - 1.91x slower +0.28 μs
mapset_disjoint_pre1 605.86 K - 5.37x slower +1.34 μs
mapset_intersection_pre2 380.21 K - 8.56x slower +2.32 μs
mapset_disjoint 347.02 K - 9.38x slower +2.57 μs
mapset_intersection_pre1 263.65 K - 12.35x slower +3.49 μs
mapset_intersection 196.51 K - 16.57x slower +4.78 μs
Memory usage statistics:
Name Memory usage
brute_force 0.0469 KB
mapset_disjoint_pre2 0.0391 KB - 0.83x memory usage -0.00781 KB
mapset_disjoint_pre1 1.55 KB - 33.17x memory usage +1.51 KB
mapset_intersection_pre2 2.21 KB - 47.17x memory usage +2.16 KB
mapset_disjoint 2.13 KB - 45.50x memory usage +2.09 KB
mapset_intersection_pre1 2.79 KB - 59.50x memory usage +2.74 KB
mapset_intersection 3.37 KB - 71.83x memory usage +3.32 KB
**All measurements for memory usage were the same**
##### With input Medium, random & no-match #####
Name ips average deviation median 99th %
mapset_intersection_pre2 1230.57 K 0.81 μs ±4179.38% 0.70 μs 1.07 μs
mapset_disjoint_pre2 728.62 K 1.37 μs ±2683.67% 1.22 μs 1.88 μs
mapset_intersection_pre1 578.31 K 1.73 μs ±1529.87% 1.56 μs 2.53 μs
brute_force 486.68 K 2.05 μs ±1601.31% 1.82 μs 2.59 μs
mapset_disjoint_pre1 429.29 K 2.33 μs ±1131.22% 2.07 μs 3.43 μs
mapset_intersection 364.92 K 2.74 μs ±280.48% 2.52 μs 4.38 μs
mapset_disjoint 301.09 K 3.32 μs ±1042.22% 2.96 μs 5.33 μs
Comparison:
mapset_intersection_pre2 1230.57 K
mapset_disjoint_pre2 728.62 K - 1.69x slower +0.56 μs
mapset_intersection_pre1 578.31 K - 2.13x slower +0.92 μs
brute_force 486.68 K - 2.53x slower +1.24 μs
mapset_disjoint_pre1 429.29 K - 2.87x slower +1.52 μs
mapset_intersection 364.92 K - 3.37x slower +1.93 μs
mapset_disjoint 301.09 K - 4.09x slower +2.51 μs
Memory usage statistics:
Name Memory usage
mapset_intersection_pre2 0.56 KB
mapset_disjoint_pre2 0.0391 KB - 0.07x memory usage -0.52344 KB
mapset_intersection_pre1 1.14 KB - 2.03x memory usage +0.58 KB
brute_force 0.0469 KB - 0.08x memory usage -0.51563 KB
mapset_disjoint_pre1 1.55 KB - 2.76x memory usage +0.99 KB
mapset_intersection 1.72 KB - 3.06x memory usage +1.16 KB
mapset_disjoint 2.13 KB - 3.79x memory usage +1.57 KB
**All measurements for memory usage were the same**
##### With input Small, ordered & match #####
Name ips average deviation median 99th %
mapset_disjoint_pre2 4.42 M 226.35 ns ±17726.05% 188 ns 352 ns
brute_force 4.09 M 244.76 ns ±20625.22% 168 ns 246 ns
mapset_disjoint_pre1 2.41 M 415.38 ns ±16105.81% 318 ns 589 ns
mapset_intersection_pre2 2.08 M 480.55 ns ±14662.17% 335 ns 547 ns
mapset_intersection_pre1 1.70 M 588.42 ns ±7675.81% 445 ns 806 ns
mapset_disjoint 1.68 M 594.42 ns ±8642.53% 438 ns 814 ns
mapset_intersection 1.24 M 808.77 ns ±7649.32% 628 ns 1067 ns
Comparison:
mapset_disjoint_pre2 4.42 M
brute_force 4.09 M - 1.08x slower +18.41 ns
mapset_disjoint_pre1 2.41 M - 1.84x slower +189.03 ns
mapset_intersection_pre2 2.08 M - 2.12x slower +254.19 ns
mapset_intersection_pre1 1.70 M - 2.60x slower +362.06 ns
mapset_disjoint 1.68 M - 2.63x slower +368.07 ns
mapset_intersection 1.24 M - 3.57x slower +582.42 ns
Memory usage statistics:
Name Memory usage
mapset_disjoint_pre2 200 B
brute_force 48 B - 0.24x memory usage -152 B
mapset_disjoint_pre1 392 B - 1.96x memory usage +192 B
mapset_intersection_pre2 464 B - 2.32x memory usage +264 B
mapset_intersection_pre1 656 B - 3.28x memory usage +456 B
mapset_disjoint 584 B - 2.92x memory usage +384 B
mapset_intersection 848 B - 4.24x memory usage +648 B
**All measurements for memory usage were the same**
##### With input Small, ordered & no-match #####
Name ips average deviation median 99th %
mapset_intersection_pre2 3.44 M 290.96 ns ±12444.20% 237 ns 388 ns
mapset_disjoint_pre2 3.33 M 300.19 ns ±13628.08% 264 ns 533 ns
brute_force 2.69 M 372.15 ns ±15736.37% 274 ns 537 ns
mapset_disjoint_pre1 2.18 M 458.40 ns ±7684.04% 390 ns 736 ns
mapset_intersection_pre1 2.01 M 498.74 ns ±14617.80% 345 ns 664 ns
mapset_intersection 1.49 M 670.07 ns ±8514.25% 508 ns 892 ns
mapset_disjoint 1.46 M 685.56 ns ±8019.31% 526 ns 951 ns
Comparison:
mapset_intersection_pre2 3.44 M
mapset_disjoint_pre2 3.33 M - 1.03x slower +9.23 ns
brute_force 2.69 M - 1.28x slower +81.19 ns
mapset_disjoint_pre1 2.18 M - 1.58x slower +167.44 ns
mapset_intersection_pre1 2.01 M - 1.71x slower +207.78 ns
mapset_intersection 1.49 M - 2.30x slower +379.11 ns
mapset_disjoint 1.46 M - 2.36x slower +394.60 ns
Memory usage statistics:
Name Memory usage
mapset_intersection_pre2 176 B
mapset_disjoint_pre2 200 B - 1.14x memory usage +24 B
brute_force 48 B - 0.27x memory usage -128 B
mapset_disjoint_pre1 392 B - 2.23x memory usage +216 B
mapset_intersection_pre1 368 B - 2.09x memory usage +192 B
mapset_intersection 560 B - 3.18x memory usage +384 B
mapset_disjoint 584 B - 3.32x memory usage +408 B
**All measurements for memory usage were the same**
##### With input Small, random & match #####
Name ips average deviation median 99th %
mapset_disjoint_pre2 4.41 M 226.60 ns ±19724.56% 186 ns 357 ns
brute_force 4.09 M 244.72 ns ±21042.87% 166 ns 246 ns
mapset_disjoint_pre1 2.29 M 436.83 ns ±15800.24% 337 ns 630 ns
mapset_intersection_pre2 2.16 M 462.65 ns ±14581.35% 329 ns 514 ns
mapset_intersection_pre1 1.63 M 615.02 ns ±6928.59% 472 ns 856 ns
mapset_disjoint 1.58 M 634.02 ns ±8216.57% 478 ns 851 ns
mapset_intersection 1.19 M 843.24 ns ±7105.41% 663 ns 1109 ns
Comparison:
mapset_disjoint_pre2 4.41 M
brute_force 4.09 M - 1.08x slower +18.11 ns
mapset_disjoint_pre1 2.29 M - 1.93x slower +210.23 ns
mapset_intersection_pre2 2.16 M - 2.04x slower +236.05 ns
mapset_intersection_pre1 1.63 M - 2.71x slower +388.42 ns
mapset_disjoint 1.58 M - 2.80x slower +407.42 ns
mapset_intersection 1.19 M - 3.72x slower +616.63 ns
Memory usage statistics:
Name Memory usage
mapset_disjoint_pre2 200 B
brute_force 48 B - 0.24x memory usage -152 B
mapset_disjoint_pre1 392 B - 1.96x memory usage +192 B
mapset_intersection_pre2 464 B - 2.32x memory usage +264 B
mapset_intersection_pre1 656 B - 3.28x memory usage +456 B
mapset_disjoint 584 B - 2.92x memory usage +384 B
mapset_intersection 848 B - 4.24x memory usage +648 B
**All measurements for memory usage were the same**
##### With input Small, random & no-match #####
Name ips average deviation median 99th %
mapset_intersection_pre2 3.41 M 293.45 ns ±13074.01% 237 ns 375 ns
mapset_disjoint_pre2 3.30 M 302.69 ns ±13233.95% 267 ns 569 ns
brute_force 2.73 M 366.67 ns ±15029.28% 274 ns 536 ns
mapset_disjoint_pre1 2.03 M 492.48 ns ±7245.82% 424 ns 793 ns
mapset_intersection_pre1 1.86 M 537.54 ns ±14331.05% 374 ns 709 ns
mapset_intersection 1.38 M 724.38 ns ±7804.83% 568 ns 967 ns
mapset_disjoint 1.38 M 725.41 ns ±7598.62% 569 ns 996 ns
Comparison:
mapset_intersection_pre2 3.41 M
mapset_disjoint_pre2 3.30 M - 1.03x slower +9.25 ns
brute_force 2.73 M - 1.25x slower +73.23 ns
mapset_disjoint_pre1 2.03 M - 1.68x slower +199.04 ns
mapset_intersection_pre1 1.86 M - 1.83x slower +244.09 ns
mapset_intersection 1.38 M - 2.47x slower +430.93 ns
mapset_disjoint 1.38 M - 2.47x slower +431.97 ns
Memory usage statistics:
Name Memory usage
mapset_intersection_pre2 176 B
mapset_disjoint_pre2 200 B - 1.14x memory usage +24 B
brute_force 48 B - 0.27x memory usage -128 B
mapset_disjoint_pre1 392 B - 2.23x memory usage +216 B
mapset_intersection_pre1 368 B - 2.09x memory usage +192 B
mapset_intersection 560 B - 3.18x memory usage +384 B
mapset_disjoint 584 B - 3.32x memory usage +408 B
**All measurements for memory usage were the same**
Since we are trying to check the performance of finding if any member of a list exist in another list, I don’t think is correct to take away the time it takes to build a secondary data structure to help you perform faster access.
I don’t have time now to repeat the process, but maybe something like this will be the best solution:
def list_collide?(list_a, list_b) do
mapset_b = MapSet.new(list_b)
Enum.any?(list_a, & MapSet.member?(mapset_b, &1))
end
hehe testing performance is unexpectedly a lot of fun
Clearly using MapSet.intersection/2
is the wrong function, MapSet.disjoint?/2
is the correct one.
Here’s the new test & results
defmodule SpeedTest do
def list_collide?(list_a, list_b) do
mapset_b = MapSet.new(list_b)
Enum.any?(list_a, & MapSet.member?(mapset_b, &1))
end
end
Benchee.run(
%{
"mapset_disjoint" => fn({list_a, list_b}) -> MapSet.disjoint?(MapSet.new(list_a), MapSet.new(list_b)) end,
"mix_strategy" => fn({list_a, list_b}) -> SpeedTest.list_collide?(list_a, list_b) end,
"brute_force" => fn({list_a, list_b}) -> Enum.any?(list_a, &Enum.member?(list_b, &1)) end
},
formatters: [Benchee.Formatters.Console],
time: 10,
memory_time: 2,
inputs: %{
"Small, random & no-match" => {Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(5 + 1 .. 2 * 5) |> Enum.shuffle},
"Large, random & no-match" => {Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1000 + 1 .. 2 * 1000) |> Enum.shuffle},
}
)
##### With input Large, random & no-match #####
Name ips average deviation median 99th %
mix_strategy 6.95 K 143.86 μs ±13.47% 139.51 μs 192.21 μs
mapset_disjoint 3.80 K 263.03 μs ±8.10% 265.75 μs 342.80 μs
brute_force 0.33 K 3027.75 μs ±22.07% 2928.87 μs 4321.17 μs
Comparison:
mix_strategy 6.95 K
mapset_disjoint 3.80 K - 1.83x slower +119.16 μs
brute_force 0.33 K - 21.05x slower +2883.89 μs
Memory usage statistics:
Name average deviation median 99th %
mix_strategy 1.25 KB ±0.00% 1.25 KB 1.25 KB
mapset_disjoint 44.50 KB ±0.39% 44.52 KB 44.80 KB
brute_force 0.0469 KB ±0.00% 0.0469 KB 0.0469 KB
Comparison:
mix_strategy 1.25 KB
mapset_disjoint 44.50 KB - 35.60x memory usage +43.25 KB
brute_force 0.0469 KB - 0.04x memory usage -1.20313 KB
##### With input Small, random & no-match #####
Name ips average deviation median 99th %
brute_force 3.75 M 266.92 ns ±18415.96% 195 ns 499 ns
mix_strategy 2.47 M 404.27 ns ±11984.13% 286 ns 588 ns
mapset_disjoint 2.11 M 473.68 ns ±6384.18% 366 ns 1730 ns
Comparison:
brute_force 3.75 M
mix_strategy 2.47 M - 1.51x slower +137.34 ns
mapset_disjoint 2.11 M - 1.77x slower +206.76 ns
Memory usage statistics:
Name Memory usage
brute_force 48 B
mix_strategy 240 B - 5.00x memory usage +192 B
mapset_disjoint 584 B - 12.17x memory usage +536 B
**All measurements for memory usage were the same**