Check list in list?

How to check if any item of a list is in another list?
for example:

[:a, :b, :c] in [:d, :e, :f] # false
[:a, :b, :c] in [:d, :e, :c] # true

Use MapSet and then check for intersection being non empty?

iex> MapSet.intersection(MapSet.new([:a, :b, :c]), MapSet.new([:d, :e, :f]))
#MapSet<[]>
iex> MapSet.intersection(MapSet.new([:a, :b, :c]), MapSet.new([:d, :e, :c]))
#MapSet<[:c]>
2 Likes

Oh and I totally forgot about MapSet.disjoint?

Edit: iPhone keyboard hates `

2 Likes

oh, good one! TIL :slight_smile:

You can also brute force it (which may en up being faster):

Enum.any?([:a, :b, :c], &Enum.member?([:c, :d, :e], &1))
3 Likes

it got me curious, not a real benchmark, but:

$ mix profile.eprof -e "MapSet.intersection(MapSet.new(0..1000), MapSet.new(0..1000))"
Warmup...

Profile results of #PID<0.142.0>
#                                               CALLS     % TIME µS/CALL
Total                                            7032 100.0 1603    0.23
Map.take/2                                          1  0.00    0    0.00
Map.new/1                                           2  0.00    0    0.00
anonymous fn/0 in :elixir_compiler_1.__FILE__/1     1  0.00    0    0.00
Enum.reverse/1                                      2  0.00    0    0.00
:lists.reverse/1                                    2  0.06    1    0.50
Enum.to_list/1                                      2  0.06    1    0.50
MapSet.intersection/2                               1  0.06    1    1.00
:erlang.apply/2                                     1  0.12    2    2.00
MapSet.new/1                                        2  0.12    2    1.00
:lists.reverse/2                                    2  1.06   17    8.50
:maps.keys/1                                        1  3.12   50   50.00
Map.take/3                                       1002  7.86  126    0.13
anonymous fn/2 in Enum.reverse/1                 2002 17.47  280    0.14
MapSet.new_from_list/2                           2004 18.96  304    0.15
:maps.from_list/1                                   3 23.77  381  127.00
Enum.reduce_range/5                              2004 27.32  438    0.22

Profile done over 16 matching functions
$ mix profile.eprof -e "MapSet.intersection(MapSet.new(0..1000), MapSet.new(1001..1999))"
Warmup...

Profile results of #PID<0.142.0>
#                                               CALLS     % TIME µS/CALL
Total                                            7024 100.0 1540    0.22
Map.take/2                                          1  0.00    0    0.00
Map.new/1                                           2  0.00    0    0.00
:erlang.apply/2                                     1  0.06    1    1.00
anonymous fn/0 in :elixir_compiler_1.__FILE__/1     1  0.06    1    1.00
:lists.reverse/1                                    2  0.06    1    0.50
Enum.to_list/1                                      2  0.06    1    0.50
Enum.reverse/1                                      2  0.06    1    0.50
MapSet.intersection/2                               1  0.06    1    1.00
MapSet.new/1                                        2  0.13    2    1.00
:lists.reverse/2                                    2  1.69   26   13.00
:maps.keys/1                                        1  3.05   47   47.00
Map.take/3                                       1000  6.88  106    0.11
anonymous fn/2 in Enum.reverse/1                 2000 17.53  270    0.14
:maps.from_list/1                                   3 19.74  304  101.33
MapSet.new_from_list/2                           2002 20.06  309    0.15
Enum.reduce_range/5                              2002 30.52  470    0.23

Profile done over 16 matching functions
$ mix profile.eprof -e "Enum.any?(0..1000, &Enum.member?(0..1000, &1))"
Warmup...

Profile results of #PID<0.142.0>
#                                               CALLS     % TIME µS/CALL
Total                                              17 100.0    2    0.12
Enumerable.impl_for/1                               2  0.00    0    0.00
Enumerable.reduce/3                                 1  0.00    0    0.00
Enumerable.member?/2                                1  0.00    0    0.00
Enumerable.impl_for!/1                              2  0.00    0    0.00
anonymous fn/1 in :elixir_compiler_1.__FILE__/1     1  0.00    0    0.00
anonymous fn/0 in :elixir_compiler_1.__FILE__/1     1  0.00    0    0.00
Enumerable.Range.reduce/5                           2  0.00    0    0.00
Enumerable.Range.reduce/3                           1  0.00    0    0.00
Enumerable.Range.member?/2                          1  0.00    0    0.00
anonymous fn/3 in Enum.any?/2                       1  0.00    0    0.00
Enum.member?/2                                      1  0.00    0    0.00
Range.size/1                                        1  0.00    0    0.00
:erlang.apply/2                                     1 50.00    1    1.00
Enum.any?/2                                         1 50.00    1    1.00

Profile done over 14 matching functions
mix profile.eprof -e "Enum.any?(0..1000, &Enum.member?(1001..1999, &1))"
Warmup...

Profile results of #PID<0.142.0>
#                                               CALLS     % TIME µS/CALL
Total                                            9017 100.0 1894    0.21
Enumerable.reduce/3                                 1  0.00    0    0.00
anonymous fn/0 in :elixir_compiler_1.__FILE__/1     1  0.00    0    0.00
Enumerable.Range.reduce/3                           1  0.00    0    0.00
Enum.any?/2                                         1  0.00    0    0.00
:erlang.apply/2                                     1  0.11    2    2.00
anonymous fn/1 in :elixir_compiler_1.__FILE__/1  1001  6.39  121    0.12
Enumerable.impl_for/1                            1002  6.65  126    0.13
Range.size/1                                     1001  7.60  144    0.14
anonymous fn/3 in Enum.any?/2                    1001 12.14  230    0.23
Enumerable.impl_for!/1                           1002 12.41  235    0.23
Enumerable.Range.member?/2                       1001 13.62  258    0.26
Enumerable.member?/2                             1001 13.67  259    0.26
Enumerable.Range.reduce/5                        1002 13.67  259    0.26
Enum.member?/2                                   1001 13.73  260    0.26

Profile done over 14 matching functions

is it fair to say, that for bigger, especially unsorted lists, MapSet might be cheaper, and for short, especially sorted, Enum.any? / Enum.member? combo will do a better job?

i guess for sorted lists, Enum.any? / Enum.member? will return just after first comparison.

5 Likes

Me too lol.

It’s been months since I’ve wanted to try Benchee,

Here’s the code

# SMALL = 5
# MEDIUM = 30
# LARGE = 1000

Benchee.run(
  %{
    "mapset" => fn({list_a, list_b}) -> MapSet.intersection(MapSet.new(list_a), MapSet.new(list_b)) end,
    "brute_force" => fn({list_a, list_b}) -> Enum.any?(list_a, &Enum.member?(list_b, &1)) end
  },
  formatters: [
    Benchee.Formatters.HTML,
    Benchee.Formatters.Console
  ],
  time: 10,
  memory_time: 2,
  inputs: %{
    # Small
    "Small, ordered & match" => {Enum.to_list(1..5), Enum.to_list(1..5)},
    "Small, ordered & no-match"  => {Enum.to_list(1..5), Enum.to_list(5 + 1 .. 2 * 5)},
    "Small, random & match" => {Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(1..5) |> Enum.shuffle},
    "Small, random & no-match"  => {Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(5 + 1 .. 2 * 5) |> Enum.shuffle},
    # Medium
    "Medium, ordered & match" => {Enum.to_list(1..30), Enum.to_list(1..30)},
    "Medium, ordered & no-match"  => {Enum.to_list(1..30), Enum.to_list(30 + 1 .. 2 * 30)},
    "Medium, random & match" => {Enum.to_list(1..30) |> Enum.shuffle, Enum.to_list(1..30) |> Enum.shuffle},
    "Medium, random & no-match"  => {Enum.to_list(1..30) |> Enum.shuffle, Enum.to_list(30 + 1 .. 2 * 30) |> Enum.shuffle},
    # Large
    "Large, ordered & match" => {Enum.to_list(1..1000), Enum.to_list(1..1000)},
    "Large, ordered & no-match"  => {Enum.to_list(1..1000), Enum.to_list(1000 + 1 .. 2 * 1000)},
    "Large, random & match" => {Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1..1000) |> Enum.shuffle},
    "Large, random & no-match"  => {Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1000 + 1 .. 2 * 1000) |> Enum.shuffle},
  }
)

And here are the results: Benchee Report

A few key take aways:

Large, random & no-match


Small, random, match


As you said, MapSets are way faster for large number of items, but way slower for fewer items, but in every case MapSets uses a LOT more ram.

For ~30 they are about the same in speed but again, big ram costs.

EDIT:
Inspecting more carefully the data, brute force wins in both fronts for large, random & match.

4 Likes

IMO there are some issues with that benchmark:

  • the work done in the mapset case is considerably larger, since MapSet.intersection creates a full result while Enum.any? can stop checking as soon as it finds one match.

  • the big win when doing these sorts of checks is to maintain the bigger list (usually list_a) as a MapSet and then do many lookups in it. Doing MapSet.new inside the benchmark means that initial cost is paid for every loop of the benchmark.

I’ve added additional cases to the benchmark to address those issues:

# SMALL = 5
# MEDIUM = 30
# LARGE = 1000
Mix.install([{:benchee, ">= 0.0.0"}])

wrap_mapset = fn {l1, l2} -> {l1, l2, MapSet.new(l1), MapSet.new(l2)} end

Benchee.run(
  %{
    "mapset_intersection" => fn({list_a, list_b, _, _}) -> MapSet.intersection(MapSet.new(list_a), MapSet.new(list_b)) end,
    "mapset_disjoint" => fn({list_a, list_b, _, _}) -> !MapSet.disjoint?(MapSet.new(list_a), MapSet.new(list_b)) end,
    "mapset_intersection_pre1" => fn({_, list_b, mapset_a, _}) -> MapSet.intersection(mapset_a, MapSet.new(list_b)) end,
    "mapset_disjoint_pre1" => fn({_, list_b, mapset_a, _}) -> !MapSet.disjoint?(mapset_a, MapSet.new(list_b)) end,
    "mapset_intersection_pre2" => fn({_, _, mapset_a, mapset_b}) -> MapSet.intersection(mapset_a, mapset_b) end,
    "mapset_disjoint_pre2" => fn({_, _, mapset_a, mapset_b}) -> !MapSet.disjoint?(mapset_a, mapset_b) end,
    "brute_force" => fn({list_a, list_b, _, _}) -> Enum.any?(list_a, &Enum.member?(list_b, &1)) end
  },
  formatters: [
    Benchee.Formatters.Console
  ],
  time: 10,
  memory_time: 2,
  inputs: %{
    # Small
    "Small, ordered & match" => wrap_mapset.({Enum.to_list(1..5), Enum.to_list(1..5)}),
    "Small, ordered & no-match"  => wrap_mapset.({Enum.to_list(1..5), Enum.to_list(5 + 1 .. 2 * 5)}),
    "Small, random & match" => wrap_mapset.({Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(1..5) |> Enum.shuffle}),
    "Small, random & no-match"  => wrap_mapset.({Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(5 + 1 .. 2 * 5) |> Enum.shuffle}),
    # Medium
    "Medium, ordered & match" => wrap_mapset.({Enum.to_list(1..30), Enum.to_list(1..30)}),
    "Medium, ordered & no-match"  => wrap_mapset.({Enum.to_list(1..30), Enum.to_list(30 + 1 .. 2 * 30)}),
    "Medium, random & match" => wrap_mapset.({Enum.to_list(1..30) |> Enum.shuffle, Enum.to_list(1..30) |> Enum.shuffle}),
    "Medium, random & no-match"  => wrap_mapset.({Enum.to_list(1..30) |> Enum.shuffle, Enum.to_list(30 + 1 .. 2 * 30) |> Enum.shuffle}),
    # Large
    "Large, ordered & match" => wrap_mapset.({Enum.to_list(1..1000), Enum.to_list(1..1000)}),
    "Large, ordered & no-match"  => wrap_mapset.({Enum.to_list(1..1000), Enum.to_list(1000 + 1 .. 2 * 1000)}),
    "Large, random & match" => wrap_mapset.({Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1..1000) |> Enum.shuffle}),
    "Large, random & no-match"  => wrap_mapset.({Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1000 + 1 .. 2 * 1000) |> Enum.shuffle}),
  }
)

which produces on my machine (1.14 / OTP 25):

##### With input Large, ordered & match #####
Name                               ips        average  deviation         median         99th %
brute_force                  2847.78 K        0.35 μs ±28004.15%       0.176 μs        0.34 μs
mapset_disjoint_pre2         1388.47 K        0.72 μs  ±7505.66%        0.50 μs        3.07 μs
mapset_intersection_pre2        6.76 K      147.83 μs    ±15.34%      142.76 μs      214.91 μs
mapset_disjoint_pre1            6.64 K      150.51 μs    ±17.07%      142.71 μs      226.43 μs
mapset_disjoint                 3.31 K      302.29 μs    ±15.82%      287.49 μs      430.78 μs
mapset_intersection_pre1        3.17 K      315.54 μs    ±14.86%      304.50 μs      452.83 μs
mapset_intersection             2.10 K      476.69 μs    ±10.97%      466.41 μs      648.73 μs

Comparison: 
brute_force                  2847.78 K
mapset_disjoint_pre2         1388.47 K - 2.05x slower +0.37 μs
mapset_intersection_pre2        6.76 K - 420.98x slower +147.48 μs
mapset_disjoint_pre1            6.64 K - 428.62x slower +150.16 μs
mapset_disjoint                 3.31 K - 860.84x slower +301.94 μs
mapset_intersection_pre1        3.17 K - 898.58x slower +315.19 μs
mapset_intersection             2.10 K - 1357.51x slower +476.34 μs

Memory usage statistics:

Name                        Memory usage
brute_force                    0.0469 KB
mapset_disjoint_pre2             1.05 KB - 22.50x memory usage +1.01 KB
mapset_intersection_pre2        83.74 KB - 1786.50x memory usage +83.70 KB
mapset_disjoint_pre1            30.12 KB - 642.50x memory usage +30.07 KB
mapset_disjoint                 49.95 KB - 1065.50x memory usage +49.90 KB
mapset_intersection_pre1       112.78 KB - 2406.00x memory usage +112.73 KB
mapset_intersection            132.63 KB - 2829.50x memory usage +132.59 KB

**All measurements for memory usage were the same**

##### With input Large, ordered & no-match #####
Name                               ips        average  deviation         median         99th %
mapset_intersection_pre2       37.17 K       26.90 μs    ±27.51%       25.76 μs       56.63 μs
mapset_disjoint_pre2           25.80 K       38.76 μs    ±23.63%       36.61 μs       67.08 μs
mapset_intersection_pre1        5.41 K      184.70 μs    ±12.42%      175.12 μs      273.28 μs
mapset_disjoint_pre1            5.08 K      196.84 μs    ±12.90%      186.64 μs      293.61 μs
mapset_intersection             2.94 K      340.10 μs    ±11.61%      323.11 μs      474.96 μs
mapset_disjoint                 2.88 K      347.06 μs    ±11.30%      332.89 μs      499.21 μs
brute_force                     0.23 K     4306.95 μs    ±12.21%     4084.56 μs     5740.68 μs

Comparison: 
mapset_intersection_pre2       37.17 K
mapset_disjoint_pre2           25.80 K - 1.44x slower +11.86 μs
mapset_intersection_pre1        5.41 K - 6.86x slower +157.80 μs
mapset_disjoint_pre1            5.08 K - 7.32x slower +169.93 μs
mapset_intersection             2.94 K - 12.64x slower +313.20 μs
mapset_disjoint                 2.88 K - 12.90x slower +320.16 μs
brute_force                     0.23 K - 160.08x slower +4280.04 μs

Memory usage statistics:

Name                             average  deviation         median         99th %
mapset_intersection_pre2        15.72 KB     ±0.00%       15.72 KB       15.72 KB
mapset_disjoint_pre2            31.32 KB     ±0.00%       31.32 KB       31.32 KB
mapset_intersection_pre1        44.84 KB     ±0.00%       44.84 KB       44.84 KB
mapset_disjoint_pre1            36.18 KB     ±0.35%       36.16 KB       36.55 KB
mapset_intersection             64.55 KB     ±0.00%       64.55 KB       64.55 KB
mapset_disjoint                 80.15 KB     ±0.00%       80.15 KB       80.15 KB
brute_force                    0.0469 KB     ±0.00%      0.0469 KB      0.0469 KB

Comparison: 
mapset_intersection_pre2        15.72 KB
mapset_disjoint_pre2            31.32 KB - 1.99x memory usage +15.60 KB
mapset_intersection_pre1        44.84 KB - 2.85x memory usage +29.13 KB
mapset_disjoint_pre1            36.18 KB - 2.30x memory usage +20.46 KB
mapset_intersection             64.55 KB - 4.11x memory usage +48.83 KB
mapset_disjoint                 80.15 KB - 5.10x memory usage +64.43 KB
brute_force                    0.0469 KB - 0.00x memory usage -15.67188 KB

##### With input Large, random & match #####
Name                               ips        average  deviation         median         99th %
mapset_disjoint_pre2         1392.32 K        0.72 μs  ±7655.04%        0.50 μs        3.07 μs
brute_force                   404.44 K        2.47 μs  ±1072.32%        2.24 μs        3.18 μs
mapset_disjoint_pre1            6.71 K      149.08 μs    ±42.15%      141.30 μs      232.94 μs
mapset_intersection_pre2        6.62 K      151.13 μs    ±12.42%      144.97 μs      223.85 μs
mapset_disjoint                 3.30 K      302.85 μs    ±12.29%      287.76 μs      439.65 μs
mapset_intersection_pre1        3.15 K      317.89 μs    ±86.85%      302.77 μs      483.75 μs
mapset_intersection             2.11 K      473.18 μs    ±10.49%      463.65 μs      652.93 μs

Comparison: 
mapset_disjoint_pre2         1392.32 K
brute_force                   404.44 K - 3.44x slower +1.75 μs
mapset_disjoint_pre1            6.71 K - 207.56x slower +148.36 μs
mapset_intersection_pre2        6.62 K - 210.42x slower +150.41 μs
mapset_disjoint                 3.30 K - 421.67x slower +302.13 μs
mapset_intersection_pre1        3.15 K - 442.61x slower +317.18 μs
mapset_intersection             2.11 K - 658.82x slower +472.46 μs

Memory usage statistics:

Name                        Memory usage
mapset_disjoint_pre2             1.05 KB
brute_force                    0.0469 KB - 0.04x memory usage -1.00781 KB
mapset_disjoint_pre1            30.12 KB - 28.56x memory usage +29.06 KB
mapset_intersection_pre2        83.74 KB - 79.40x memory usage +82.69 KB
mapset_disjoint                 49.95 KB - 47.36x memory usage +48.89 KB
mapset_intersection_pre1       112.78 KB - 106.93x memory usage +111.73 KB
mapset_intersection            132.63 KB - 125.76x memory usage +131.58 KB

**All measurements for memory usage were the same**

##### With input Large, random & no-match #####
Name                               ips        average  deviation         median         99th %
mapset_intersection_pre2       37.17 K       26.90 μs    ±25.88%       25.81 μs       56.44 μs
mapset_disjoint_pre2           25.87 K       38.65 μs    ±29.75%       36.54 μs       69.16 μs
mapset_intersection_pre1        5.53 K      180.83 μs    ±11.82%      171.86 μs      263.88 μs
mapset_disjoint_pre1            5.09 K      196.45 μs    ±12.23%      186.31 μs      287.30 μs
mapset_intersection             2.93 K      341.75 μs    ±11.96%      324.39 μs      487.32 μs
mapset_disjoint                 2.85 K      350.99 μs    ±11.14%      336.21 μs      491.05 μs
brute_force                     0.24 K     4250.21 μs    ±11.08%     4029.24 μs     5646.10 μs

Comparison: 
mapset_intersection_pre2       37.17 K
mapset_disjoint_pre2           25.87 K - 1.44x slower +11.74 μs
mapset_intersection_pre1        5.53 K - 6.72x slower +153.93 μs
mapset_disjoint_pre1            5.09 K - 7.30x slower +169.55 μs
mapset_intersection             2.93 K - 12.70x slower +314.85 μs
mapset_disjoint                 2.85 K - 13.05x slower +324.08 μs
brute_force                     0.24 K - 157.98x slower +4223.30 μs

Memory usage statistics:

Name                             average  deviation         median         99th %
mapset_intersection_pre2        15.72 KB     ±0.00%       15.72 KB       15.72 KB
mapset_disjoint_pre2            31.32 KB     ±0.00%       31.32 KB       31.32 KB
mapset_intersection_pre1        44.84 KB     ±0.00%       44.84 KB       44.84 KB
mapset_disjoint_pre1            36.18 KB     ±0.34%       36.16 KB       36.54 KB
mapset_intersection             64.55 KB     ±0.00%       64.55 KB       64.55 KB
mapset_disjoint                 80.15 KB     ±0.00%       80.15 KB       80.15 KB
brute_force                    0.0469 KB     ±0.00%      0.0469 KB      0.0469 KB

Comparison: 
mapset_intersection_pre2        15.72 KB
mapset_disjoint_pre2            31.32 KB - 1.99x memory usage +15.60 KB
mapset_intersection_pre1        44.84 KB - 2.85x memory usage +29.13 KB
mapset_disjoint_pre1            36.18 KB - 2.30x memory usage +20.46 KB
mapset_intersection             64.55 KB - 4.11x memory usage +48.83 KB
mapset_disjoint                 80.15 KB - 5.10x memory usage +64.43 KB
brute_force                    0.0469 KB - 0.00x memory usage -15.67188 KB

##### With input Medium, ordered & match #####
Name                               ips        average  deviation         median         99th %
brute_force                     3.51 M      284.95 ns ±24465.15%         173 ns         251 ns
mapset_disjoint_pre2            1.70 M      588.35 ns  ±9653.26%         249 ns        2998 ns
mapset_disjoint_pre1            1.22 M      817.22 ns  ±6922.87%         547 ns        3247 ns
mapset_disjoint                 1.11 M      901.48 ns  ±4319.02%         729 ns        1294 ns
mapset_intersection_pre2        0.38 M     2613.75 ns  ±1580.08%        2365 ns        3605 ns
mapset_intersection_pre1        0.34 M     2903.53 ns   ±803.65%        2600 ns        4392 ns
mapset_intersection             0.31 M     3192.43 ns   ±780.39%        2866 ns        5175 ns

Comparison: 
brute_force                     3.51 M
mapset_disjoint_pre2            1.70 M - 2.06x slower +303.40 ns
mapset_disjoint_pre1            1.22 M - 2.87x slower +532.27 ns
mapset_disjoint                 1.11 M - 3.16x slower +616.53 ns
mapset_intersection_pre2        0.38 M - 9.17x slower +2328.80 ns
mapset_intersection_pre1        0.34 M - 10.19x slower +2618.59 ns
mapset_intersection             0.31 M - 11.20x slower +2907.48 ns

Memory usage statistics:

Name                        Memory usage
brute_force                    0.0469 KB
mapset_disjoint_pre2           0.0391 KB - 0.83x memory usage -0.00781 KB
mapset_disjoint_pre1             1.55 KB - 33.17x memory usage +1.51 KB
mapset_disjoint                  2.13 KB - 45.50x memory usage +2.09 KB
mapset_intersection_pre2         2.21 KB - 47.17x memory usage +2.16 KB
mapset_intersection_pre1         2.79 KB - 59.50x memory usage +2.74 KB
mapset_intersection              3.37 KB - 71.83x memory usage +3.32 KB

**All measurements for memory usage were the same**

##### With input Medium, ordered & no-match #####
Name                               ips        average  deviation         median         99th %
mapset_intersection_pre2     1220.56 K        0.82 μs  ±3928.60%        0.70 μs        1.09 μs
mapset_intersection_pre1      942.86 K        1.06 μs  ±4223.21%        0.92 μs        1.51 μs
mapset_intersection           751.87 K        1.33 μs  ±2354.74%        1.21 μs        2.01 μs
mapset_disjoint_pre2          725.23 K        1.38 μs  ±2776.98%        1.23 μs        1.87 μs
mapset_disjoint_pre1          598.81 K        1.67 μs  ±2577.99%        1.47 μs        2.45 μs
mapset_disjoint               537.13 K        1.86 μs  ±1549.84%        1.64 μs        2.77 μs
brute_force                   492.79 K        2.03 μs  ±1735.52%        1.82 μs        2.54 μs

Comparison: 
mapset_intersection_pre2     1220.56 K
mapset_intersection_pre1      942.86 K - 1.29x slower +0.24 μs
mapset_intersection           751.87 K - 1.62x slower +0.51 μs
mapset_disjoint_pre2          725.23 K - 1.68x slower +0.56 μs
mapset_disjoint_pre1          598.81 K - 2.04x slower +0.85 μs
mapset_disjoint               537.13 K - 2.27x slower +1.04 μs
brute_force                   492.79 K - 2.48x slower +1.21 μs

Memory usage statistics:

Name                        Memory usage
mapset_intersection_pre2         0.56 KB
mapset_intersection_pre1         1.14 KB - 2.03x memory usage +0.58 KB
mapset_intersection              1.72 KB - 3.06x memory usage +1.16 KB
mapset_disjoint_pre2           0.0391 KB - 0.07x memory usage -0.52344 KB
mapset_disjoint_pre1             1.55 KB - 2.76x memory usage +0.99 KB
mapset_disjoint                  2.13 KB - 3.79x memory usage +1.57 KB
brute_force                    0.0469 KB - 0.08x memory usage -0.51563 KB

**All measurements for memory usage were the same**

##### With input Medium, random & match #####
Name                               ips        average  deviation         median         99th %
brute_force                  3255.81 K        0.31 μs ±22401.87%       0.196 μs        0.29 μs
mapset_disjoint_pre2         1708.52 K        0.59 μs  ±9133.31%        0.25 μs        3.01 μs
mapset_disjoint_pre1          605.86 K        1.65 μs  ±2248.63%        1.50 μs        2.45 μs
mapset_intersection_pre2      380.21 K        2.63 μs  ±1541.61%        2.37 μs        3.65 μs
mapset_disjoint               347.02 K        2.88 μs  ±1389.02%        2.53 μs        5.01 μs
mapset_intersection_pre1      263.65 K        3.79 μs   ±607.07%        3.49 μs        5.54 μs
mapset_intersection           196.51 K        5.09 μs   ±526.02%        4.66 μs        8.28 μs

Comparison: 
brute_force                  3255.81 K
mapset_disjoint_pre2         1708.52 K - 1.91x slower +0.28 μs
mapset_disjoint_pre1          605.86 K - 5.37x slower +1.34 μs
mapset_intersection_pre2      380.21 K - 8.56x slower +2.32 μs
mapset_disjoint               347.02 K - 9.38x slower +2.57 μs
mapset_intersection_pre1      263.65 K - 12.35x slower +3.49 μs
mapset_intersection           196.51 K - 16.57x slower +4.78 μs

Memory usage statistics:

Name                        Memory usage
brute_force                    0.0469 KB
mapset_disjoint_pre2           0.0391 KB - 0.83x memory usage -0.00781 KB
mapset_disjoint_pre1             1.55 KB - 33.17x memory usage +1.51 KB
mapset_intersection_pre2         2.21 KB - 47.17x memory usage +2.16 KB
mapset_disjoint                  2.13 KB - 45.50x memory usage +2.09 KB
mapset_intersection_pre1         2.79 KB - 59.50x memory usage +2.74 KB
mapset_intersection              3.37 KB - 71.83x memory usage +3.32 KB

**All measurements for memory usage were the same**

##### With input Medium, random & no-match #####
Name                               ips        average  deviation         median         99th %
mapset_intersection_pre2     1230.57 K        0.81 μs  ±4179.38%        0.70 μs        1.07 μs
mapset_disjoint_pre2          728.62 K        1.37 μs  ±2683.67%        1.22 μs        1.88 μs
mapset_intersection_pre1      578.31 K        1.73 μs  ±1529.87%        1.56 μs        2.53 μs
brute_force                   486.68 K        2.05 μs  ±1601.31%        1.82 μs        2.59 μs
mapset_disjoint_pre1          429.29 K        2.33 μs  ±1131.22%        2.07 μs        3.43 μs
mapset_intersection           364.92 K        2.74 μs   ±280.48%        2.52 μs        4.38 μs
mapset_disjoint               301.09 K        3.32 μs  ±1042.22%        2.96 μs        5.33 μs

Comparison: 
mapset_intersection_pre2     1230.57 K
mapset_disjoint_pre2          728.62 K - 1.69x slower +0.56 μs
mapset_intersection_pre1      578.31 K - 2.13x slower +0.92 μs
brute_force                   486.68 K - 2.53x slower +1.24 μs
mapset_disjoint_pre1          429.29 K - 2.87x slower +1.52 μs
mapset_intersection           364.92 K - 3.37x slower +1.93 μs
mapset_disjoint               301.09 K - 4.09x slower +2.51 μs

Memory usage statistics:

Name                        Memory usage
mapset_intersection_pre2         0.56 KB
mapset_disjoint_pre2           0.0391 KB - 0.07x memory usage -0.52344 KB
mapset_intersection_pre1         1.14 KB - 2.03x memory usage +0.58 KB
brute_force                    0.0469 KB - 0.08x memory usage -0.51563 KB
mapset_disjoint_pre1             1.55 KB - 2.76x memory usage +0.99 KB
mapset_intersection              1.72 KB - 3.06x memory usage +1.16 KB
mapset_disjoint                  2.13 KB - 3.79x memory usage +1.57 KB

**All measurements for memory usage were the same**

##### With input Small, ordered & match #####
Name                               ips        average  deviation         median         99th %
mapset_disjoint_pre2            4.42 M      226.35 ns ±17726.05%         188 ns         352 ns
brute_force                     4.09 M      244.76 ns ±20625.22%         168 ns         246 ns
mapset_disjoint_pre1            2.41 M      415.38 ns ±16105.81%         318 ns         589 ns
mapset_intersection_pre2        2.08 M      480.55 ns ±14662.17%         335 ns         547 ns
mapset_intersection_pre1        1.70 M      588.42 ns  ±7675.81%         445 ns         806 ns
mapset_disjoint                 1.68 M      594.42 ns  ±8642.53%         438 ns         814 ns
mapset_intersection             1.24 M      808.77 ns  ±7649.32%         628 ns        1067 ns

Comparison: 
mapset_disjoint_pre2            4.42 M
brute_force                     4.09 M - 1.08x slower +18.41 ns
mapset_disjoint_pre1            2.41 M - 1.84x slower +189.03 ns
mapset_intersection_pre2        2.08 M - 2.12x slower +254.19 ns
mapset_intersection_pre1        1.70 M - 2.60x slower +362.06 ns
mapset_disjoint                 1.68 M - 2.63x slower +368.07 ns
mapset_intersection             1.24 M - 3.57x slower +582.42 ns

Memory usage statistics:

Name                        Memory usage
mapset_disjoint_pre2               200 B
brute_force                         48 B - 0.24x memory usage -152 B
mapset_disjoint_pre1               392 B - 1.96x memory usage +192 B
mapset_intersection_pre2           464 B - 2.32x memory usage +264 B
mapset_intersection_pre1           656 B - 3.28x memory usage +456 B
mapset_disjoint                    584 B - 2.92x memory usage +384 B
mapset_intersection                848 B - 4.24x memory usage +648 B

**All measurements for memory usage were the same**

##### With input Small, ordered & no-match #####
Name                               ips        average  deviation         median         99th %
mapset_intersection_pre2        3.44 M      290.96 ns ±12444.20%         237 ns         388 ns
mapset_disjoint_pre2            3.33 M      300.19 ns ±13628.08%         264 ns         533 ns
brute_force                     2.69 M      372.15 ns ±15736.37%         274 ns         537 ns
mapset_disjoint_pre1            2.18 M      458.40 ns  ±7684.04%         390 ns         736 ns
mapset_intersection_pre1        2.01 M      498.74 ns ±14617.80%         345 ns         664 ns
mapset_intersection             1.49 M      670.07 ns  ±8514.25%         508 ns         892 ns
mapset_disjoint                 1.46 M      685.56 ns  ±8019.31%         526 ns         951 ns

Comparison: 
mapset_intersection_pre2        3.44 M
mapset_disjoint_pre2            3.33 M - 1.03x slower +9.23 ns
brute_force                     2.69 M - 1.28x slower +81.19 ns
mapset_disjoint_pre1            2.18 M - 1.58x slower +167.44 ns
mapset_intersection_pre1        2.01 M - 1.71x slower +207.78 ns
mapset_intersection             1.49 M - 2.30x slower +379.11 ns
mapset_disjoint                 1.46 M - 2.36x slower +394.60 ns

Memory usage statistics:

Name                        Memory usage
mapset_intersection_pre2           176 B
mapset_disjoint_pre2               200 B - 1.14x memory usage +24 B
brute_force                         48 B - 0.27x memory usage -128 B
mapset_disjoint_pre1               392 B - 2.23x memory usage +216 B
mapset_intersection_pre1           368 B - 2.09x memory usage +192 B
mapset_intersection                560 B - 3.18x memory usage +384 B
mapset_disjoint                    584 B - 3.32x memory usage +408 B

**All measurements for memory usage were the same**

##### With input Small, random & match #####
Name                               ips        average  deviation         median         99th %
mapset_disjoint_pre2            4.41 M      226.60 ns ±19724.56%         186 ns         357 ns
brute_force                     4.09 M      244.72 ns ±21042.87%         166 ns         246 ns
mapset_disjoint_pre1            2.29 M      436.83 ns ±15800.24%         337 ns         630 ns
mapset_intersection_pre2        2.16 M      462.65 ns ±14581.35%         329 ns         514 ns
mapset_intersection_pre1        1.63 M      615.02 ns  ±6928.59%         472 ns         856 ns
mapset_disjoint                 1.58 M      634.02 ns  ±8216.57%         478 ns         851 ns
mapset_intersection             1.19 M      843.24 ns  ±7105.41%         663 ns        1109 ns

Comparison: 
mapset_disjoint_pre2            4.41 M
brute_force                     4.09 M - 1.08x slower +18.11 ns
mapset_disjoint_pre1            2.29 M - 1.93x slower +210.23 ns
mapset_intersection_pre2        2.16 M - 2.04x slower +236.05 ns
mapset_intersection_pre1        1.63 M - 2.71x slower +388.42 ns
mapset_disjoint                 1.58 M - 2.80x slower +407.42 ns
mapset_intersection             1.19 M - 3.72x slower +616.63 ns

Memory usage statistics:

Name                        Memory usage
mapset_disjoint_pre2               200 B
brute_force                         48 B - 0.24x memory usage -152 B
mapset_disjoint_pre1               392 B - 1.96x memory usage +192 B
mapset_intersection_pre2           464 B - 2.32x memory usage +264 B
mapset_intersection_pre1           656 B - 3.28x memory usage +456 B
mapset_disjoint                    584 B - 2.92x memory usage +384 B
mapset_intersection                848 B - 4.24x memory usage +648 B

**All measurements for memory usage were the same**

##### With input Small, random & no-match #####
Name                               ips        average  deviation         median         99th %
mapset_intersection_pre2        3.41 M      293.45 ns ±13074.01%         237 ns         375 ns
mapset_disjoint_pre2            3.30 M      302.69 ns ±13233.95%         267 ns         569 ns
brute_force                     2.73 M      366.67 ns ±15029.28%         274 ns         536 ns
mapset_disjoint_pre1            2.03 M      492.48 ns  ±7245.82%         424 ns         793 ns
mapset_intersection_pre1        1.86 M      537.54 ns ±14331.05%         374 ns         709 ns
mapset_intersection             1.38 M      724.38 ns  ±7804.83%         568 ns         967 ns
mapset_disjoint                 1.38 M      725.41 ns  ±7598.62%         569 ns         996 ns

Comparison: 
mapset_intersection_pre2        3.41 M
mapset_disjoint_pre2            3.30 M - 1.03x slower +9.25 ns
brute_force                     2.73 M - 1.25x slower +73.23 ns
mapset_disjoint_pre1            2.03 M - 1.68x slower +199.04 ns
mapset_intersection_pre1        1.86 M - 1.83x slower +244.09 ns
mapset_intersection             1.38 M - 2.47x slower +430.93 ns
mapset_disjoint                 1.38 M - 2.47x slower +431.97 ns

Memory usage statistics:

Name                        Memory usage
mapset_intersection_pre2           176 B
mapset_disjoint_pre2               200 B - 1.14x memory usage +24 B
brute_force                         48 B - 0.27x memory usage -128 B
mapset_disjoint_pre1               392 B - 2.23x memory usage +216 B
mapset_intersection_pre1           368 B - 2.09x memory usage +192 B
mapset_intersection                560 B - 3.18x memory usage +384 B
mapset_disjoint                    584 B - 3.32x memory usage +408 B

**All measurements for memory usage were the same**
3 Likes

Since we are trying to check the performance of finding if any member of a list exist in another list, I don’t think is correct to take away the time it takes to build a secondary data structure to help you perform faster access.

I don’t have time now to repeat the process, but maybe something like this will be the best solution:

def list_collide?(list_a, list_b) do
  mapset_b = MapSet.new(list_b)
  Enum.any?(list_a, & MapSet.member?(mapset_b, &1))
end
2 Likes

hehe testing performance is unexpectedly a lot of fun :slight_smile:

Clearly using MapSet.intersection/2 is the wrong function, MapSet.disjoint?/2 is the correct one.

Here’s the new test & results

defmodule SpeedTest do
  def list_collide?(list_a, list_b) do
    mapset_b = MapSet.new(list_b)
    Enum.any?(list_a, & MapSet.member?(mapset_b, &1))
  end
end

Benchee.run(
  %{
    "mapset_disjoint" => fn({list_a, list_b}) -> MapSet.disjoint?(MapSet.new(list_a), MapSet.new(list_b)) end,
    "mix_strategy" => fn({list_a, list_b}) -> SpeedTest.list_collide?(list_a, list_b) end,
    "brute_force" => fn({list_a, list_b}) -> Enum.any?(list_a, &Enum.member?(list_b, &1)) end
  },
  formatters: [Benchee.Formatters.Console],
  time: 10,
  memory_time: 2,
  inputs: %{
    "Small, random & no-match"  => {Enum.to_list(1..5) |> Enum.shuffle, Enum.to_list(5 + 1 .. 2 * 5) |> Enum.shuffle},
    "Large, random & no-match"  => {Enum.to_list(1..1000) |> Enum.shuffle, Enum.to_list(1000 + 1 .. 2 * 1000) |> Enum.shuffle},
  }
)
##### With input Large, random & no-match #####
Name                      ips        average  deviation         median         99th %
mix_strategy           6.95 K      143.86 μs    ±13.47%      139.51 μs      192.21 μs
mapset_disjoint        3.80 K      263.03 μs     ±8.10%      265.75 μs      342.80 μs
brute_force            0.33 K     3027.75 μs    ±22.07%     2928.87 μs     4321.17 μs

Comparison:
mix_strategy           6.95 K
mapset_disjoint        3.80 K - 1.83x slower +119.16 μs
brute_force            0.33 K - 21.05x slower +2883.89 μs

Memory usage statistics:

Name                    average  deviation         median         99th %
mix_strategy            1.25 KB     ±0.00%        1.25 KB        1.25 KB
mapset_disjoint        44.50 KB     ±0.39%       44.52 KB       44.80 KB
brute_force           0.0469 KB     ±0.00%      0.0469 KB      0.0469 KB

Comparison:
mix_strategy            1.25 KB
mapset_disjoint        44.50 KB - 35.60x memory usage +43.25 KB
brute_force           0.0469 KB - 0.04x memory usage -1.20313 KB

##### With input Small, random & no-match #####
Name                      ips        average  deviation         median         99th %
brute_force            3.75 M      266.92 ns ±18415.96%         195 ns         499 ns
mix_strategy           2.47 M      404.27 ns ±11984.13%         286 ns         588 ns
mapset_disjoint        2.11 M      473.68 ns  ±6384.18%         366 ns        1730 ns

Comparison:
brute_force            3.75 M
mix_strategy           2.47 M - 1.51x slower +137.34 ns
mapset_disjoint        2.11 M - 1.77x slower +206.76 ns

Memory usage statistics:

Name               Memory usage
brute_force                48 B
mix_strategy              240 B - 5.00x memory usage +192 B
mapset_disjoint           584 B - 12.17x memory usage +536 B

**All measurements for memory usage were the same**
2 Likes