Bin Counting in Elixir: Why is Enum.group_by much faster than Enum.reduce?

Bin-counting the elements of a collection by updating a counter map is much slower than first grouping the elements by bin and then counting each bin.

I would be curious to learn why. My intuition was the opposite, as the ‘reduce’ approach requires only one pass through the input collection.

Example:

iex> range = 1..1_000_000
iex> is_even? = fn x -> rem(x, 2) == 0 end

iex> Enum.reduce(range, %{}, fn elem, acc -> Map.update(acc, is_even?.(elem), 1, &(&1 + 1)) end)
%{false: 500000, true: 500000}

is much slower than

iex> Enum.group_by(range, &is_even?.(&1)) |> Map.new(fn {key, list} -> {key, length(list)} end)
%{false: 500000, true: 500000}
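
(For a quick comparison directly in iex, something like :timer.tc/1 can be used; it returns {microseconds, result}. How the timings above were actually taken is not shown, so this is only an illustration:)

iex> {us_reduce, _} = :timer.tc(fn ->
...>   Enum.reduce(range, %{}, fn elem, acc -> Map.update(acc, is_even?.(elem), 1, &(&1 + 1)) end)
...> end)
iex> {us_group, _} = :timer.tc(fn ->
...>   Enum.group_by(range, &is_even?.(&1)) |> Map.new(fn {key, list} -> {key, length(list)} end)
...> end)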

My best guess is that the map lookup is the factor that slows down the ‘reduce’ approach. Although you need to traverse the list only once, you need to perform a key lookup for every item. The group_by approach, on the other hand, might have the group lookup better optimized (although, off hand, I cannot think of a much better optimization than using a map to look up the group…).

On my system, however, the difference is not that significant: the ‘reduce’ approach is only about 1.2 times slower…
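
As an aside, if you are on Elixir 1.10 or later, Enum.frequencies_by/2 does this one-pass bin counting directly (as far as I can tell it is implemented with the same reduce-plus-Map.update pattern as the ‘reduce’ version above):

iex> Enum.frequencies_by(1..1_000_000, &(rem(&1, 2) == 0))
%{false: 500000, true: 500000}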

For reference, here’s the implementation of group_by:

  def group_by(enumerable, key_fun, value_fun \\ fn x -> x end)

  def group_by(enumerable, key_fun, value_fun) when is_function(key_fun) do
    reduce(reverse(enumerable), %{}, fn entry, acc ->
      key = key_fun.(entry)
      value = value_fun.(entry)

      case acc do
        %{^key => existing} -> Map.put(acc, key, [value | existing])
        %{} -> Map.put(acc, key, [value])
      end
    end)
  end

I too am surprised that this is faster, especially since group_by reverses the enumerable first (presumably so that prepending with [value | existing] still leaves each group in its original order), adding another full iteration.
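
Note that for a Range input, that reversal also materializes the whole range into a list before the grouping pass even starts:

iex> Enum.reverse(1..5)
[5, 4, 3, 2, 1]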

Hm… the following benchmark now shows a different result…

is_even? = fn x -> rem(x, 2) == 0 end

Benchee.run(
  %{
    "reduce" => fn range ->
      Enum.reduce(range, %{}, fn elem, acc -> Map.update(acc, is_even?.(elem), 1, &(&1 + 1)) end)
    end,
    "group" => fn range ->
      Enum.group_by(range, &is_even?.(&1)) |> Map.new(fn {key, list} -> {key, length(list)} end)
    end
  },
  inputs: %{
    "Range" => 1..1_000_000
  }
)

##### With input Range #####
Name             ips        average  deviation         median         99th %
reduce          4.65      215.08 ms    ±12.47%      209.29 ms      338.53 ms
group           2.47      405.21 ms    ±18.65%      411.79 ms      552.56 ms

Comparison:
reduce          4.65
group           2.47 - 1.88x slower +190.13 ms

So your original hunch might be correct after all.

Make sure you are measuring with something like Benchee and not measuring in iex. Code constructed in iex is interpreted, which means it will run slower.
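
A minimal sketch of a compiled setup, assuming a script file (say, bench.exs) run with `mix run bench.exs` in a project that has :benchee as a dependency:

# bench.exs (hypothetical filename)
# Defining the functions in a module means both loop bodies are
# compiled rather than interpreted, unlike closures typed into iex.
defmodule BinCount do
  def even?(x), do: rem(x, 2) == 0

  # One pass: update a counter map as we go.
  def by_reduce(range) do
    Enum.reduce(range, %{}, fn elem, acc ->
      Map.update(acc, even?(elem), 1, &(&1 + 1))
    end)
  end

  # Two passes: group into bins first, then count each bin.
  def by_group(range) do
    range
    |> Enum.group_by(&even?/1)
    |> Map.new(fn {key, list} -> {key, length(list)} end)
  end
end

Benchee.run(
  %{
    "reduce" => &BinCount.by_reduce/1,
    "group" => &BinCount.by_group/1
  },
  inputs: %{"Range" => 1..1_000_000}
)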

Indeed, it appears that when running in iex, the ‘group_by’ version gets an unfair advantage, probably because it gets to use a compiled reduce loop behind the scenes, while the ‘reduce’ version has an interpreted reduce loop.

Thank you for your help!
