Efficient double loop sum

Hello, I am learning Elixir and am currently trying to find a more efficient solution to this problem.

Given the example list a = [1, 2, 3, 4, 5], generate a new list which contains the UNIQUE sums of each member of a with every other member of a (including itself). For this example the result would be [2, 3, 4, 5, 6, 7, 8, 9, 10].

When dealing with large lists, around 7000 numbers, it takes about 80 seconds to run this on my computer:

sums =
  a
  |> Enum.reduce(%{}, fn x, sums_map ->
    # Loop over each number again
    a
    |> Enum.reduce(sums_map, fn y, acc ->
      # Get sum; the map keys keep the sums unique
      Map.put(acc, x + y, true)
    end)
  end)

uniq = Map.keys(sums)

I have tried using lists (with prepend) instead of a map, adding an Enum.uniq at the end to keep only unique numbers, but the performance seems to be a bit slower. Doing the same thing in Ruby runs in under 10 seconds. It looks like immutable state is causing me some issues, but I am sure it is really due to my lack of experience with Elixir.
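(For reference, the list-based attempt described above would look roughly like this; a reconstruction, since that code was not posted:)

# Build all 7000 x 7000 sums by prepending to a list, then deduplicate
sums =
  Enum.reduce(a, [], fn x, acc ->
    Enum.reduce(a, acc, fn y, acc2 -> [x + y | acc2] end)
  end)

uniq = Enum.uniq(sums)

Enum.uniq/1 then has to walk all 49 million elements, which is likely where much of the time goes.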

If you have any pointers, I would very much appreciate it.

Thank you,

Sly

How does it compare with using for comprehensions?

a = [1,2,3,4,5]
(for i <- a, j <- a, into: [], do: i + j) |> Enum.uniq
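A variant worth timing (not part of the original suggestion) collects straight into a MapSet, deduplicating while the sums are generated instead of in a separate Enum.uniq pass:

a = [1, 2, 3, 4, 5]
# Collecting into a MapSet removes duplicates as the sums are generated
sums = for i <- a, j <- a, into: MapSet.new(), do: i + j
MapSet.to_list(sums)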

Can you share the comparable Ruby code?


Hello,

@allyraza: thank you for the recommendations. The comprehension example is slower than the original attempt using the double Enum.reduce.

@gregvaughn: Please see the 3 options below. I ran each piece of code on Linux with “time COMMAND SCRIPT”

Ruby:

# Define range and convert to array
a = (1..7000).to_a

# Prepare result array
result = []

# First loop
a.each do |x|
  # Second loop (inner loop)
  a.each do |y|
    # Add to array
    result.push(x + y)
  end
end

# Keep only uniq
# (note: this takes the unique elements of the input array a, not of
#  result; that is why the count below is 7000)
sums = a.uniq

# Print count of sums
puts(sums.size)

7000

real 0m8.524s
user 0m7.989s
sys 0m0.308s

Elixir with Enum.reduce + Map

# Set range
a = 1..7000

# Perform the sum of the range with each element of itself
result =
  a
  |> Enum.reduce(%{}, fn x, acc ->
    a
    |> Enum.reduce(acc, fn y, acc2 ->
      # Add to map because this ensures uniqueness
      # (note: Map.put is given the outer accumulator acc, not acc2, so
      #  only the last inner sum for each x survives; hence the 7000 below)
      Map.put(acc, x + y, true)
    end)
  end)

# Extract keys (sums)
sums = Map.keys(result)

# Print count of sums
IO.inspect(Enum.count(sums))

7000

real 0m27.161s
user 0m26.230s
sys 0m0.176s

Elixir with comprehensions + [] (does not seem to produce the same count as the other two, for some reason)

# Set range
a = 1..7000

# Perform the sum of the range with each element of itself
result = for i <- a, j <- a, into: [], do: i + j

# Get unique sums
sums = result |> Enum.uniq()

# Print count of sums
IO.inspect(Enum.count(sums))

13999

real 4m55.373s
user 0m23.989s
sys 0m7.747s

Thank you very much for your time and help.

Sly

I had the following times:

  • Your Elixir code: 19.105s
  • Ruby: 5.851s
  • Optimized Elixir version: 3.567s

This is the code of the optimized version:

defmodule Test do
  def unique_sums(list) do
    # A set table deduplicates the sums for us, off the process heap
    table = :ets.new(:unique_sums, [:set])

    try do
      collect_unique_sums(list, table)
      # Match spec: select every stored {sum} tuple and return the bare sum
      :ets.select(table, [{{:"$1"}, [], [:"$1"]}])
    after
      :ets.delete(table)
    end
  end

  defp collect_unique_sums([], _table), do: :ok

  defp collect_unique_sums([el | rest], table) do
    # Pair el with itself, then with every element after it (no repetitions)
    :ets.insert(table, {el + el})
    Enum.each(rest, &:ets.insert(table, {&1 + el}))
    collect_unique_sums(rest, table)
  end
end

1..7000
|> Enum.to_list()
|> Test.unique_sums()

I did two things here:

  1. An algorithmic optimization: use combinations without repetition. I.e. in a list 1, 2, 3 we’ll only consider the pairs (1, 2), (1, 3), and (2, 3), plus each element paired with itself.
  2. Replaced the map with an ETS table, an off-heap structure which doesn’t generate any garbage (see the MapSet comparison sketch below).

If you apply the 1st optimization to the Ruby code, I suspect that Ruby would again be faster. The difference should be much smaller though.
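Purely for comparison (not something posted in the thread), here is a hypothetical in-heap variant of the same triangular algorithm that swaps the ETS table for a MapSet; timing it makes it easy to see how much of the win comes from each of the two changes:

defmodule TestMapSet do
  # Same combinations-without-repetition walk, but accumulating the
  # unique sums in an ordinary (heap-allocated) MapSet
  def unique_sums(list) do
    list
    |> collect(MapSet.new())
    |> MapSet.to_list()
  end

  defp collect([], acc), do: acc

  defp collect([el | rest], acc) do
    acc = MapSet.put(acc, el + el)
    acc = Enum.reduce(rest, acc, &MapSet.put(&2, &1 + el))
    collect(rest, acc)
  end
end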


And this can be :ets.insert(table, Enum.map(rest, &{&1 + el})), though either version produces 13999 results?
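Applied to the module above, that suggestion would make the recursive clause look something like this (a sketch; :ets.insert/2 accepts a list of tuples and inserts them all in one call):

defp collect_unique_sums([el | rest], table) do
  # One ETS call for the self-pair plus all the remaining pairs
  :ets.insert(table, [{el + el} | Enum.map(rest, &{&1 + el})])
  collect_unique_sums(rest, table)
end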


Is there a way to do this in parallel? I attempted it and then realized I was not including every combination, given the disjoint sets I used to split the work across Tasks.
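For what it’s worth, one way to keep every combination while parallelising with plain Tasks is to give each worker a suffix of the list rather than a disjoint chunk. A sketch (not from the thread):

list = Enum.to_list(1..7000)

# Build the suffixes [1..7000], [2..7000], ..., [7000]; each element only
# needs to be paired with itself and the elements after it.
suffixes =
  Stream.unfold(list, fn
    [] -> nil
    [_ | rest] = suffix -> {suffix, rest}
  end)

unique_sums =
  suffixes
  |> Task.async_stream(fn [el | _] = suffix -> MapSet.new(suffix, &(&1 + el)) end,
    ordered: false
  )
  |> Enum.reduce(MapSet.new(), fn {:ok, set}, acc -> MapSet.union(acc, set) end)
  |> MapSet.to_list()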

Yes, that would work too, and it even seems to reduce the running time a bit more. Good catch! 🙂

I usually avoid this approach though, because we need to create a potentially large list on the heap before moving it to the table. But since in this case the intermediate list is not that big, it works fine.

You can do it easily using Flow.

“When you have a hammer, everything looks like a nail.” He might be able to get the sums with Flow, but I think it is overkill for what he is trying to do.

Perhaps. But if he wants to do it, Flow will flow him to his solution.

@sasajuric’s solution times for me at ~2500ms. Just for interest:

defmodule Test do
  # Returns a closure that builds [value + 1, ..., value + top] for a value
  defp make_all_sums(top),
    do: &all_sums(&1, top, [])

  defp all_sums(_, 0, results),
    do: results

  defp all_sums(value, top, rest),
    do: all_sums(value, top - 1, [top + value | rest])

  def stream_run(top) do
    1..top
    |> Stream.flat_map(make_all_sums(top))
    |> Stream.uniq()
    |> Enum.to_list()
  end

  def enum_run(top) do
    1..top
    |> Enum.flat_map(make_all_sums(top))
    |> Enum.uniq()
  end

  def run(arg) do
    top = 7000

    fun =
      case arg do
        ["enum" | _] ->
          fn -> Test.enum_run(top) end

        _ ->
          fn -> Test.stream_run(top) end
      end

    {time, _} = :timer.tc(fun)
    IO.puts("Time: #{div(time, 1000)}ms")
  end
end

Test.run(System.argv())
$ elixir test1.ex
Time: 4872ms
$ elixir test1.ex enum
Time: 7192ms
$

Optimization:

(1+1) (1+2) (1+3) ... (1+7000)
[1+2] (2+2) (2+3) ... (2+7000)
[1+3] [2+3] (3+3) ... (3+7000)
...
[1+7000] ...       (7000+7000)
  defp all_sums(value, top, results) when top < value, # remaining sums are known duplicates
    do: results
  defp all_sums(value, top, rest),
    do: all_sums(value, top - 1, [top + value | rest])
$ elixir test1.ex
Time: 2362ms
$ elixir test1.ex enum
Time: 3550ms
$

Smart use of the guard, @peerreynders.

Because I was interested, and since my own take on using Task did not result in significant improvements, I tried using Flow. This brought further substantial gains in speed and memory usage.

The following is a comparison of the approaches from @sasajuric and @peerreynders with the Flow modifications. Benchee was used for benchmarking.

Name               ips        average  deviation         median         99th %
flow              1.46         0.68 s     ±3.28%         0.69 s         0.72 s
flow+ets          0.70         1.44 s     ±1.19%         1.44 s         1.46 s
stream            0.37         2.69 s     ±1.71%         2.69 s         2.75 s
sync              0.36         2.81 s     ±3.09%         2.81 s         2.91 s

Comparison:
flow              1.46
flow+ets          0.70 - 2.10x slower
stream            0.37 - 3.94x slower
sync              0.36 - 4.12x slower

Memory usage statistics:

Name             average  deviation         median         99th %
flow             8.35 MB     ±0.26%        8.34 MB        8.37 MB
flow+ets         0.22 MB     ±0.00%        0.22 MB        0.22 MB
stream         944.08 MB     ±0.00%      944.08 MB      944.08 MB
sync           374.68 MB     ±0.00%      374.68 MB      374.68 MB

Comparison:
flow             8.34 MB
flow+ets         0.22 MB - 0.03x memory usage
stream         944.08 MB - 113.16x memory usage
sync           374.68 MB - 44.91x memory usage

CODE

defmodule Test do
  defp make_all_sums(top),
    do: &all_sums(&1, top, [])

  defp all_sums(value, top, results) when top < value, # remaining sums are known duplicates
    do: results

  defp all_sums(value, top, rest),
    do: all_sums(value, top - 1, [top + value | rest])

  def stream_run(top) do
    1..top
    |> Stream.flat_map(make_all_sums(top))
    |> Stream.uniq()
    |> Enum.to_list()
  end

  def flow_run(top) do
    1..top
    |> Flow.from_enumerable()
    |> Flow.flat_map(make_all_sums(top))
    # Flow.uniq/1 deduplicates within each partition; the final
    # Enum.uniq/1 removes duplicates across partitions
    |> Flow.uniq()
    |> Enum.uniq()
  end

  defp make_all_sums_ets(top, table),
    do: &all_sums_ets(&1, top, table, [])

  # Remaining sums are known duplicates; the empty list is all that
  # Flow.flat_map sees — every real result goes into the shared ETS table
  defp all_sums_ets(value, top, _table, results) when top < value,
    do: results

  defp all_sums_ets(value, top, table, _rest) do
    :ets.insert(table, {top + value})
    all_sums_ets(value, top - 1, table, [])
  end

  def flow_ets_run(top) do
    table =
      :ets.new(:unique_sums, [
        :set,
        :public,
        read_concurrency: true,
        write_concurrency: true
      ])

    1..top
    |> Flow.from_enumerable()
    |> Flow.flat_map(make_all_sums_ets(top, table))
    # Run the flow purely for its side effects on the ETS table
    |> Enum.to_list()

    result = :ets.select(table, [{{:"$1"}, [], [:"$1"]}])
    # Delete the table so repeated benchmark runs don't leak ETS tables
    :ets.delete(table)
    result
  end

  def unique_sums(list) do
    table = :ets.new(:unique_sums, [:set])

    try do
      collect_unique_sums(list, table)
      :ets.select(table, [{{:"$1"}, [], [:"$1"]}])
    after
      :ets.delete(table)
    end
  end

  defp collect_unique_sums([], _table), do: :ok

  defp collect_unique_sums([el | rest], table) do
    :ets.insert(table, {el + el})
    Enum.each(rest, &:ets.insert(table, {&1 + el}))
    collect_unique_sums(rest, table)
  end

end
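The table above could have been produced with a Benchee invocation along these lines (a reconstruction; the exact configuration was not posted — the memory_time option enables the memory measurements):

top = 7000
list = Enum.to_list(1..top)

Benchee.run(
  %{
    "flow" => fn -> Test.flow_run(top) end,
    "flow+ets" => fn -> Test.flow_ets_run(top) end,
    "stream" => fn -> Test.stream_run(top) end,
    "sync" => fn -> Test.unique_sums(list) end
  },
  memory_time: 2
)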

The memory measurements for the flow cases are not accurate: Benchee measures only the main process’s memory use, and Flow uses several processes. The memory results are probably meaningless because of this. The general speed results are certainly all valid, though.


Good to know, thx @michalmuskala

Thank you everyone. Your help and guidance is really appreciated. I learned many things from this post.
