How to aggregating the value of given key in a list of maps in an efficient way

Can anyone give me a recommendation on how to do this better? Is there any build in functions to Enum/List/Map/Etc I may not know about to do this efficiently and in a more readable way?

I have a data structure similar to this

    [
        %{customer_id: "123456", amount: %Money{amount: 332211, currency: :USD}},
        %{customer_id: "234567", amount: %Money{amount: 321123, currency: :USD}},
        %{customer_id: "345678", amount: %Money{amount: 123321, currency: :USD}},
        %{customer_id: "123456", amount: %Money{amount: 112233, currency: :USD}}
    ]

and I want to aggregate the amounts based on customer_id so I end up with the following

    [
        %{customer_id: "123456", amount: %Money{amount: 444444, currency: :USD}}, # No longer duplicated
        %{customer_id: "234567", amount: %Money{amount: 321123, currency: :USD}},
        %{customer_id: "345678", amount: %Money{amount: 123321, currency: :USD}}
    ]

The way I am doing it now is

defmodule Aggregator do
  def aggregate_entries(entries) do
    res = entries
    |> Enum.reduce(%{}, fn entry, acc ->
      current_id = Map.get(entry, :customer_id)
      value_in_accumulator = Map.get(acc, String.to_atom(current_id))
      accumulate_updated_amount(acc, entry, current_id, value_in_accumulator)
    end)
    |> Enum.reduce([], fn {id, amount}, acc ->
      [%{customer_id: id, amount: Money.new(amount, :USD)} | acc]
    end)
  end

  defp accumulate_updated_amount(acc, entry, current_id, value_in_accumulator) do
    if !is_nil(value_in_accumulator) do
      aggregate_money_for_entry(entry, value_in_accumulator, acc)
    else
      acc |> Map.put(String.to_atom(current_id), Map.get(entry, :amount).amount)
    end
  end

  defp aggregate_money_for_entry(entry, value_in_accumulator, acc) do
    acc |> Map.replace!(String.to_atom(entry.customer_id), entry.amount.amount + value_in_accumulator)
  end
end
iex(1)> defmodule Money do                                                               
...(1)> defstruct [amount: nil, currency: nil]
...(1)> end
iex(2)> entries = [                                                                      
...(2)>         %{customer_id: "123456", amount: %Money{amount: 332211, currency: :USD}},
...(2)>         %{customer_id: "234567", amount: %Money{amount: 321123, currency: :USD}},
...(2)>         %{customer_id: "345678", amount: %Money{amount: 123321, currency: :USD}},
...(2)>         %{customer_id: "123456", amount: %Money{amount: 112233, currency: :USD}}
...(2)>     ]
iex(3)> entries |> 
Enum.group_by(&Map.get(&1, :customer_id)) |> 
Enum.map(fn {k, list} -> Enum.reduce(list, %{customer_id: k}, fn %{amount: amount}, acc -> Map.update(acc, :amount, amount, & %{&1 | amount: &1.amount + amount.amount}) end) end)
[
  %{amount: %Money{amount: 444444, currency: :USD}, customer_id: "123456"},
  %{amount: %Money{amount: 321123, currency: :USD}, customer_id: "234567"},
  %{amount: %Money{amount: 123321, currency: :USD}, customer_id: "345678"}
]

As it is not really readable, Here it is formatted…

entries 
|> Enum.group_by(&Map.get(&1, :customer_id)) 
|> Enum.map(fn {k, list} -> 
  Enum.reduce(list, %{customer_id: k}, fn %{amount: amount}, acc 
    -> Map.update(acc, :amount, amount, & %{&1 | amount: &1.amount + amount.amount}) 
  end) 
end)
3 Likes

The simplest way may be to first group by customer_id and then reduce:

defmodule Accumulate do
  @list  [
        %{customer_id: "123456", amount: %Money{amount: 332211, currency: :USD}},
        %{customer_id: "234567", amount: %Money{amount: 321123, currency: :USD}},
        %{customer_id: "345678", amount: %Money{amount: 123321, currency: :USD}},
        %{customer_id: "123456", amount: %Money{amount: 112233, currency: :USD}}
    ]

  def list do
    @list
    |> Enum.group_by(fn i -> Map.get(i, :customer_id) end, fn c -> Map.get(c, :amount) end)
    |> Enum.map(fn {c, values} -> {c, sum_values(values)} end)
    |> Map.new
  end

  def sum_values(values) do
    acc = Enum.reduce(values, 0, fn v, acc ->
      acc + v.amount
    end)

    %Money{currency: hd(values).currency, amount: acc}
  end
end

Produces:

iex> Accumulate.list
%{
  "123456" => %Money{amount: 444444, currency: :USD},
  "234567" => %Money{amount: 321123, currency: :USD},
  "345678" => %Money{amount: 123321, currency: :USD}
}

Note that like in your code, I ignore the currency code which is not a good idea in practise.

2 Likes

Nice, but You return a Map, instead of a List :slight_smile:

Still iterable like a list, or pass it into Enum.into() :slight_smile:

1 Like

accumulate(list, &sum_values/1)

That way a more currency aware function can be injected at a later time.

1 Like

I like this response since it it more readable than what I had before. I benchmarked it against mine and speed differences seem to be nominal. I changed list/1 to be

entries
|> Enum.group_by(fn i -> Map.get(i, :customer_id) end, fn c -> Map.get(c, :amount) end)
|> Enum.map(fn {c, values} -> {c, sum_values(values)} end)
|> Enum.map(fn {id, amount} ->
  %{customer_id: id, amount: amount}
end)

so that it gives me the same data structure I had (it makes it easier to work with later on in this format). I will probably add @peerreynders suggestion in as well. Thanks for the replies.

2 Likes

Here is my version:

defmodule Money do
  defstruct [:amount, :currency]

  @type currency :: String.t()
  @type t :: %__MODULE__{amount: :integer, currency: currency}
end

defmodule Example do
  def sample(data),
    do: data |> Enum.group_by(& &1[:customer_id], & &1[:amount]) |> Enum.map(&do_sample/1)

  defp do_sample({customer_id, amounts}), do: %{customer_id: customer_id, amount: sum(amounts)}

  defp sum([%{amount: amount, currency: currency, __struct__: module} = head | tail]) do
    if Enum.all?(tail, &(&1.currency == currency)) do
      amount_sum = Enum.reduce(tail, amount, &(&1.amount + &2))
      Map.replace!(head, :amount, amount_sum)
    else
      name = module |> Module.split() |> Enum.join(".")
      raise "TODO: Write sum code for `t:[#{name}.t]` with different `t:#{name}.currency`."
    end
  end
end

data = [
  %{customer_id: "123456", amount: %Money{amount: 332_211, currency: :USD}},
  %{customer_id: "234567", amount: %Money{amount: 321_123, currency: :USD}},
  %{customer_id: "345678", amount: %Money{amount: 123_321, currency: :USD}},
  %{customer_id: "123456", amount: %Money{amount: 112_233, currency: :USD}}
]

Example.sample(data)

If you have multiple currency for same customer_id it would raise.
If you have multiple currency, but each customer_id have its own it would not raise.

In this example:
Changing 1st or 4th entry’s currency would raise.
Changing any or both 2nd and 3rd entry’s currency would not raise.

1 Like

This is possibly presuming too much.

I want to aggregate the amounts based on customer_id

The simplest interpretation (rather than simplest implementation) is that amounts with differing currencies cannot be combined. Given there is no hard and fast rule that there is only one map per customer ID (they’re in a list, not a map) in the output it is just as likely that you’ll have one map per customer/currency.

i.e. it’s time to go back and ask some questions.

My earlier comment was based on the presumption that there is a default currency and that conversion will occur. But that could also be misguided as there could be separate currency accounts per customer in order to avoid conversion losses.

The more generic solution is to only aggregate “like things”, e.g,

    [
        %{customer_id: "234567", amount: %Money{amount: 321123, currency: :USD}},
        %{customer_id: "345678", amount: %Money{amount: 123321, currency: :USD}},
        %{customer_id: "123456", amount: %Money{amount: 332211, currency: :EUR}},
        %{customer_id: "123456", amount: %Money{amount: 112233, currency: :USD}}
    ]

At the current time everything is defaulted to :USD however I want to leave it open to handle different types later on. That is why I liked the idea you had about passing in a sum_values/1 function that would allow the caller to determine how to handle it.

Yes, but the code that is using that function is only prepared to take a single %Money{} value. When supporting multiple currencies getting a %{currency, money} would be more flexible. That could be either dropped directly into %{customer, accounts} or split into multiple %{customer, amount}.

1 Like

For sure I’m not working like advanced AI which trying to interprets something on its own. From my experience assuming too much things gives you lots of misunderstand problems in future. My solutions are mostly as strict as compiler. If I see edge-case and its not defined in description I’m just giving a warning in some way. Here I give simple raise code which allows to understand why my code stopped working.

I saw currency and just got “red light”. As I have learned … really often some things are described too less and changing them later after lots of made assumptions costs you a lot of time - sometimes you even need to rewrite code. Giving a solution with clearly marked warning often forces author to rethink of some edge cases. If such edge case is not allowed then simply raise related code should be removed without any problem in future.

2 Likes

The Money library on hex.pm would probably be better to use for the actual ‘records’ of money as well. :slight_smile:

1 Like