# Why should we iterate a list twice to pipe 2 aggregate functions?

I have structs `Company`, `Employee` and a list of struct values `employees`. Struct `Employee` contains 2 fields - `hours` and `bill`.

I need to know total hours and total bill. Should I iterate once (for loop?!) to calculate both together or should we do the following? One of my colleagues say the following is Elixir-y. Why is iterating multiple times encouraged when it is inefficient?

``````def calculate(company) do
company
|> calculate_total_hours()
|> calculate_total_bill()
end

def calculate_total_hours(company) do
total = company.employees
|> Enum.map(& &1.hours)
|> Enum.sum()

%{company | total_hours: total}
end

def calculate_total_bill(company) do
total = company.employees
|> Enum.map(& &1.bill)
|> Enum.sum()

%{company | total_bill: total}
end
``````

EDIT: Updated code

Because it’s easier to read operations that are cleanly decomposed into smaller parts, and because “inefficient” is relative. Traversing a list is a very cheap operation on the BEAM (it’s almost a Lisp under the hood).

It’s hard to say more about the specific example since it doesn’t work: `calculate_total_hours` returns a number but is piped into a function that expects a list, and it’s unclear where the total hours and total bill should be stored in `company`.

1 Like

“Make it work, then make it beautiful, then if you really, really have to, make it fast. 90 percent of the time, if you make it beautiful, it will already be fast. So really, just make it beautiful!”

4 Likes

I would likely write this as a single reduce, and I don’t think it suffers elixir-y-ness

``````{total_hours, total_bill} =
Enum.reduce(company.employees, {0, 0}, fn %{hours: h, bill: b}, {h_acc, b_acc} ->
{h_acc + h, b_acc + b}
end)
``````
10 Likes

Thanks. I made up the example and the code was not correct. Sorry about that. Updated.

I made up the example to capture the problem. Sorry that it was not a working code. I updated code.
The question remains the same - why iterate twice when I can achieve the same goal by iterating just once? In future, what should I answer when somebody coming from OOP background asks me?

Why not just use streams? They are lazy so as long as you only call Stream functions, you can chain many operations together and only evaluate them at the end for a single iteration. You can keep the multiple functions calls so you don’t have to mash them into one, and you can still iterate only once. Take a look at the paragraphs at the top of the Stream docs for a very similar example: https://hexdocs.pm/elixir/Stream.html

@jswny , each operation (calculate_total_hours and calculate_total_bill) needs to be evaluated. So, these 2 operations do not seem to be streamable.

I am curious how You would make it in one go…

If You have those 2 ways to solve this problem, use Benchee to check the speed of both. You might be surprised by the result

UPDATE: I fixed the repo path, I was not on the right tab

2 Likes

@kokolegorille , see @gregvaughn’s answer above that uses a single reduce function to calculate both in one go.

I saw once an Enum.reduce going slower than Enum.map, followed by Enum.into…

If you want only `calculate_total_hours` you can call that single function. It may not bee a good idea to bind two independent pieces of logic together, unless you are 100% sure that you will always need to call the two computations. And in that case I would just follow above advice: make it beautiful then move on.

1 Like

It would take very careful benchmarking to clearly show whether the single `reduce` is faster or slower than the two passes over the list. My back-of-the-envelope calculation:

Split loops, per element in the list:

• two “get rest of list”
• two map accesses (to get `bill` or `hours`)