This is driving me crazy, any explanation will be rewarded with a fresh beer here in Spain…
I’ve written a short macro to build tuples with more than two elements in parallel using Task. The macro is named ‘parallel’ so if you type:
s = 100
[:timer.tc(fn -> {:first, :timer.sleep(s), :timer.sleep(s)} end),
:timer.tc(fn -> parallel {:second, :timer.sleep(s), :timer.sleep(s)} end)
]
> [{201235, {:first, :ok, :ok}}, {104703, {:second, :ok, :ok}}]
Parallel version is faster (double) than sequential. In fact the parallel version is faster for any value of s greater than zero. For s=0 values are:
> [{11, {:first, :ok, :ok}}, {242, {:second, :ok, :ok}}]
So the overhead of the macro is about 230 millis. But, when i use paralell in the library I’m building (exun), performance degrades substantially. Is there something I’m missing? This test probes that for any operation over 1 millis for each element, the macro will produce benefits… or not?
The macro:
defmacro parallel({:{}, _attrs, lista}) do
task_list =
lista
|> Enum.map(fn calculus ->
{{:., [], [{:__aliases__, [alias: false], [:Task]}, :async]}, [],
[{:fn, [], [{:->, [], [[], calculus ]}]}]}
end)
quote do
unquote(task_list) |> Task.await_many() |> List.to_tuple()
end
end