Here is an end-of-year present from me to Elixir programmers: the new package “Pelemay Fast Parallel map” has been released! It uses multiple cores and is 3x faster than Enum! Check it out soon!
Pelemay Fast Parallel map (or pelemay_fp) - Fast parallel map function for Elixir
Pelemay Fast Parallel map provides a fast parallel map function similar to the one in the Enum module, except that computations are executed in parallel using Process.spawn/4.
Here is a quick example on how to calculate the square of each element with PelemayFp:
list
|> PelemayFp.map(& &1 * &1)
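To illustrate the general idea of mapping in parallel with spawned processes, here is a minimal sketch. It is not PelemayFp's actual implementation: the module `SpawnMap` and its `worker/4` function are hypothetical names made up for this example, and it naively spawns one process per element with no batching.

```elixir
defmodule SpawnMap do
  # Hypothetical sketch, not PelemayFp's implementation.
  # Spawns one process per element and collects results in input order.
  def map(list, fun) when is_list(list) and is_function(fun, 1) do
    parent = self()

    list
    |> Enum.map(fn item ->
      # A unique reference tags each result so replies can be matched up.
      ref = make_ref()
      Process.spawn(__MODULE__, :worker, [parent, ref, fun, item], [])
      ref
    end)
    |> Enum.map(fn ref ->
      # Waiting on the refs in spawn order preserves the input order.
      receive do
        {^ref, value} -> value
      after
        5_000 -> exit(:timeout)
      end
    end)
  end

  # Runs in the spawned process: apply the function, send the result back.
  def worker(parent, ref, fun, item) do
    send(parent, {ref, fun.(item)})
  end
end

squares = SpawnMap.map([1, 2, 3, 4], &(&1 * &1))
```

One process per element is usually too fine-grained for cheap functions like squaring, because spawning and message passing cost more than the work itself. Grouping elements into batches before spawning reduces that overhead.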
We evaluated the performance of PelemayFp, Pelemay, Flow, Enum and Pmap on an iMac Pro (2017):
I may have misunderstood the specification of spawn_monitor. Once a child process is launched by spawn_monitor, will it be started again even if it quits by calling exit(:normal)?
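For reference, a minimal sketch of how this can be checked in a script or in IEx. `spawn_monitor/1` only starts the process and sets up a monitor; it does not restart anything (restarting is what supervisors do). The variable names below are just for this example.

```elixir
# Spawn a monitored child that exits normally right away.
{pid, ref} = spawn_monitor(fn -> exit(:normal) end)

# The parent receives one :DOWN message for this monitor.
reason =
  receive do
    {:DOWN, ^ref, :process, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end

# The child has not been started again: the pid is dead.
alive? = Process.alive?(pid)

# No second :DOWN message arrives for the same monitor.
second =
  receive do
    {:DOWN, ^ref, :process, _pid, _reason} -> :unexpected_down
  after
    100 -> :no_more_messages
  end
```

If the same :DOWN message seems to show up repeatedly, a likely cause is the calling code spawning more children than intended, since the runtime itself never restarts a monitored process.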
As you mentioned, performance is very sensitive to the threshold parameter, so I guess practical use requires some kind of parameter optimization technique. I plan to develop one.
Why is there such a big performance difference compared with pmap here:
Is it because the particular benchmark you’re doing benefits from batching? If so, could you get the same benefit by calling Enum.chunk_every before the map, and Enum.concat after? (modifying the async task to call Enum.map as well)
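The suggestion could be sketched roughly as follows. `ChunkedPmap` is a hypothetical module name, and the chunk size of 12_000 is an arbitrary value chosen for illustration, not a tuned setting.

```elixir
defmodule ChunkedPmap do
  # Hypothetical helper illustrating the batching idea;
  # not part of PelemayFp or of any pmap package.
  def map(enumerable, fun, chunk_size) do
    enumerable
    # Split the input into batches of at most chunk_size elements.
    |> Enum.chunk_every(chunk_size)
    # One task per batch; each task maps over its whole batch.
    |> Enum.map(fn chunk -> Task.async(fn -> Enum.map(chunk, fun) end) end)
    # Await the tasks in order so the result keeps the input order.
    |> Enum.map(&Task.await/1)
    # Flatten the list of batch results back into a single list.
    |> Enum.concat()
  end
end

result = ChunkedPmap.map(1..100_000, &(&1 * &1), 12_000)
```

With batches, the per-process overhead is paid once per chunk rather than once per element, which is the benefit the question is asking about.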
Oh you’re right, I played with it a bit and Enum.chunk_every is quite slow because the implementation is generic for all enumerables. If I write a specialized version just for lists instead, then I get similar timings for PelemayFp, Task.async_stream, and pmap (with batches of 12,000 items).
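A list-only chunking function along these lines might look like the sketch below. This is not the code used for the timings above. `ListChunk` is a hypothetical name, and whether it actually beats Enum.chunk_every/2 on a given machine would need to be measured.

```elixir
defmodule ListChunk do
  # Hypothetical list-only alternative to Enum.chunk_every/2.
  # It pattern matches on list cells directly instead of going
  # through the generic enumerable machinery.
  def chunk([], _size), do: []

  def chunk(list, size) when is_list(list) and is_integer(size) and size > 0 do
    {chunk, rest} = take(list, size, [])
    [chunk | chunk(rest, size)]
  end

  # Take up to n elements from the front, returning {taken, remaining}.
  defp take(rest, 0, acc), do: {:lists.reverse(acc), rest}
  defp take([], _n, acc), do: {:lists.reverse(acc), []}
  defp take([head | tail], n, acc), do: take(tail, n - 1, [head | acc])
end

chunks = ListChunk.chunk([1, 2, 3, 4, 5], 2)
```

As with Enum.chunk_every/2, the last chunk may be shorter than the requested size.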
I noticed a small detail: the benchmark code is using a module constant for the list:
@list Enum.to_list(1..100000)
This may affect the timing a bit since constants aren’t subject to garbage collection and don’t get copied when they’re sent from one process to another, unlike variable data.
To solve it, I guess it should be implemented using before_each_bench and after_each_bench.
I’ve implemented PelemayFp.ParallelSplitter.split/7 as a faster function that integrates the work of Enum.chunk_every/2 with Process.spawn/4. See its HexDocs page: PelemayFp.ParallelSplitter — PelemayFp v0.1.2