How to spawn many, many processes really fast?

I’m currently playing around with GenServers and for a totally unrealistic, but still interesting problem, I wondered whether there is a faster way of spawning 40.000+ processes than this:

pids = for id <- 1..40_000 do
  {:ok, pid} = GenServer.start_link(MyGenServerModule, id: id)
  pid
end

Since this for-loop runs synchronously, it takes around 10 to 12 seconds on my MacBook to spawn all 40k processes. I wondered whether this could somehow be achieved asynchronously?

Do you just want to spawn them, or do you need their PIDs as well? Should these processes be inside a supervision tree?

I need their PIDs to send messages to them. A supervisor is not needed. I want to spawn them inside a LiveView process and have them linked to that LV process.

1..40_000 |> Task.async_stream(fn _ -> YOUR_CODE end) |> Enum.to_list()

Not sure what evil thing you are trying to do… I won’t ask :thinking:

Thank you! That indeed decreased the execution time from ~12s to ~1.2s, but unfortunately, I believe the processes weren’t linked to the original process correctly. Maybe because the GenServer.start_link/2 was called inside the Task.async_stream? Will those processes then correctly be linked to the “parent process” in which I used to call the GenServer.start_link/2 function originally?

PS: I’m currently playing around with simulating the Game of Life with 50.000 erlang processes. Because, why not? :smiley:

[Image: Screenshot 2021-09-04 at 14.41.47]

Hi, That’s pretty cool.

To link the process to your main process, you can do this:
1..40_000 |> Task.async_stream(fn _ -> YOUR_CODE end) |> Enum.map(fn {:ok, pid} -> Process.link(pid); pid end)
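
On the linking question above: a minimal sketch, assuming a throwaway module named LinkDemo.Server (hypothetical, not from the thread), that checks which process a GenServer ends up linked to when GenServer.start_link/2 is called inside a task. The link goes to the caller of start_link, which here is the task process rather than the original process.

```elixir
defmodule LinkDemo.Server do
  use GenServer

  @impl true
  def init(arg), do: {:ok, arg}
end

parent = self()

# start_link links the new GenServer to its caller. Inside a task,
# the caller is the task process, not `parent`.
task =
  Task.async(fn ->
    {:ok, pid} = GenServer.start_link(LinkDemo.Server, [])
    {:links, links} = Process.info(pid, :links)
    {pid, self(), links}
  end)

{pid, task_pid, links_inside_task} = Task.await(task)
{:links, parent_links} = Process.info(parent, :links)

IO.inspect(task_pid in links_inside_task, label: "linked to the task process")
IO.inspect(pid in parent_links, label: "linked to the original process")
```

If the second check comes back false, the explicit Process.link/1 call from the original process is what establishes the link.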

Unfortunately, that took even longer than the sequential loop. Sequential was 12s and this version was something between 60 and 85s :frowning_face: I decided on using the sequential version now, but thank you anyway :heart:

How about starting the processes under some DynamicSupervisors?
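
A rough sketch of that idea, assuming a hypothetical DynCell module and one DynamicSupervisor per online scheduler (both are illustrative choices, not something from the thread). The supervisors are linked to the calling process and the children to the supervisors. Whether this is actually faster than the sequential loop would need measuring.

```elixir
defmodule DynCell do
  use GenServer

  def start_link(id), do: GenServer.start_link(__MODULE__, id)

  @impl true
  def init(id), do: {:ok, id}
end

# Assumption: one supervisor per scheduler, to spread the start_child calls.
sup_count = System.schedulers_online()

sups =
  for _ <- 1..sup_count do
    {:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
    sup
  end

pids =
  1..40_000
  |> Task.async_stream(
    fn id ->
      # Pick a supervisor round-robin and start the child under it.
      sup = Enum.at(sups, rem(id, sup_count))
      {:ok, pid} = DynamicSupervisor.start_child(sup, {DynCell, id})
      pid
    end,
    ordered: false
  )
  |> Enum.map(fn {:ok, pid} -> pid end)

IO.puts("started #{length(pids)} children under #{sup_count} supervisors")
```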

That’s weird. On my machine they hardly differ. :thinking:

Spawning processes, even GenServer processes, is fast in Elixir. I just wrote this script based on your code to test:

defmodule MyGenServerModule do
  use GenServer

  @impl true
  def init(state) do
    {:ok, state}
  end

  @impl true
  def handle_call(:ping, _from, state) do
    {:reply, {:pong, state[:id]}, state}
  end
end

{t, pids} =
  :timer.tc(fn ->
    for id <- 1..40_000 do
      {:ok, pid} = GenServer.start_link(MyGenServerModule, id: id)
      pid
    end
  end)

IO.inspect("started #{length(pids)} processes in #{div(t, 1000)} ms")

Enum.random(pids) |> GenServer.call(:ping) |> IO.inspect()

The output I got:

"started 40000 processes in 521 ms"
{:pong, 2861}

So maybe you’re doing too much work in the init/1 callback? If true, you can move the hard work into a handle_info/2 callback, by sending a message to self in the init callback. For example:

def init(state) do
  send(self(), :start)
  {:ok, state}
end

def handle_info(:start, state) do
  # start living
  {:noreply, state}
end

Edit: the :continue tuple mentioned by @lud is better.

I played with that problem and had 4 processes spawn 10 processes spawn 10 processes spawn 10 processes, etc, to spawn 40K processes.

It is fast, but not that much faster than spawning them sequentially. And if you do that then I don’t know how you could supervise those processes.

For better synchronization I would pick a timestamp 3 seconds in the future, spawn all processes sequentially and have them wait until that precise timestamp to start the logic in sync.

Here is my test code:

defmodule Serv do
  use GenServer

  def start_nolink(id, parent) do
    GenServer.start(__MODULE__, {id, parent})
  end

  def init({id, parent}) do
    Process.link(parent)
    send(parent, {:started, id})
    {:ok, id}
  end
end

defmodule Spawner do
  def rec_spawn(parent, scheme) do
    rec_spawn(parent, scheme, 0)
  end

  defp rec_spawn(parent, [], sum) do
    Serv.start_nolink(sum, parent)
  end

  defp rec_spawn(parent, [range | ranges], sum) do
    sum = sum * 10
    starter = fn -> Enum.map(range, &rec_spawn(parent, ranges, sum + &1)) end
    spawn(starter)
  end
end

defmodule Control do
  def check_started(max) do
    _check_started(max + 1)
  end

  defp _check_started(max) do
    case Process.info(self(), :message_queue_len) do
      {:message_queue_len, ^max} ->
        IO.puts("all started OK")

        flush_all()

      {:message_queue_len, n} when n < max ->
        IO.puts("started #{n}/#{max}")
        Process.sleep(100)
        _check_started(max)
    end
  end

  defp flush_all() do
    receive do
      {:started, _} ->
        flush_all()
    after
      0 -> :ok
    end
  end

  def sum_ranges(ranges) do
    {sum, _} =
      List.foldr(ranges, {0, 1}, fn range, {sum, size} ->
        {sum + Enum.max(range) * size, size * 10}
      end)

    sum
  end
end

ranges = [0..3, 0..9, 0..9, 0..9, 0..9]

parent = self()
Spawner.rec_spawn(parent, ranges)

ranges
|> Control.sum_ranges()
|> Control.check_started()
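
The timestamp idea mentioned above (pick a start time a few seconds ahead and have every process wait for it) could look roughly like this. This is only a sketch: SyncedCell, the :go message and the :started_at call are made-up names, and it uses the monotonic clock plus Process.send_after/3 rather than anything from the test code above.

```elixir
defmodule SyncedCell do
  use GenServer

  # `start_at` is an absolute System.monotonic_time(:millisecond) value.
  def start_link(id, start_at) do
    GenServer.start_link(__MODULE__, {id, start_at})
  end

  @impl true
  def init({id, start_at}) do
    # Sleep-free wait: schedule a message for the shared start time.
    delay = max(start_at - System.monotonic_time(:millisecond), 0)
    Process.send_after(self(), :go, delay)
    {:ok, %{id: id, started_at: nil}}
  end

  @impl true
  def handle_info(:go, state) do
    # The real logic would start here; the sketch only records the time.
    {:noreply, %{state | started_at: System.monotonic_time(:millisecond)}}
  end

  @impl true
  def handle_call(:started_at, _from, state) do
    {:reply, state.started_at, state}
  end
end

# Shared start time, 3 seconds in the future.
start_at = System.monotonic_time(:millisecond) + 3_000

pids =
  for id <- 1..1_000 do
    {:ok, pid} = SyncedCell.start_link(id, start_at)
    pid
  end

Process.sleep(3_500)
times = Enum.map(pids, &GenServer.call(&1, :started_at))
IO.puts("start spread: #{Enum.max(times) - Enum.min(times)} ms")
```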

There is a feature to do just that:

  @impl true
  def init(state) do
    {:ok, state, {:continue, :after_start}}
  end

  @impl true
  def handle_continue(:after_start, state) do
    # start living
    {:noreply, state}
  end

Nice, I thought :continue would block. Glad I was wrong.

Thank you all very much for your input! You folks are awesome!

Unfortunately, I still can’t reproduce the speed gains from asynchronous execution. Async takes 2x longer than sync (10.000 processes: 650ms vs 1200ms; 50.000 processes: 15s vs 28s).

Here’s my code: PeterAndCode/page_live.ex at main · PJUllrich/PeterAndCode · GitHub

I checked that the Cell.ex doesn’t do too much work in its init/1 function and rewrote it using the handle_continue pattern, but there were no notable performance gains. This isn’t surprising since the Cell wasn’t doing much in its init/1 anyway; it only sent a message back to the LiveView.

Any hints about how to decrease the time to spawn 50.000+ processes are still much appreciated, but I have the feeling that it simply takes this long :man_shrugging:

PS: Does it maybe have something to do with my Elixir/Erlang setup? :thinking: Maybe the BEAM is only using one out of the 6 cores I have instead of all 6?

PPS: The BEAM starts up with the following info:

Erlang/OTP 24 [erts-12.0.4] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [jit] [x86_64-apple-darwin20.6.0]

It appears to use all 6 cores and their multithreading ([smp:12:12]), but has only one async-thread? This seems to be the default :thinking:
Could somebody whose execution is faster in async verify that you only have 1 async-thread as well?

Just to be a bit picky here, starting a GenServer does more than just spawn a process. So the GenServer.start_link will spawn the process and then sit and wait until the new process does the necessary initialisation to become an OTP process and then calls the behaviour’s init callback. When that has completed the GenServer.start_link will return {:ok,pid}. This is a synchronous operation which is well defined in how a GenServer behaviour works. Can you get around it? NO, all you can do is try and make the init callback as fast as possible. Check the documentation for tips for doing that.

If you want to check how fast you can spawn processes just do a spawn instead, of course then you won’t get a GenServer. :smile:
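
To get a feel for that difference, here is a small sketch that times plain spawn_link/3 for the same number of processes. The Plain module and its :stop message are made up for the example; the processes do nothing except wait to be told to exit.

```elixir
defmodule Plain do
  # A bare process (no GenServer, no init handshake) that waits for :stop.
  def loop do
    receive do
      :stop -> :ok
    end
  end
end

n = 40_000

{micros, pids} =
  :timer.tc(fn ->
    # spawn_link returns immediately; nothing waits for the child to initialise.
    for _ <- 1..n, do: spawn_link(Plain, :loop, [])
  end)

IO.puts("spawned #{length(pids)} plain processes in #{div(micros, 1000)} ms")

# Clean up: a normal exit does not take the linked parent down.
Enum.each(pids, &send(&1, :stop))
```

Comparing this number with the GenServer.start_link timing on the same machine gives an idea of how much of the cost is the synchronous start-up handshake rather than process creation itself.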

Well, there’s your bottleneck. The LiveView has to handle 50K messages.

Uuuuuuh that’s so smart! And YES! That was the problem!

I removed sending a message from the init/1 of the cell back to the LiveView. I can now spawn 40.000 processes in only 300ms! Previously it took around 7 seconds. Amazing! Thanks so much!

The async version still takes 2x the time though, but I’ve given up on understanding why. If anybody wants to continue the investigation, this is my code.

Thanks again @gregvaughn and @rvirding and all the others for your very helpful input! :slight_smile:

I wrote down my findings in a blog post :slight_smile:

@Sebb regarding your question about the timing of tick/tock: I just played around with the timings. When it was too fast the board wouldn’t update consistently. Some cells would update, others wouldn’t. When it was too slow, I got bored :grinning_face_with_smiling_eyes:
So, simply trial and error.

I was just wondering. I have something similar and tried to find a better solution than trial and error - but didn’t.