Background
I’m working on a Discord bot in Elixir that runs as two separate applications: the bot itself, and a backend that does all the logic and data handling on behalf of the bot. The backend application has to do a lot of data processing on startup, which it currently handles asynchronously (each component in the backend is a GenServer, and the init callback immediately returns a continue instruction that triggers the actual data processing needed for initialization). This works rather well overall, except for one specific component, which has a long and computationally intensive initialization sequence after a change in either the code or the initialization data for that component.
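For readers unfamiliar with the pattern, the init-then-continue setup described above looks roughly like the sketch below. The module name, the `:items` option, and `load_data/1` are placeholders of mine, not the actual backend code.

```elixir
defmodule Backend.Component do
  # Hypothetical component; the names here are illustrative only.
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    # Return right away so the supervisor is not blocked. The heavy work
    # runs in handle_continue/2 before any other message is handled.
    {:ok, %{opts: opts, data: nil, status: :loading}, {:continue, :load_data}}
  end

  @impl true
  def handle_continue(:load_data, state) do
    data = load_data(state.opts)
    {:noreply, %{state | data: data, status: :ready}}
  end

  @impl true
  def handle_call(:status, _from, state) do
    {:reply, state.status, state}
  end

  # Placeholder for the real initialization work.
  defp load_data(opts), do: Keyword.get(opts, :items, [])
end
```

Because `handle_continue/2` runs before the process handles anything from its mailbox, a call sent right after `start_link/1` only gets its reply once initialization has finished.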
The component in question needs to process a very large list (roughly 2900 items) by computing an SQL transaction for each item and then running that against a database. The amount of processing here is time-prohibitive if it needs to be done serially (each item in the list takes about 50-100ms to process and then run the SQL transaction, so the full list takes almost 5 minutes if run one-by-one), but there are a handful of computations that can be shared across all the items, so my current code uses Stream.chunk_every/2 and Task.async_stream/3 to run the initialization in a number of parallel chunks equal to the number of online schedulers, like so:
items
|> Stream.chunk_every(div(length(items), System.schedulers_online()) + 1)
|> Task.async_stream(&process_chunk/1, ordered: false)
|> Enum.to_list()
This processes everything correctly and makes the initialization fast enough to be useful, but it causes a completely different issue: it blocks scheduling of other processes for a long time, which makes the bot itself fail initialization because it can’t finish starting up before this work starts running.
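A self-contained version of the pipeline above, in case anyone wants to reproduce the chunking behaviour. This is only a sketch: `ChunkedInit`, the `parallelism` argument, and the body of `process_chunk/1` are stand-ins I made up for this post, and the explicit `timeout: :infinity` is my addition, since Task.async_stream/3 otherwise defaults to a 5000 ms timeout per task.

```elixir
defmodule ChunkedInit do
  # Hypothetical stand-in for the real per-chunk work: do the shared
  # computation once, then handle each item. Returns the item count.
  def process_chunk(chunk) do
    shared = :shared_setup

    chunk
    |> Enum.map(fn item -> {shared, item} end)
    |> length()
  end

  def run(items, parallelism \\ System.schedulers_online()) do
    # Same sizing rule as above: at most `parallelism` chunks.
    chunk_size = div(length(items), parallelism) + 1

    items
    |> Stream.chunk_every(chunk_size)
    |> Task.async_stream(&process_chunk/1,
      ordered: false,
      max_concurrency: parallelism,
      timeout: :infinity
    )
    |> Enum.map(fn {:ok, count} -> count end)
  end
end
```

With 2900 items and a parallelism of 8, the chunk size is `div(2900, 8) + 1 = 363`, which gives 8 chunks (seven of 363 items and one of 359).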
The question
My first instinct here, based on experience elsewhere, is to have the process_chunk/1 function voluntarily yield scheduling priority (I suppose this translates to voluntarily moving to the end of the run-queue for the scheduler in BEAM terms) before it processes each individual item. Right now, I’m doing this by running :timer.sleep(1) at the beginning of each iteration within process_chunk/1, which seems to be working to ensure that other things can run but feels like a bit of a hack TBH and also adds to the overall initialization time for this component (it’s only ~91ms of extra time on my development box, but translates to ~734ms on the production system it will be running on due to a much lower scheduler count).
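To make the workaround concrete, the sleep-per-item version of process_chunk/1 is shaped roughly like this. `YieldingChunk` and `process_item/1` are placeholders; the real function builds and runs an SQL transaction for each item.

```elixir
defmodule YieldingChunk do
  def process_chunk(chunk) do
    Enum.map(chunk, fn item ->
      # Sleep for at least 1 ms before each item so the scheduler
      # can run other processes in the meantime.
      :timer.sleep(1)
      process_item(item)
    end)
  end

  # Placeholder for the real per-item work.
  defp process_item(item), do: item * 2
end
```

Since every chunk sleeps at least 1 ms per item and the chunks run in parallel, the added wall-clock time is at least the chunk length in milliseconds, which is why the overhead grows as the scheduler count drops.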
Is there some more efficient way to voluntarily yield scheduling priority in Elixir or Erlang? Or is there perhaps some other approach I could take here that still lets other things run without significantly impacting the initialization times for the component in question?