I have just started performance testing Mozart (a BPM platform), and I am seeing something I don’t understand. I am hoping to get some advice.
I have a GenServer module named ProcessEngine. Instances of this module are spawned via a DynamicSupervisor.
Each ProcessEngine instance spawned is initialized with a data structure representing a defined business process. To clarify, a “business process” is not an Elixir process.
Each ProcessEngine instance runs until the “business process” has run out of work to do, that is, until it has finished its intended function.
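To make the setup concrete, here is a minimal sketch of the pattern described above. It is not Mozart's actual code: `SketchEngine`, `SketchRunner`, the list-of-functions "business process", and the `{:finished, pid}` message are all hypothetical stand-ins, and I am assuming the caller simply waits for every instance to report completion.

```elixir
defmodule SketchEngine do
  # Hypothetical stand-in for ProcessEngine. The "business process" is
  # modelled as a plain list of zero-arity functions (an assumption).
  use GenServer, restart: :temporary

  def start_link({tasks, caller}) do
    GenServer.start_link(__MODULE__, {tasks, caller})
  end

  @impl true
  def init({tasks, caller}) do
    # Return quickly and do the actual work in handle_continue/2.
    {:ok, %{tasks: tasks, caller: caller}, {:continue, :work}}
  end

  @impl true
  def handle_continue(:work, %{tasks: []} = state) do
    # Out of work: notify the caller and stop normally.
    send(state.caller, {:finished, self()})
    {:stop, :normal, state}
  end

  def handle_continue(:work, %{tasks: [task | rest]} = state) do
    task.()
    {:noreply, %{state | tasks: rest}, {:continue, :work}}
  end
end

defmodule SketchRunner do
  # Hypothetical analogue of run_process_n_times/3.
  def run_n_times(tasks, n) do
    {:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

    # Spawn n engine instances under the DynamicSupervisor.
    for _ <- 1..n do
      {:ok, _pid} = DynamicSupervisor.start_child(sup, {SketchEngine, {tasks, self()}})
    end

    # Block until every instance has reported that it finished.
    for _ <- 1..n do
      receive do
        {:finished, _pid} -> :ok
      end
    end

    DynamicSupervisor.stop(sup)
    :ok
  end
end
```

With these hypothetical modules, a run can be timed the same way as in the measurements below, for example `:timer.tc(fn -> SketchRunner.run_n_times([fn -> :ok end], 1_000) end)`.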
So, here is the issue that I am trying to understand.
If I spawn 1,000 GenServers, they finish execution in about 300,000 microseconds:
iex [09:24 :: 6] > :timer.tc(fn -> run_process_n_times(%{}, :process_with_single_service_task, 1000) end)
{287086, :ok}
If I spawn 10 times that number, i.e. 10,000, they finish execution in 26,602,559 microseconds (about 26.6 seconds), which is roughly 93 times as long as the run with 1,000 instances:
iex [09:24 :: 8] > :timer.tc(fn -> run_process_n_times(%{}, :process_with_single_service_task, 10000) end)
{26602559, :ok}
So, executing 10 times more GenServer instances takes nearly 100 times as long to complete. I had assumed that execution time would increase linearly with the number of GenServer instances.
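The arithmetic behind that comparison can be checked directly from the two `:timer.tc` results. The sketch below uses only the measured numbers from this post. The "apparent exponent" is just a back-of-the-envelope estimate from two data points, so it should be read as a hint rather than a measured complexity.

```elixir
# Measured wall-clock times in microseconds, copied from the iex output above.
t_1k = 287_086
t_10k = 26_602_559

# How many times longer the 10,000-instance run took (roughly 92.7).
ratio = t_10k / t_1k

# Average time per instance in each run, in microseconds.
per_instance_1k = t_1k / 1_000
per_instance_10k = t_10k / 10_000

# If time grows like n^k, then multiplying n by 10 multiplies time by 10^k,
# so k = log10(ratio). Linear scaling would give k close to 1.
apparent_exponent = :math.log10(ratio)

IO.inspect(ratio, label: "ratio")
IO.inspect(per_instance_1k, label: "microseconds per instance (1,000)")
IO.inspect(per_instance_10k, label: "microseconds per instance (10,000)")
IO.inspect(apparent_exponent, label: "apparent exponent")
```

The apparent exponent comes out close to 2, which is why this looks more like quadratic growth than the linear growth I expected.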
If I run the observer, I see scheduler utilization go to 100% for 2 of the 12 schedulers. It is always schedulers 1 and 2 that go to 100%. A couple of questions:
Why don’t I see more schedulers become active?
Is the 100% for two schedulers indicative of a problem?
Finally, is there any advice on how to analyze this?