Benchmark parallel calls to a function

I would like to benchmark function calls made in parallel (for example to some function in an Ecto Context, or in my case an Axon/Nx model’s function).
I don’t have much experience in benchmarking, but I am assuming in this scenario that the number of parallel calls is increased incrementally.

Here’s what I would like to find out:

  1. What is the average latency for a given number of parallel calls?
  2. How many parallel calls can I make before the average latency crosses a certain threshold, or before an error occurs?
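The two questions above can be sketched with nothing but the standard library. This is a minimal sketch, not an established benchmarking method: `ParallelBench`, `run/3` and `ramp/3` are hypothetical names invented for illustration. It fires one burst of `concurrency` calls per level via `Task.async_stream/3`, times each call with `:timer.tc/1`, and stops ramping once the average latency exceeds a threshold or any call fails.

```elixir
defmodule ParallelBench do
  # Hypothetical helper module, not part of any library.

  # Runs `fun` (a zero-arity function) in `concurrency` processes at once
  # and returns the average latency in microseconds plus an error count.
  def run(fun, concurrency, timeout \\ 5_000) do
    1..concurrency
    |> Task.async_stream(
      fn _ ->
        # Catch raises, throws and exits inside the task so that one failing
        # call is counted as an error instead of crashing the caller.
        try do
          {micros, _result} = :timer.tc(fun)
          {:ok, micros}
        catch
          kind, reason -> {:error, {kind, reason}}
        end
      end,
      max_concurrency: concurrency,
      timeout: timeout,
      on_timeout: :kill_task
    )
    |> Enum.reduce(%{latencies: [], errors: 0}, fn
      {:ok, {:ok, micros}}, acc -> %{acc | latencies: [micros | acc.latencies]}
      {:ok, {:error, _}}, acc -> %{acc | errors: acc.errors + 1}
      # Calls that exceed `timeout` are killed and counted as errors.
      {:exit, _reason}, acc -> %{acc | errors: acc.errors + 1}
    end)
    |> summarize(concurrency)
  end

  # Tries each concurrency level in order and halts after the first level
  # whose average latency exceeds `threshold_us` or that produced an error.
  def ramp(fun, levels, threshold_us) do
    levels
    |> Enum.reduce_while([], fn n, acc ->
      result = run(fun, n)
      acc = [result | acc]

      if result.errors > 0 or result.avg_us > threshold_us do
        {:halt, acc}
      else
        {:cont, acc}
      end
    end)
    |> Enum.reverse()
  end

  defp summarize(%{latencies: latencies, errors: errors}, concurrency) do
    avg_us =
      if latencies == [], do: nil, else: Enum.sum(latencies) / length(latencies)

    %{concurrency: concurrency, avg_us: avg_us, errors: errors}
  end
end
```

Usage could look like `ParallelBench.ramp(fn -> MyModule.my_function(args) end, [1, 2, 4, 8, 16], 50_000)`, where `MyModule.my_function(args)` stands in for the function under test and `50_000` is an arbitrary 50 ms threshold. Each level is a single burst with no warmup, so the numbers are rough; repeating each level several times and discarding the first run would give steadier averages.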

As far as I know, Benchee doesn’t support this sort of benchmarking. I would prefer an Erlang/OTP native solution, though I am okay with using an external tool and exposing the function over a REST API or similar.
I’ve thought about Tsung and an API, but I feel it’s overkill and I don’t feel up to editing XML files. I also don’t know whether I can use it from Elixir, despite it being written in Erlang.

I’m not sure whether this is still true, but a while ago I think it was said that Benchee is not designed to benchmark parallel code, even though it has parallel facilities (you can check its README for them). So my recommendation would be:

Just write separate Elixir functions that each do things slightly differently (i.e. the variants of the code you want to compare for speed), and use hyperfine with e.g. mix run -e 'MyModule.my_function(args)'.
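As for the parallel facilities mentioned above, Benchee's `parallel` option runs each benchmarking job in that many processes at the same time. Here is a minimal sketch, assuming a standalone script with network access for `Mix.install/1`; the `work` function is a stand-in workload, not the function from the question. The reported statistics combine the measurements from all parallel processes, so this is closer to a stress test than to a precise latency benchmark.

```elixir
# Assumption: run as a standalone script (e.g. `elixir bench.exs`).
# Inside a Mix project, add :benchee to your deps instead of using Mix.install.
Mix.install([{:benchee, "~> 1.0"}])

# Stand-in workload; replace with the function you actually want to measure.
work = fn -> Enum.sum(1..100_000) end

suite =
  Benchee.run(
    %{"work" => work},
    # Run the job in 4 processes simultaneously.
    parallel: 4,
    # Warmup and measurement time, in seconds.
    warmup: 1,
    time: 2
  )
```

Re-running with different `parallel` values (1, 2, 4, 8, ...) gives a rough picture of how the average run time degrades as concurrency grows.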