Struggling to make this implementation faster

I am writing a simple Elixir application for a hackathon. The idea of the application is to have two instances of an API behind a load balancer.

This API has two endpoints: one for adding transactions (credit and debit) and another for getting a summary of transactions.

Debit transactions cannot be accepted if they would exceed the limit: each customer has a maximum amount they can owe.
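To make the rule concrete, here is a minimal sketch of it as a pure function. The `LimitRule` module and `apply_txn/4` are hypothetical names used only for illustration, not code from the repository, and the sketch assumes the limit is passed as a positive integer and that a negative balance means the customer owes money.

```elixir
defmodule LimitRule do
  # Hypothetical helper, not taken from the repository.
  # balance: current balance (negative means the customer owes money)
  # limit:   maximum amount the customer may owe, assumed to be positive here

  # Credits are always accepted.
  def apply_txn(balance, _limit, "c", amount), do: {:ok, balance + amount}

  # Debits are rejected when the new balance would go below -limit.
  def apply_txn(balance, limit, "d", amount) do
    new_balance = balance - amount

    if new_balance < -limit do
      {:error, :limit_exceeded}
    else
      {:ok, new_balance}
    end
  end
end
```

With a limit of 1000, a debit of 1000 from a zero balance is accepted, while a debit of 1001 is rejected.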

I made a simple implementation, without even a database, but I can’t beat some other implementations I’ve seen. I didn’t expect to beat the performance of Java or Rust, but I’d expect to beat others like PHP and NodeJS.

My p99 is around 90ms, but I have seen some NodeJS implementations with a p99 of 40ms, or even less in some cases. I also found faster Elixir implementations that use far more libraries, which tells me I am the problem.

So far I have an Erlang cluster of two nodes. Once they are up, five global GenServers are started. The stress test simulates five customers. In my application, each customer is a GenServer whose state holds the limit, the balance and the latest events:

  def start_link({client_id, limit}) do
    case GenServer.start_link(__MODULE__, {client_id, limit}, name: {:global, process_name(client_id)}) do
      {:ok, pid} -> {:ok, pid}
      {:error, {:already_started, pid}} -> {:ok, pid}
      error -> error
    end
  end

  @spec init({client_id :: integer(), limit :: integer()}) :: {:ok, {balance :: integer(), limit :: integer(), latest_txns :: list()}}
  def init({client_id, limit}) do
    # a log call goes here (its function name is missing from the post), with the message:
    # "start client #{inspect(process_identifier(client_id))} | #{inspect(node())} - #{inspect(Node.list())}"
    {:ok, {0, limit, []}}
  end

Besides some validations, whenever a request comes in I just perform the operation for that specific customer, and hence on that specific process.

  def handle_transaction("c", payload, req) do
    {:ok, balance, limit} =["client_id"], payload)

    :cowboy_req.reply(200, %{
      <<"content-type">> => <<"application/json">>
    }, <<"{\"limite\":#{-1*limit},\"saldo\":#{balance}}">>, req)
  end

I am pretty sure it can perform better. However, I can’t see where the bottleneck is. Does anybody have a clue? Ideas on how I can profile this would be useful as well.
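One low-tech way to start narrowing this down, before reaching for a real profiler, would be to time individual steps with Erlang's `:timer.tc/1`. This is only a sketch: the `Timing` module and `measure/2` are made-up names for illustration and are not part of the repository.

```elixir
defmodule Timing do
  # Hypothetical helper, not taken from the repository.
  # Runs the given zero-arity function, prints how long it took,
  # and returns the function's own result unchanged.
  def measure(label, fun) when is_function(fun, 0) do
    # :timer.tc/1 returns {elapsed_microseconds, result}
    {micros, result} = :timer.tc(fun)
    IO.puts("#{label}: #{micros} microseconds")
    result
  end
end
```

Wrapping the GenServer call and the `:cowboy_req.reply` call separately, for example `Timing.measure("txn", fn -> ... end)`, should show which of the two dominates. Since the processes are registered with `{:global, ...}` on a two-node cluster, a call can land on the other node, so it may be worth comparing local and remote calls as well.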

Here is the source code: GitHub - geeksilva97/rinha-de-novo: tentando mais um pouco... com cowboy puro com limao
Stress test using Gatling: rinha-de-novo/load-test/user-files/simulations/rinhabackend/RinhaBackendCrebitosSimulation.scala at master · geeksilva97/rinha-de-novo · GitHub

Quick update, with more information.

I was running it in Docker on an Apple M3 (12 cores, 18 GB RAM) and getting a p99 of around 90ms.

I then ran it on Manjaro, also in Docker (Intel i5, 8 cores, 16 GB RAM), and got a p99 of 4ms.

Is it all Docker overhead?


Someone else also ran into this. Likely a Mac issue.


TBF everything runs faster in Manjaro. :smiley:

But I’ve heard about spiking latencies in Docker on ARM Macs as well.


Got it, guys. Thanks a lot for your help!!


Yeah, my Mac was slower overall too (I’m the guy from the thread on the other forum). I’ve been running on my Ubuntu machine since.
