Why can't Elixir/BEAM comfortably compete in the sub-millisecond to microsecond range for network applications?

This is a very loaded question and I have been sparring with my LLM about this. My mental model of running Elixir programs is that the BEAM VM executes the bytecode, and inside the BEAM you can spawn 100k to millions of processes that are isolated from one another by default. So if I have a network application taking requests from an API, there will be processes ready to grab the various requests and do an operation, hence no blocking.
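That mental model can be sketched in a few lines (a rough illustration only: the request list and the "work" step are made up, and `Task` stands in for a real HTTP acceptor):

```elixir
# Each "request" is handled in its own lightweight BEAM process,
# so one slow request never blocks the others.
requests = ["req1", "req2", "req3"]

results =
  requests
  |> Enum.map(fn req -> Task.async(fn -> String.upcase(req) end) end)
  |> Enum.map(&Task.await/1)

# results == ["REQ1", "REQ2", "REQ3"]
```

Each `Task.async` spawns a fresh process with its own heap, which is what makes the "millions of processes" model cheap.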

Now, other languages with a VM or runtime can have GC spikes that pause the world or slow things down; since the BEAM garbage-collects each isolated process separately, this is not an issue. But now I wonder: why can't Elixir compete with something like C++/Rust for a network application that, for example, listens on an API and does [req] → [does work] → [send back a response]? I can concede there is a bottleneck during "does work"; Elixir is bad at that depending on the computation. But for a simple computation, shouldn't Elixir be comfortably on par with Rust/C++, since it doesn't have GC spikes/pauses and can have way better reliability than languages like Java, Python, or Go?

I'm asking Claude and it's giving me various answers. I understand Elixir will never beat a statically compiled language like Rust/C++ in raw performance, but at the same time I just can't see why it can't compete with them in the same arena of simply waiting for [req] → [doing some work] → [resp], given that I can spawn millions of processes just to handle requests without worrying about GC pauses. And living in 2025, compute is as cheap as ever, so couldn't I just throw more compute at this problem (in a reasonable manner)?

Also, given that network applications have a limit to how much they can send and receive, I don't see how Elixir doesn't crush all benchmarks and comfortably sit on the sub-millisecond to microsecond throne. In my head, such a benchmark would measure how many BEAM processes you can run on X CPUs, by how many requests they can handle before things degrade. I may be thinking about all this wrong and I'm open to learn.

Can you substantiate this claim?

In general you would expect worse raw throughput (due to VM overhead, though the JIT helps) but better p99 latency from a soft-realtime system like the BEAM. The VM performs work incrementally, which smooths latency.

A common problem is people running literal "hello world" benchmarks, where no useful work is done, and then comparing across languages. This is meaningless; it's completely unrepresentative of a real workload. In general, and this has nothing to do with Erlang/Elixir: benchmarks are mostly worthless unless you run them on your actual workloads.


What are you doing that is making it > 1 ms? There are plenty of examples/benchmarks/evidence of Phoenix, including template rendering, in the couple-hundred-µs range. This is from 9 years ago: The Erlangelist - Low latency in Phoenix

In terms of the ability to deliver consistent latency, and to degrade slowly (rather than having performance fall off a cliff), BEAM/Elixir/Phoenix does have quite a few tricks. You mentioned low-cost processes. Performance also comes from minimising the work actually performed: there's some very smart template processing that minimises the CPU cycles needed to write templated strings to the socket. The best article I've read on it seems to have disappeared, but IO lists for assembling strings avoid a lot of the reading and writing of string fragments to memory that is typical of many other frameworks.
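The IO-list idea in miniature (the fragments here are made up for illustration): the template pieces stay as separate binaries and are handed off as a nested list, with no intermediate concatenation.

```elixir
name = "world"

# A nested list of binaries. Phoenix templates build these instead of
# concatenating strings; the runtime can write the fragments to the
# socket directly, without ever building one big string in memory.
iolist = ["<h1>", "Hello, ", name, "</h1>"]

IO.iodata_to_binary(iolist)
# => "<h1>Hello, world</h1>"
```

`IO.iodata_to_binary/1` is only used here to show the equivalence; in a real response the iolist goes straight to the socket.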


BEAM offers temporal fault tolerance. A runaway process will not affect others as much as it would in a Rust, C++, Java, C#, Ruby, … framework.

Consider a request that hits a de-facto infinite loop due to a bug (e.g. catastrophic backtracking in a regex). With BEAM you won't even notice it beyond higher CPU usage and that request timing out, and it won't even leak memory, because timing out kills the processes related to the bugged request.

With any other stack, you’re done for.

In thread-per-request setups you will leak memory until you crash. In M:N-scheduling setups (Node.js, tokio, Java virtual threads, and so on), all it takes is N requests hitting the bug (one per scheduler) for everything to grind to a halt. (Edited to add: with Go, things will keep trucking because goroutines can be pre-empted, but there is no way to kill a runaway goroutine, so you'll run out of memory before long.)
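The BEAM side of this can be sketched with `Task` (the infinite loop stands in for the buggy regex, and the 50 ms timeout is an arbitrary stand-in for a request deadline):

```elixir
# A "request" that never finishes, running in its own process.
task =
  Task.async(fn ->
    Stream.cycle([:spin]) |> Enum.each(fn _ -> :ok end)
  end)

# Give it 50 ms, then kill it. The process and its entire heap go
# away; nothing else in the VM is affected.
result =
  case Task.yield(task, 50) || Task.shutdown(task, :brutal_kill) do
    {:ok, value} -> value
    nil -> :timed_out
  end

# result == :timed_out
```

This `yield`-then-`shutdown` pattern is the standard idiom from the `Task` docs; the key point is that `:brutal_kill` can reclaim a runaway process from the outside, which has no equivalent for a goroutine or an OS thread.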

This comes at the cost of making everything interruptible, which can hurt "sub-ms" latency figures a bit, since the processes involved have likely been rescheduled a few times before the request finishes, compared to e.g. C++ where everything would run in one go.
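You can see the bookkeeping behind that pre-emption directly: the scheduler charges every process "reductions" (roughly, function calls) and reschedules it once a budget of a few thousand is used up (the exact budget is a VM implementation detail, a few thousand per time slice in current releases):

```elixir
{:reductions, before} = Process.info(self(), :reductions)

# Ordinary Elixir work burns reductions; the scheduler can pre-empt
# this process at any function call boundary.
Enum.each(1..100_000, fn _ -> :ok end)

{:reductions, later} = Process.info(self(), :reductions)
true = later > before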

There’s no such thing as a free lunch, and for the applications that BEAM/Erlang was designed for we prefer having this kind of fault tolerance over shaving off a few microseconds per request.
