Webserver Benchmark: Erlang vs Go vs Java vs NodeJS

Our opinions may differ, and I prefer to test the whole ecosystem before I make any decisions. If you are satisfied with some string messages and a sleep function, that is fine for you too.

But in the real world we have, at a minimum, everything I mentioned (web server, DB, auth).

So I was genuinely curious whether he/she can really make money with only that.

I had no ill intent, just a different opinion.

Thanks for your opinion @bennelsonweiss


I think people evaluating technology can gain a lot from reading this paragraph in the article:

Notably, all Erlang-based servers, once overloaded, maintained stable response latency, with Cowboy 1.x keeping it around 250 milliseconds! Go and Java servers were getting progressively slower. Non-clustered NodeJS, limited by utilizing only one CPU, was the slowest. The results for clustered NodeJS and Rapidoid were inconclusive since they ran out of available RAM.

So, if you want predictable performance, Erlang/Elixir is the tech for you.


booooooooooring! :rofl:

…Until you get called in as a consultant on a project they fired you from for doing a good job (and missing a 3-month deadline by 1 week, because that’s obviously a huge deal-breaker) and you fix all their scaling problems in the next 2 weeks. :stuck_out_tongue:


I hope you gave them the special 2s-complement discount :wink:

Of course I did. I charged 2x more than before. (But that was like 18 months ago.)

I absolutely love how people constantly underestimate Elixir and then praise the programmer who saves their business with it. And then still have the audacity to ask “this is in PHP, right?” :003:


This is SPARTA! :slight_smile:


With a better-tuned webserver you can reach 3x Cowboy’s throughput; see “300k requests per second webserver in elixir! OTP21.2 - 10 cores”.

Could even get up to 6x RPS (compared to cowboy) if you simply echo a static reply.

Probably the most important factor that limits webserver performance is how many requests are “in-flight” at any given time.

In your test there was no delay before returning the hello string. In the Stressgrid test there was a 100 millisecond delay. This produces a very different number of in-flight requests for a given request rate, so it is comparing apples to oranges.

We argue that adding a 100 millisecond delay makes the workload similar to a typical web application that makes one or more backend requests.
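
A rough way to see why the delay matters is Little's law: the mean number of in-flight requests equals the request rate times the mean time each request spends in the server. The sketch below is only illustrative. The 100,000 requests/second rate and the 1 ms stand-in for "no delay" are assumptions, not figures from either benchmark, and `in_flight` is a made-up helper name.

```python
def in_flight(requests_per_second: float, latency_seconds: float) -> float:
    """Little's law: mean concurrency = arrival rate * mean time in system."""
    return requests_per_second * latency_seconds


# Hypothetical request rate, chosen only to make the comparison concrete.
rate = 100_000

scenarios = [
    ("no artificial delay (assumed ~1 ms per request)", 0.001),
    ("100 ms delay before replying", 0.100),
]

for label, latency in scenarios:
    print(f"{label}: ~{in_flight(rate, latency):,.0f} requests in flight")
```

At the same request rate, the delayed handler holds roughly a hundred times more requests open at once, which is a very different load on connection handling and memory.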


That’s not Cowboy 2’s fault, but rather the setup the author of the article used. Cowboy 2 by default runs in HTTP2 mode, which spawns at least two processes per connection (1 for the connection, 1 for each HTTP2 stream), while Cowboy 1 uses only a single process per connection by default. If you configure the Cowboy 2 code not to use HTTP2 mode (build a stream handler and so forth), it should actually be faster than Cowboy 1.

Already addressed as far as I know?

Yep, this new {active, N} on the latest OTPs is a nice boon!

100ms is HUGE. I work on a near-terabyte database at work that is queried on nearly every request (sometimes dozens of times), and the average response time is under 1ms, with some spikes up to 20ms on especially heavy pages (then there are the report pages that take over a full second, but those are super rarely called).

And I put this last even though it appeared earlier because this is the big thing. Erlang/Elixir/LFE/Gleam/whateverBeamThing is designed for maximum uptime in the face of failures. If you want the fastest network handling then you’d probably want to go with Rust’s actix-web or so, as for raw-metal web serving it consistently benches faster than everything else (I’m surprised it wasn’t on this list when Go is?).


Aye, I skimmed some of those replies (including Essen’s) [re: write a stream handler to bypass the HTTP2 required defaults].

I think it’s fair to bench the out-of-the-box experiences though. Writing that stream handler sounds easy for Essen but, while it doesn’t look impossible, it is decidedly less so for me, particularly since I am far from fluent in idiomatic Erlang and the nuanced implications of certain choices under heavy load.


It’s not an internal API, it’s fully external and designed to be overridden. :slight_smile:


That’s true if your DB is very close to the app, but if you have say 5 ms round-trip latency, then even with an instant app + DB a dozen queries will take 60 ms. A lot of web apps need to query distant systems (DBs or APIs) so I think 100 ms is pretty reasonable for a simulation.
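
As a back-of-the-envelope check of that figure, here is a minimal sketch. It assumes the queries run one after another and that "a dozen" means 12. The function name and its parameters are made up for illustration.

```python
def sequential_query_time_ms(queries: int, round_trip_ms: float,
                             work_ms: float = 0.0) -> float:
    """Total wall time when each query waits for the previous one.

    round_trip_ms: network round trip per query.
    work_ms: time the app and DB spend per query (0 for an "instant" app + DB).
    """
    return queries * (round_trip_ms + work_ms)


print(sequential_query_time_ms(12, 5.0))  # 60.0
```

If some of those queries can be issued concurrently, the total drops toward a single round trip, so the 60 ms is the sequential worst case under these assumptions.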


In three apps I maintained, we had to fetch the currently signed-in user and their entire cart on almost every request, plus their user profile, which in two of the apps was a separate DB table. All of that finished in 12ms and the entire request took 13ms in total. This was consistently the average latency of those requests, for months.

Not sure it’s fair to make Phoenix look like something slower (like Rails).

You need average numbers for every framework for your tests to paint a realistic picture and honestly, 100ms for Phoenix is an abnormally slow response. I’ve only seen it happen in one project: a big API gateway deployed on very cheap, weak hosting (2-core Atom CPU with 128MB RAM).

Well, you’re getting to something I’d been avoiding saying. I’ve encountered a decent proportion of Rails developers who don’t know where their database access time is going. They look at the time it takes to complete the query command issued in Ruby and think that’s the database, not realizing that in many cases it’s 1% waiting on the database and 99% ActiveRecord parsing and building the record objects.

(I’m not kidding about those numbers; in fact that understates what I’ve measured on some queries that return larger (thousand-row) results, where it was 0.1% database access and 99.9% processing the results in the Rails libs.)
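
One way to check where the time goes is to time the raw fetch and the object construction separately. The sketch below is in Python with an in-memory SQLite table rather than Rails and ActiveRecord, so it will not reproduce the split described above. The `items` table, the `Row` class, and `timed_fetch` are hypothetical names, and the code only shows the measurement idea.

```python
import sqlite3
import time
from dataclasses import dataclass


@dataclass
class Row:
    """Stand-in for an ORM record object (hypothetical)."""
    id: int
    name: str


def timed_fetch(conn: sqlite3.Connection, sql: str):
    """Return (objects, fetch_seconds, build_seconds) for one query."""
    t0 = time.perf_counter()
    raw = conn.execute(sql).fetchall()   # database + driver time
    t1 = time.perf_counter()
    objects = [Row(*r) for r in raw]     # time spent building record objects
    t2 = time.perf_counter()
    return objects, t1 - t0, t2 - t1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1000)],
)

objects, fetch_s, build_s = timed_fetch(
    conn, "SELECT id, name FROM items ORDER BY id"
)
print(f"rows: {len(objects)}")
print(f"fetch: {fetch_s * 1000:.3f} ms, build objects: {build_s * 1000:.3f} ms")
```

The same two-timer approach applies in any stack: measure the time until raw rows are available, then measure the mapping layer on its own, before blaming the database.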


Oh, I know. I went through the same measuring dance several times before giving up on Rails sometime in 2017. A lot of time is spent in ActiveRecord.

Currently the bottleneck for massive RPS is inet_drv. Once the NIF-based socket module is fully implemented (if it’s done right/well), that should speed things up by a significant chunk.

Next is the parsing of the HTTP request itself. If you use a Rust or C NIF for this, that will be the next significant speedup.

Next is the serialization of the HTTP response. Same thing as above: C NIF vs managed vs static (the response headers/body never change, or change minimally).

This is not considering roundtrip times to the Database or other actions, as this is beyond the scope of the “useless, not an actual workload; GET RPS benchmark”.

Currently :gen_tcp uses inet_drv.
Currently Cowboy does not support using C NIFs to ser/deser.

Let’s check back next year in 2021 or whenever OTP23 rolls around?

Until then the current numbers won’t really change.


HTTP 1 maybe; with HTTP 2 I think there is not much difference, and an Erlang/Elixir parser would be much easier to read (thanks to binary pattern matching).

I really don’t; 60ms is huge as well. Querying remote servers is not at all common, and even when I do have to do so I keep a persistent connection up. For example, logging in via the local LDAP server here at work adds less than 5ms to the overall request (making it still less than 5ms). 100ms is huge.

I’ve heard this as well. Accessing a database is a lot faster than most people think.

And still, Cowboy 2 running in HTTP2 mode when none of the other server types are is a huge benchmark failing; it is not testing the same things.


I know you rarely have your app server terminate TLS, but it would be interesting to see those numbers as well.
