At work we have a lot of Scala code in our low-latency, high-throughput product, and a couple of us have been advocating Elixir as an alternative to the current codebase. We recently held a 24-hour hackathon and tried to replace a chunk of our code with Elixir, but immediately ran into performance issues. I’m hoping people on the list have some suggestions, because I’ve already picked the brains of the IRC channel and been unable to improve things.
To begin with, I’ll set the stage for the testing we did. We wanted to start by understanding the raw performance of Elixir “out of the box”, so we whipped up a couple of codebases that accept POSTs and return 204 no matter what is sent.
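For anyone who just wants the general shape of such a target, here is a minimal sketch. It is not the code we benchmarked: the module name `NoContentPlug`, port 4000, and the `plug_cowboy` version constraint are all assumptions made for illustration.

```elixir
# Illustrative sketch only, NOT the benchmarked code.
# Assumes Elixir 1.12+ (for Mix.install) and network access to fetch plug_cowboy.
Mix.install([{:plug_cowboy, "~> 2.0"}])

defmodule NoContentPlug do
  # Hypothetical module name; answers every request with 204 and an empty body.
  @behaviour Plug
  import Plug.Conn

  @impl true
  def init(opts), do: opts

  @impl true
  def call(conn, _opts) do
    # The request body is never read; the response is the same for any input.
    send_resp(conn, 204, "")
  end
end

# Start cowboy with the plug above. Port 4000 is an arbitrary choice.
{:ok, _pid} = Plug.Cowboy.http(NoContentPlug, [], port: 4000)
```

Saved as a script (say `no_content.exs`, a made-up filename), it can be kept running with `elixir --no-halt no_content.exs`. A target this small should mostly exercise the HTTP server and the VM rather than application logic.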
We used this code:
Alongside a less complex piece of code:
We used wrk and wrk2 to run various POST performance tests against this code. We used the following script to feed wrk with data to post:
For an example of the flags we used with wrk:
We did the testing on DigitalOcean machines as well as machines inside our own datacenter. We were unable to get beyond 22,000 QPS in any of our tests, no matter what we tweaked. On the same hardware we can swap in some Scala and easily do over 300,000 QPS.
Things that we tried:
- Tweaking sysctls.
- Running wrk from the same box to rule out the network.
- Using GET instead of POST (45k QPS).
- A whole bunch of changes to the code (see the git history for those).
- Using elli instead of cowboy (much worse, ~2200 QPS).
What I’m hoping is that other people on this forum can grab the code and look for obvious problems, and perhaps run wrk against it in whatever environments they have. I spent some time with eflame and Observer trying to find the bottleneck, but I’m not familiar enough with BEAM/Elixir to make any real headway on why things are so slow.
Any help would be really appreciated!