Elixir vs Go Performance (Anton Putra)

I think this Twitter post and YouTube video didn’t get as much attention as I had hoped.
I am still new to Elixir, so I can’t really judge.

https://x.com/antonvputra/status/1865845828192977341

It’s generally understandable that a compiled, statically typed language like Go will be faster than Elixir (which is dynamic and runs on a VM), but I am a bit surprised by the availability results.

2 Likes

:wave:

Some ideas:

  • setting +sbwt none to avoid busy wait consuming the allocated CPU slice (might be irrelevant in this benchmark if there is no wait under load); this was already set
  • ensuring System.schedulers_online is equivalent to the available CPUs inside the container; this should already be done automatically: OTP 23 Highlights - Erlang/OTP
  • ensuring TCP socket options (recbuf (rcvbuf), sndbuf, backlog, reuseport, etc.) are the same for Go and Elixir
  • switching to :json
  • (possibly) disabling Logger or increasing the log level (is Go logging?)
  • separating the Web and DB benchmarks to see where the slowdown is coming from
  • (possibly) spawning processes handling the requests / connections with larger heaps to avoid reallocs
  • (possibly) switching from Bandit to an alternative web server; I remember playing with an Elli-like HTTP/1.1 server that outperformed both Bandit and Cowboy in benchmarks with wrk
  • (possibly) switching from Postgrex to an alternative PostgreSQL client and/or connection pool
  • (possibly) running multiple Postgrex pools

However, none of these steps are common among Elixir developers, so I think the benchmarked app is about as representative (of what we typically write) as it gets. And the benchmark itself highlights some areas for possible improvement :slight_smile:

1 Like

The issue is not Postgrex, but rather the fact that it is comparing Ecto (which would be akin to an ORM):

With something that directly sends INSERT commands over the connection:

And if you are going to set up 500 database connections, I would at least split them over a few connection pools (by setting pool_count). Otherwise you are likely making the pool the bottleneck.
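To illustrate the idea of splitting connections over several pools, here is a minimal Go sketch. It is not how Ecto implements pool_count; the `shardedPool` type, the channel-based pool, and the round-robin selection are all assumptions made for illustration. The point is only that checkouts contend on one of several small pools instead of a single shared one.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pool is a hypothetical stand-in for a connection pool:
// a buffered channel holding connection IDs.
type pool chan int

// shardedPool spreads checkouts over several independent pools.
type shardedPool struct {
	pools []pool
	next  uint64 // round-robin counter (an assumption, not Ecto's strategy)
}

// newShardedPool creates poolCount pools with perPool connections each.
func newShardedPool(poolCount, perPool int) *shardedPool {
	sp := &shardedPool{pools: make([]pool, poolCount)}
	id := 0
	for i := range sp.pools {
		sp.pools[i] = make(pool, perPool)
		for j := 0; j < perPool; j++ {
			sp.pools[i] <- id
			id++
		}
	}
	return sp
}

// checkout picks a pool round-robin and takes a connection from it,
// blocking only if that particular pool is empty.
func (sp *shardedPool) checkout() (shard int, conn int) {
	n := atomic.AddUint64(&sp.next, 1)
	shard = int(n % uint64(len(sp.pools)))
	return shard, <-sp.pools[shard]
}

// checkin returns a connection to the pool it came from.
func (sp *shardedPool) checkin(shard, conn int) {
	sp.pools[shard] <- conn
}

func main() {
	sp := newShardedPool(4, 125) // 4 pools x 125 connections = 500 in total
	shard, conn := sp.checkout()
	fmt.Println("checked out connection", conn, "from pool", shard)
	sp.checkin(shard, conn)
}
```

With a single pool, every request goes through the same queue; with several, each queue sees only a fraction of the traffic.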

Overall, I’d expect Go to have better throughput, but there is definitely a bit of apples to oranges going on. I honestly don’t understand why benchmark authors do not ask for some community vetting before publishing.

20 Likes

The first test, where it was mentioned that Go uses a performant third-party JSON library while Elixir uses Jason, is also geared towards putting Go in a better light.

If raw performance is the main goal, he could have used a library like jiffy or, to make the comparison fair (since using NIFs can be called cheating), limited both sides to the standard library.

3 Likes

What responses to that video would you have liked to see?

2 Likes

To reiterate a few points: I am very new to Elixir.
I was expecting Elixir to be slower, but to do better on availability.

So I was expecting replies on how to improve the benchmark in the area where Elixir should naturally do better than Go, which is availability.

In other words, what is the trade-off? What are we sacrificing performance for?

1 Like

Are you not satisfied with the answer that more DB connections should be given in the pool?

1 Like

If this current implementation handles 10k req/s compared to 60k req/s on the Go side, how can it by design provide availability? Am I missing something about how that measurement is done?

Set a longer timeout on requests and you will receive the responses. The nature of how schedulers work in Erlang is that the more processes you have to handle (in this case most probably 1 connection = 1 process), the slower the whole system becomes.

1 Like

IMO Elixir is much better than Go for a backend, and the thing all these benchmarks miss is the reality of how apps are actually used. The real environment a backend operates in can be thought of as rough terrain, while a benchmark environment is a paved road. These benchmarks take Elixir (a Land Cruiser SUV) and Go (a Corolla), measure how fast the car can go one mile burning x amount of fuel, and conclude the Corolla is a better car because the numbers are better. Meanwhile people who actually have to get over rough terrain drive the Land Cruiser.

There’s a highly upvoted post on the Go subreddit about a nil dereference crashing the entire app and causing $100k in lost revenue. https://www.reddit.com/r/golang/comments/18sncxt/go_nil_panic_and_the_billion_dollar_mistake/

It seems extremely unlikely for a similar programming mistake to take down an Elixir app.
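For readers less familiar with Go: an unrecovered panic in any goroutine terminates the whole process, so isolation has to be added by hand with `recover`. The sketch below is a hypothetical illustration (the `device`, `handle`, and `safely` names are made up, and this is not the code from the Reddit post) of confining a nil dereference to a single request.

```go
package main

import "fmt"

// device is a hypothetical record type used only for this example.
type device struct{ name string }

// handle dereferences d without a nil check: the programming mistake in question.
func handle(d *device) string {
	return d.name
}

// safely runs fn and converts a panic into an error, so one bad
// request does not take the whole process down.
func safely(fn func() string) (out string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return fn(), nil
}

func main() {
	// The nil dereference panics inside handle, but safely turns it into an error.
	_, err := safely(func() string { return handle(nil) })
	fmt.Println("request failed, process still running:", err != nil)

	// A valid request goes through the same wrapper untouched.
	out, _ := safely(func() string { return handle(&device{name: "sensor-1"}) })
	fmt.Println(out)
}
```

On the BEAM this kind of isolation is the default: a crash is confined to the process that raised it, and a supervisor can restart it.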

12 Likes

A couple more differences after looking into the serialization aspect:

  1. They are using Jason.encode! instead of the more efficient Jason.encode_to_iodata!. Making this change makes it 30% faster on my machine and drastically reduces memory allocation (this is what Phoenix’s json would have done by default)

  2. They are not using the derived App.Device when encoding, which means the shape is not preallocated

  3. The Go version encodes the current time once. The Elixir version encodes it 4x (twice to send it to the database, twice for JSON). Those things tend to cause a large difference in benchmarks. I replaced it here and it brought a 10%-20% improvement

Honestly, I don’t understand why it suddenly falls over in the first test. However, without a way to reproduce it, it is impossible for me to answer it. I cannot reproduce it on my machine. I am confidently pushing over 80k req/s (while Go sits at 110k).

18 Likes

Here is a pull request that makes the Elixir code closer to the code in Go and uses better defaults (for example, what Phoenix would have used): Make benchmarks with Elixir fairer by josevalim · Pull Request #370 · antonputra/tutorials · GitHub

29 Likes

Great. I wonder if the Logger level should be set to :error or similar, but I can’t quite make out what the Go server logs.

2 Likes

Oh, I’ve seen this vid. I am preparing a PR for the Peep library too. It appears that the bucket search code there was linear rather than binary. My change will improve bucket search performance by about 30 times (because the source has around 240 buckets (why???))
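To show why this matters with around 240 buckets, here is a minimal Go sketch contrasting the two lookups. It is not Peep's actual code; the function names and the exponential bucket layout are assumptions for illustration. A linear scan compares against up to 240 bounds per observation, while a binary search needs about 8 comparisons.

```go
package main

import (
	"fmt"
	"sort"
)

// bucketLinear returns the index of the first upper bound that is >= v,
// scanning from the start. It returns len(bounds) for the overflow bucket.
func bucketLinear(bounds []float64, v float64) int {
	for i, b := range bounds {
		if v <= b {
			return i
		}
	}
	return len(bounds)
}

// bucketBinary finds the same index with a binary search over the
// sorted bounds, using the standard library's sort.SearchFloat64s.
func bucketBinary(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

func main() {
	// Build 240 sorted upper bounds (hypothetical exponential layout).
	bounds := make([]float64, 0, 240)
	b := 1.0
	for i := 0; i < 240; i++ {
		bounds = append(bounds, b)
		b *= 1.1
	}

	v := 12345.0
	fmt.Println("linear:", bucketLinear(bounds, v))
	fmt.Println("binary:", bucketBinary(bounds, v))
}
```

Both functions return the same bucket index for any sorted list of bounds; only the number of comparisons differs.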

7 Likes

That’s really cool. I also did a pass to optimize the export itself: Optimize prometheus export by josevalim · Pull Request #24 · rkallos/peep · GitHub :slight_smile:

10 Likes

Even if Elixir “only” reaches half of Go’s performance, it is already a huge win given everything else that Elixir gives/enables. I’m saying this as someone who mainly develops in Go. Also, if performance is the ultimate focus, C or Rust should be used instead of Go.

5 Likes

Indeed. Golang is optimal for bigger companies with many devs. It allows for easy onboarding in new codebases.

1 Like

The company I work at has recently started to use this Peep library, and my colleague made the PR to Anton Putra’s repo to add this benchmark.

The outcome is that this Peep library now has some contributions improving performance and detailed reviews, haha

Btw the new video is up but the numbers seem quite strange:

3 Likes

Thanks for sharing. I am indeed a bit surprised that the new round did not see any improvement, given my pull request yielded a 20-30% improvement locally. I cannot reproduce the failures from the first benchmark on my machine either. I even used a separate machine to push traffic locally and I get 80k req/s for Elixir vs 110k req/s in Go.

However, it is worth saying that it is 100% expected for Go to be faster than Elixir. After all, we are comparing a static imperative compiled language to a dynamic functional language running on a VM, so a reasonable difference (within the same order of magnitude) is expected. For example, TechEmpower main branch puts Go around 5x faster than Phoenix for single-query runs. I am mostly curious about the failure rate, which is not expected, but as far as I know, there is no way to reproduce the benchmarks and the production logs were not shared with us. So there isn’t much we can do at the moment.

19 Likes

To me, it would be interesting to continue running the simulation for both languages, to see how the different applications break. I am impressed by how the Elixir application can continue when it is clearly overworked.

In the second video, part 1, Anton stops the benchmarks at around 60k req/s, just when it starts becoming super interesting. I would love to see the result if he continued to increase the load until, say, 200k req/s, so we can see both applications completely break.

2 Likes