mpugach
Elixir synthetic performance test
My boss is not convinced that we should use Elixir for our production apps.
The latest round of our debate was a benchmark.
He wrote two small apps using Vert.x and Elixir.
They render static text and real data from MongoDB serialized into JSON.
For the static response, Vert.x performed about 10% better and was also more stable once the JVM was warmed up. Both versions show some lag after a certain number of requests. We think it is GC, but in the Elixir version the lag appears more frequently.
For the real data response, the Mongo serialization was broken in the Elixir version.
So I fixed the serialization, removed Plug (since it is not needed here) and experimented with HiPE compilation. I also moved to cowboy 2.0.0-pre.6 (I did not measure this part enough, but nothing seemed to change for me). Here is the test app (you can tune the wrk and ab params for your machine, because mine is an old one).
Can you please help me build the most optimized app for the test?
Personally, I have doubts about this part. The MongoDB driver first creates some structs, then we transform those structs to strings, which I think is too much work.
Also what can we do to optimize the static part more?
Most Liked Responses
sasajuric
I wrote some tips in this post. It’s about Phoenix, but some points can be applied to plain cowboy:
- Bench an OTP release built for production.
- Change the setting of the max_keepalive option. The default of 100 means that cowboy drops the connection after it serves 100 requests, so there will be a lot of reconnecting.
- Tune wrk parameters to use as few connections as possible to bring the server a little below overload. The reason is that we want to test the behaviour of the server in the normal mode of operation. Overload is not sustainable for longer periods, so measuring an overloaded server doesn’t tell us much.
I usually do this by starting a test with e.g. 4 threads and 16 connections and observing the load in htop. If it’s locked to 100% then I need to reduce the number of connections. The target I aim for is constant load above 90% but less than 100%. Keep in mind that the number of conns must be divisible by the number of threads, so if you need to reduce the load, then the next step is e.g. 3 threads with 15 conns.
Once you get satisfying numbers for a brief test (e.g. 10s), run a slightly longer test (e.g. 60s). If all went well, the numbers should be roughly similar in the longer run.
Once you have a stable behaviour for both implementations, I think a much longer test would be needed to take GC effects into account. I’d likely go for a test of a couple of hours, paying attention that nothing else runs on the test server.
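As a rough illustration of the max_keepalive tip above, here is how the option could be passed when starting a plain Cowboy 1.x listener from Elixir. This is only a sketch under assumptions: it presumes Cowboy 1.x is a dependency, `BenchApp.Handler` is a hypothetical handler module, and the port and the value 5_000_000 are arbitrary. Cowboy 2 pre-releases use a different start function and options shape, so check the docs for the version you run.

```elixir
# Sketch only: assumes Cowboy 1.x; BenchApp.Handler is a hypothetical handler.
dispatch = :cowboy_router.compile([{:_, [{"/", BenchApp.Handler, []}]}])

{:ok, _pid} =
  :cowboy.start_http(
    :http,
    # number of acceptor processes
    100,
    # transport options
    [port: 4000],
    # protocol options: raise the keep-alive limit so connections
    # are not dropped after every 100 requests during the bench
    [env: [dispatch: dispatch], max_keepalive: 5_000_000]
  )
```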
However, no disrespect, but I personally think that these simple bench tests are a pretty shallow criterion for choosing a technology. Raw speed only matters to some extent, and past that point it might even be counterproductive. Way back when I was evaluating Erlang, I made a quick simulation of the target server, and then performed a 12-hour test with 10x of the estimated capacity to verify whether the performance was good enough. Once I proved that it was, I didn’t care anymore whether something else was faster, because it was good enough for my case.
The thing is that there’s more to a system than just plain speed. There are other important factors to consider, such as fault-tolerance, fair share of CPU distribution, support for runtime analysis (so we can understand what goes on in a system that handles thousands or millions of different “things”).
I had never heard of Vert.x before, but seeing that it’s based on a non-blocking, event-based approach and runs on Java, I’d guess that it suffers from issues such as cooperative scheduling, anonymous implicit activities, and callback hell.
Here’s one way how we could compare Elixir (or any other BEAM language) against that. Make a simple server which has two requests: short and infinite. Make short do something trivial like return “Hello World!”. Make infinite run an infinite tight CPU bound loop. For example in Elixir it’s as easy as defp infinite(), do: infinite(). Now start the server, and for the sake of simplicity specify you want to use just one scheduler thread (you can do it with --erl "+S 1"). Issue one infinite request. Then verify that your CPU usage is constantly at 100%. Now issue a short request and observe how you get an immediate response.
This should prove that an occasional long running CPU bound request will not block your entire system nor significantly affect your latency. Then you can try the same thing with Vert.x and see the behaviour. Assuming you properly configure just one worker OS thread, I’d be very surprised if Vert.x wasn’t completely blocked by the infinite request.
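The experiment can be approximated without an HTTP server at all. The sketch below is an assumption-laden stand-in for the real test: the module name `PreemptionDemo` is made up, and a spawned process plays the role of the infinite request while a `Task` plays the short one. It measures how long the short job takes to answer while the burner is spinning.

```elixir
defmodule PreemptionDemo do
  # Tight CPU-bound loop that never returns, as described in the post.
  def infinite(), do: infinite()

  # Stand-in for the trivial "short" request.
  def short(), do: "Hello World!"

  # Starts the burner, times one short job in a separate process,
  # then kills the burner. Returns {microseconds, reply}.
  def run do
    burner = spawn(fn -> infinite() end)

    {micros, reply} =
      :timer.tc(fn ->
        Task.async(&short/0) |> Task.await()
      end)

    Process.exit(burner, :kill)
    {micros, reply}
  end
end

IO.inspect(PreemptionDemo.run())
```

Saved as e.g. demo.exs, it can be run with `elixir --erl "+S 1" demo.exs` to force a single scheduler thread. The short job should still complete promptly, because the scheduler preempts the burner process.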
Another interesting test is debugging in production. Keep the previous system running, and make sure that the infinite request is still running. Our aim is to discover what causes high CPU usage without restarting the system, or needing to add additional logs and redeploy. A simple way to do this is to start the observer (:observer.start) and go to the processes tab. Wait for the next refresh (it may take a couple of seconds). At the top of the list you should see your top CPU burner process. By double clicking on it, you should see its current stacktrace. Finally you will be able to kill the process by right clicking on it in the processes list.
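The same investigation can be sketched from a plain shell attached to the node, without the observer GUI. This is not how observer itself is implemented. It is a minimal approximation that samples each process's reduction count twice and picks the one that grew the most. The module names `BurnerHunt` and `Spin` are hypothetical.

```elixir
defmodule BurnerHunt do
  # Map of pid => reductions for every live process.
  # Processes that exit mid-scan return nil and are skipped.
  defp snapshot do
    for pid <- Process.list(),
        {:reductions, n} <- [Process.info(pid, :reductions)],
        into: %{},
        do: {pid, n}
  end

  # Returns the pid whose reduction count grew the most over the interval.
  def top_burner(interval_ms \\ 500) do
    before = snapshot()
    Process.sleep(interval_ms)

    snapshot()
    |> Enum.map(fn {pid, n} -> {pid, n - Map.get(before, pid, 0)} end)
    |> Enum.max_by(fn {_pid, delta} -> delta end)
    |> elem(0)
  end
end

defmodule Spin do
  # Stand-in for the stuck request handler.
  def loop(), do: loop()
end

spawn(&Spin.loop/0)
pid = BurnerHunt.top_burner()

# Inspect what the process is doing, then kill only that process.
IO.inspect(Process.info(pid, :current_stacktrace))
Process.exit(pid, :kill)
```

Nothing else in the system is restarted or touched; only the offending process is terminated.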
What this test proves is that BEAM goes way beyond “you can start up a lot of small activities”, and offers us some additional ways of managing our production and understanding what went wrong, which is very important if we plan on handling thousands or millions of different requests. We were able to quickly find what causes our CPU problems, and kill the thing without disturbing anything else in the system, or needing to restart the whole thing. A more realistic report of how observer was used to analyze a remote server can be found in the famous 2M Phoenix sockets article. AFAIK this is something not possible with most (if not all) other technologies out there.
Btw. if anyone’s interested, and in the area, I plan to demo this live at my upcoming ElixirDaze talk next month.
sasajuric
Oh I’m positive there’s always a workaround, even if it’s not explicitly supported by the library itself. Worse comes to worst, you can always start such activity in a separate OS process.
But the thing is that you have to know upfront whether e.g. some request processing is blocking. And that becomes increasingly harder as the project becomes more complex (which IME inevitably happens for every software project other than the ones which are cancelled). The thing is that blocking might happen unintentionally, due to a bug or a non-optimal piece of code. Not only have I seen such a thing happen and paralyze production completely, but I actually caused it myself by introducing suboptimal code.
With Elixir/Erlang, such a mistake is much less likely to take the whole production down, or even have observable effects on it.
Another problem with explicitly identifying the blocking code is this. How can I know that foo() is potentially blocking for a long time? To know that, I need to understand the complete stack trace of foo including my own code, as well as the code of all dependencies invoked from it. And I need to consider every possible input that can arrive to foo. And when I make my decision, it’s only based on the current code snapshot. A seemingly simple and unrelated change might break my expectations tomorrow. I’m exaggerating, yes, but it’s a thing that becomes increasingly harder to manage as the code becomes more complex.
That problem is non-existent in Elixir/Erlang. If you want to run things separately you run them in different processes. It’s as simple as that.
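A minimal sketch of that idea: run a piece of work in its own process and give up on it after a timeout, without caring in advance whether it blocks. The module name `Isolated` and the 100 ms default are assumptions for illustration. Note that `Task.async/1` links the task to the caller, so a crashing task would also take the caller down unless it traps exits.

```elixir
defmodule Isolated do
  # Runs fun in a separate process and waits up to timeout_ms for it.
  # If it does not answer in time, only that process is killed.
  def run(fun, timeout_ms \\ 100) do
    task = Task.async(fun)

    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, result} -> {:ok, result}
      {:exit, reason} -> {:error, reason}
      nil -> {:error, :timeout}
    end
  end
end
```

For example, `Isolated.run(fn -> 1 + 2 end)` returns a result tuple, while a function that never returns, such as `fn -> Process.sleep(:infinity) end`, is cut off after the timeout and the caller carries on.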
I’m really not familiar with the JVM, but I’m not surprised that there’s something like that given its maturity. However, libraries such as Vert.x implement an additional lightweight mechanism on top of the VM, and therefore the request handlers are likely not special VM entities. In fact, in event-based technologies, request handlers are usually completely anonymous.
Now given that you could have a single thread multiplexing thousands of different requests there are some questions. Can you trace the execution of a single request? Can you get info (e.g. memory usage, stack trace) of a single request handler? Can you terminate a single request handler without disturbing anything else, even if that request is blocking? If the answer is no, then the tech is nowhere near capabilities of Elixir/Erlang when it comes to analyzing and fixing a live running system.
I’d be somewhat surprised if in any case Elixir turned out to be faster. But as said, considering only the speed, and measuring it in a 15s synthetic bench, is IMO not a good comparison. The question should be whether both technologies are sufficiently performant for the real problem you’re solving. If yes, then it’s perhaps time to consider other aspects of both technologies, such as fault-tolerance support.
If not, and assuming you invested some effort into making it faster, then I guess you need to discard the option which is not performant enough, even if that option is Elixir.
Mandemus
Vert.x is a polyglot implementation of the node paradigm on the JVM. You can spawn ‘verticles’ to handle a single API endpoint, which all communicate through a messaging backend that extends to the clients. Think node + messageMQ perhaps. They do have some good abstractions to reduce callback hell.
I played with it for awhile but in the end it was a lonely affair, with very little activity on their Google group.
It should not be the GC. GC in Erlang is per-process, so there is no stop-the-world event.