Techempower benchmarks

sasajuric · June 1, 2017, 5:49pm

What I’m saying is that I can’t conclude anything reliably from that benchmark. That’s my main point I’m arguing in this thread.

When I first evaluated Erlang, I did a quick, dirty, and a very simple simulation of the target system, issuing a 10x of the expected load for the duration of 12 hours. I also did a few minor experiments, just to make sure I can do some typical stuff, such as talking to the database, working with XMLs, and such. I needed to see first hand that I can handle the desired load. Once I established that, I didn’t care much about few microseconds here/there.

Sure, but the thing is that raw speed is usually not the only, nor the most important factor. Once you have the tech which can handle your load, other things start to matter, such as the support for fault-tolerance, stable latency, and troubleshooting a running system. Erlang/Elixir excel at this, and that matters, because it improves the uptime and availability, and makes the life of developers much easier. I find this important, because the cost of downtime and the cost of developers are IME much higher than the cost of hardware.

Moreover, with proper algorithmical and technical optimizations, in most cases the observed speed between two different stacks usually becomes much smaller than in contrived benchmarks, such as TE. In particular, when it comes to Erlang, based on the fact that I worked with it for 7 years, and that others have done wonders with it, I’m confident that it will suffice in the vast majority of the cases. In the cases when it doesn’t (e.g. heavy number crunching), I can always step outside and reach for say, C, Rust, Go, or some other more performant language.

That might be true, but one big problem I have with TE is that they put the ranking list right in your face. It’s literally the first thing they show you. So there’s a big implication that a higher positioned stack is necessarily better.

Another problem is that I think the tests themselves are shallowly executed, which has also been argued by others in this thread.

Finally, the tasks themselves seem quite contrived. Take a look at the updates task. My first idea to optimize this would be to try to update everything in a single round-trip to the database. If that didn’t work, I’d at least remove the needless record read. But that’s not allowed by task rules. Which makes the task highly contrived IMO.

There’s a whole other bag of tricks we can throw at the problem. They usually come with some associated trade-off, but those trade-offs can only be evaluated in real life (which is not TE). Caching is for example not allowed, but that’s one of the main optimization techniques. Since you can do easy caching in Erlang without needing to run an additional process and serialize to/from json, it could do wonders for the perf, especially compared to many other techs. But that’s not tested at all in TE. So what can I conclude from the results of the TE bench? For me, the answer is: nothing Not even about performance, let alone about any other properties of a tech stack.

Assuming that you pay some developers, administrators, support, and others to manage the system with such large userbase, 1.5k/mo doesn’t seem like a significant cost in the total expenses sheet. You could get bigger savings by choosing a technology which allows developers to efficiently and confidently manage that kind of load, and to keep the system stable and running as much as possible, and to reduce the load on the support team. Again, raw performance is just a part of the story. It matters, sure, but up to a point. You need to balance it with other properties.

Also, in most cases, I don’t expect a dramatical difference between Erlang and other technologies in terms of hardware costs. While for some cases (e.g. computation heavy tasks) Erlang can be an order of magnitude slower, that doesn’t necessarily mean that you can save an order of magnitude in terms of hardware cost, unless the 100% of what you do is crunching numbers.