Anyone know why Elixir's CPU usage is so high in this benchmark?

OvermindDL1 · February 8, 2019, 6:07pm

Ahhh, that makes so much more sense! That does sound like it would be extremely slow. o.O

zacksiri · February 9, 2019, 7:03am

Yeah, slowness is not a problem as long as you design the architecture to handle it. I learned that the hard way.

abtrapp · February 14, 2019, 7:59am

For those who don’t follow the blog: busy waiting test on https://stressgrid.com/blog/beam_cpu_usage/

Your article? That’s what I call a detailed analysis - TNX!

mythicalprogrammer · February 14, 2019, 12:35pm

The BEAM VM has a scheduler for each core, and when there is no work the schedulers stay busy in a busy wait.

It also does preemptive scheduling.

Looking at raw CPU usage means nothing without context. The BEAM VM simply does a lot of work for you that other VMs do not, in terms of concurrency and the actor model.

DevotionGeo · February 14, 2019, 5:41pm

I think Elixir with NIFs written in Rust is the combo that can be the safest, the fastest and the most scalable if done the right way.
Rust has been on my to-do list for a long time, but somehow I never find the time for it.

OvermindDL1 · February 14, 2019, 5:58pm

Ooo, reading. ^.^

In our test, we found that BEAM’s busy wait settings do have a significant impact on CPU usage.

Yep, on this workload this is also what I would expect. Busy waiting also increases responsiveness when the system is loaded down by other work. The way the scheduler works on Linux is that it reduces the timeslice of programs that are waiting. As a very high-level overview of Linux scheduling: there is a 40ms (or was it 20ms…) ‘timeslice’ that gets allocated based on the activity of all ‘active’ (non-sleeping/zombie/etc.) programs, and among those active programs the ones needing the most CPU get the largest share of the timeslice, based on the overall CPU needs of them all. Thus if you wait on I/O and there are other running processes using CPU, you could be delayed by up to 40/80ms, which can be quite substantial for a web response that you want served very fast. Busy waiting therefore ensures that the BEAM remains ‘hot’ in the timeslices allocated to it when there is contention on the system.

However, when there is no contention there should not be any difference in responsiveness, hence the results shown in the article.
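
For anyone who wants to try the busy wait settings themselves, here is a minimal sketch of the relevant emulator flags. It assumes OTP 21 or later (where the dirty scheduler variants exist) and a release that reads a `vm.args` file; the value `none` is only an illustration, and whether it helps or hurts depends on how much contention the machine sees.

```
## vm.args (sketch; the values are an assumption, tune them for your own workload)
## Disable busy waiting for the normal schedulers
+sbwt none
## Disable busy waiting for the dirty CPU and dirty I/O schedulers (OTP 21+)
+sbwtdcpu none
+sbwtdio none
```

The same flags can also be passed directly on the `erl` command line.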

The safest in specific contexts, but don’t forget that NIFs should not take long or use much CPU in general (there are processing threads for NIFs to use, but they are very limited in number). In general, anything needing more than a modicum of CPU power should be a Port. NIFs should be only for very quick functions or for things that wait on I/O often (I/O waiting threads are cheaper than CPU processing threads). And regardless of how safe Rust is, NIFs still have specific performance characteristics that need to be respected.
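
To illustrate the Port suggestion, a minimal sketch in Elixir. The `PortSketch` module and its `echo/1` function are made up for illustration, and `cat` merely stands in for whatever external program would do the heavy CPU work; it assumes a Unix-like system with `cat` on the PATH and a payload small enough to come back in a single message.

```elixir
defmodule PortSketch do
  # Hypothetical helper: sends `payload` to an external OS process and
  # returns whatever the process writes back on stdout.
  def echo(payload) when is_binary(payload) do
    # `cat` is only a stand-in for a real CPU-heavy external program.
    exe = System.find_executable("cat") || raise "cat not found on PATH"

    # The external program runs in its own OS process, so it cannot
    # block or crash the BEAM schedulers the way a misbehaving NIF can.
    port = Port.open({:spawn_executable, exe}, [:binary, :exit_status])

    # Write the payload to the program's stdin.
    true = Port.command(port, payload)

    receive do
      # Output from the program arrives as an ordinary message.
      {^port, {:data, data}} ->
        Port.close(port)
        {:ok, data}

      # The program exited before producing any output.
      {^port, {:exit_status, status}} ->
        {:error, {:exit_status, status}}
    after
      5_000 ->
        Port.close(port)
        {:error, :timeout}
    end
  end
end
```

Calling `PortSketch.echo("hello\n")` round-trips the string through the external process. The trade-off against a NIF is the cost of copying data across the process boundary, in exchange for isolation from the VM.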

sribe · February 14, 2019, 6:41pm

I don’t remember where I read it, but within the past year or two I read a gnarly article going into some of the guts of drivers. It turns out that with high-end SSDs, interrupt service time is substantially greater than device latency, so in order to take advantage of them, drivers have to revert to polling instead of being interrupt-driven. Which is kind of interesting: in order to take advantage of modern storage, we have to go back to a technique from the dawn of time which has been “out of date” for decades…

axelson · February 14, 2019, 7:23pm

That’s very interesting, do you have a source for that? I’d be interested in reading more (although it should probably be posted as a new thread).