Performance of Erlang/Elixir in Docker/Kubernetes

If you try to compact the load as much as possible, the emulator will run using one scheduler. You may get slightly higher latency for work in that specific instance. However, you will also have fewer threads running at the same time, which should make it easier for the OS to schedule all of your containers efficiently.

So by compacting load more aggressively you will sacrifice performance in a specific emulator for the greater good of all the other systems running on the same machine.

At least that is my theory. I don’t really have any way to test it, and I’m sure it will vary depending on what your application does, how many applications run on the same machine, and so on.

So this is the flag I suspect has been causing me trouble when running under CFS quotas alongside other processes.

For reference, see this design doc for the CFS: https://www.kernel.org/doc/Documentation/scheduler/sched-design-CFS.txt

TL;DR:

CFS’s task picking logic is very simple: it always tries to run the task with the smallest runtime value (i.e., the task which executed least so far). CFS always tries to split up CPU time between runnable tasks as close to “ideal multitasking hardware” as possible.

The theory is that the Erlang scheduler's busy-looping causes the CFS to record more CPU usage against the Erlang scheduler threads than they actually spend doing useful work in my code, so the scheduler threads get throttled in order to be fair to other threads. For example, suppose the quota is 100us of work and I do 50us of work in my code before needing to wait on a socket. If there’s no other work for the Erlang scheduler to do, the busy-looping kicks in and burns the remaining 50us for possibly no benefit. The CFS then thinks I’ve used my entire allocation and throttles me, instead of allowing me to do more work shortly after.

So in my case, running Erlang in Docker, assuming the theory is correct, I would want to set +sbwt to none or very_short to avoid being throttled by the CFS, assuming that being throttled is worse than not immediately reacting to an event that might allow the Erlang scheduler to run my process.
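For anyone wanting to try this, here is a minimal sketch of how the flag could be passed. It assumes a POSIX shell and that your container starts the VM through `erl` or the standard Elixir launch scripts; the commands in the comments are only illustrations, so adapt them to your own setup.

```shell
# Set whichever of these applies to how your container starts the VM.

# Plain Erlang: erl picks up extra emulator flags from ERL_FLAGS.
export ERL_FLAGS="+sbwt none"

# Elixir (mix, iex, elixir): the launch scripts pass ELIXIR_ERL_OPTIONS on to erl.
export ELIXIR_ERL_OPTIONS="+sbwt none"

# Equivalent one-off invocations (not run here):
#   erl +sbwt none
#   iex --erl "+sbwt none" -S mix
#
# In a release, the same flag goes on its own line in vm.args:
#   +sbwt none
```

Swapping `none` for `very_short` keeps a small amount of busy-waiting, which may be a reasonable middle ground if wake-up latency matters to you.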

If your CPU use is not being constrained by a kernel scheduler, and you are instead trying to out-compete other processes on the same box, the same rationale won’t apply.

Here’s another performance snag that might affect at least some of us; it's not specific to Erlang, but to those on older AWS instance types:

Two frequently used system calls are ~77% slower on AWS EC2

and also, with particular reference to Docker (scroll to the end past all the Java stuff):

Yet another reason your Docker containers may be slow on EC2: clock_gettime, gettimeofday and seccomp

TL;DR
this call [gettimeofday] in Docker can take almost 6 times as long as a non-containerized call due to the overhead of seccomp filters running on the system call

If anyone (like myself) stumbles across this old thread, I want to say that +sbwt none made a significant impact on our CPU usage (as CFS measures it). We’re seeing a 50% drop. If you’re running in an environment where CFS is in play, you should experiment with this!
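A rough way to run that experiment is to compare the cgroup's CFS throttling counters before and after changing the flag. The sketch below assumes a Linux container where `cpu.stat` is readable; the `throttle_ratio` helper name is made up for illustration, and the file path depends on whether the host uses cgroup v1 or v2.

```shell
# Hypothetical helper: print how many CFS periods the cgroup was throttled in,
# based on the nr_periods and nr_throttled counters in a cpu.stat file.
throttle_ratio() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0)
        printf "%d/%d periods throttled (%.1f%%)\n", throttled, periods, 100 * throttled / periods
      else
        print "no CFS periods recorded"
    }
  ' "$1"
}

# Typical locations inside a container (not run here):
#   throttle_ratio /sys/fs/cgroup/cpu.stat        # cgroup v2
#   throttle_ratio /sys/fs/cgroup/cpu/cpu.stat    # cgroup v1
```

The counters are cumulative, so take a reading, let the workload run for a while, and take another; the difference between the two is what matters when comparing one `+sbwt` setting against another.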

@ellispritchard your explanation makes a lot of sense and I think you’re correct. Just here to +1 and encourage others to try this flag. I really appreciated this thread and your findings. Thanks :)

Thanks @mattbaker, glad to find it’s still relevant (and helpful!)
