Is anyone aware of a guide that describes the optimal, or even just recommended, VM configuration options for specific deployments?
Scheduler, inet, memory etc. for cloud, containerized, bare metal, embedded etc
We had massive issues running Elixir in a container until we tweaked the VM settings. This is from our vm.args.eex:
## Enable kernel poll
+K true
## Async thread pool size
+A 128
## Increase number of concurrent ports/sockets
## Same as setting: -env ERL_MAX_PORTS 65536
+Q 65536
## Disable scheduler busy waiting; see:
## https://stressgrid.com/blog/beam_cpu_usage/
+sbwt none
+sbwtdcpu none
+sbwtdio none
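For anyone copying these: you can check that the running VM actually picked the flags up by querying system_info from an attached IEx session. A quick sketch (the three items below are the ones I believe correspond to +Q, +A, and the scheduler count):

# Confirm the flags took effect on the running node
:erlang.system_info(:port_limit)        # max simultaneous ports, set by +Q
:erlang.system_info(:thread_pool_size)  # async thread pool size, set by +A
:erlang.system_info(:schedulers_online) # schedulers actually running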
Thanks for this! Bookmarking it.
We’ve had an easy time running in containers, but I want to call out the sbwt options in particular: they significantly dropped our CPU usage (at least as CFS was measuring it).
I’m so curious about these other options, @scottmessinger; if you ever write something up about what led you to them, I’d love to read it.
It looks like +K was removed in OTP 21, and +A may have less impact now with the introduction of dirty schedulers. +Q appears to default to 65536, so maybe that’s not needed anymore?
We were seeing our Elixir/Phoenix servers crash after ~1000 simultaneous connections.
Also, we had to change the tcp_keepalive_time and file-max sysctls:
sysctl -w net.ipv4.tcp_keepalive_time=250 fs.file-max=1024000
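One thing worth flagging for anyone who finds this later: sysctl -w changes don’t survive a reboot. A sketch of persisting them, assuming a drop-in file (the 99-beam.conf name is just an example):

# /etc/sysctl.d/99-beam.conf (filename is an example)
net.ipv4.tcp_keepalive_time = 250
fs.file-max = 1024000

# Reload all sysctl configuration without rebooting:
sysctl --system

Also, the ~1000-connection ceiling sounds suspiciously like the default per-process open-file limit of 1024 (ulimit -n); fs.file-max raises the system-wide cap, but the per-process limit has to be raised separately.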
It looks like +K was removed in OTP 21, and +A may have less impact now with the introduction of dirty schedulers. +Q appears to default to 65536, so maybe that’s not needed anymore?
It would be wonderful if the VM settings I used weren’t needed anymore; figuring them out was miserable and confusing. There was also very little written about it, so there was a bunch of guessing and hoping on my part.
+sbwt none
+sbwtdcpu none
+sbwtdio none
I think the important parts were the ones above. Do you know much about those settings, @mattbaker? I’d love to know what I was changing; at the time, it felt like reciting a magical incantation.
These are the options I’m using as of last night:
+c true
+C multi_time_warp
+sub true
+swt very_low
+swtdio very_low
+sbwt none
+sbwtdcpu none
+sbwtdio none
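In case it helps other container users: these flags don’t have to live in vm.args; the runtime also reads them from the ERL_FLAGS environment variable at startup, which can be easier to vary per environment. A sketch (base image and values illustrative):

# Dockerfile fragment (base image illustrative)
FROM elixir:1.14
ENV ERL_FLAGS="+sbwt none +sbwtdcpu none +sbwtdio none"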
Is this exclusively for containers? Or universally?
For anyone coming to this thread in the future, here’s some of what @garazdawi said in the thread on EFS:
See his full response and the rest of the thread here:
It’s the VM putting a scheduler in busy wait so the CPU core doesn’t go to sleep. The problem in containerized environments, like AWS, is that it can eat into your CPU quotas.
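As I understand it, busy-wait time is not counted as active in the VM’s own scheduler wall time statistics, so comparing those against what the OS or cgroup reports can show how much of your apparent CPU usage is just busy waiting. A rough sketch of sampling utilization from IEx (the counters have to be enabled first):

# Sample per-scheduler utilization over one second
:erlang.system_flag(:scheduler_wall_time, true)
s0 = :erlang.statistics(:scheduler_wall_time) |> Enum.sort()
Process.sleep(1_000)
s1 = :erlang.statistics(:scheduler_wall_time) |> Enum.sort()

# Fraction of wall time each scheduler spent active during the sample
for {{id, a0, t0}, {id, a1, t1}} <- Enum.zip(s0, s1) do
  {id, Float.round((a1 - a0) / (t1 - t0), 3)}
end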
In my experience, the BEAM was not well optimized for containerized environments before OTP 23. For example, scheduler counts did not match the CPU limits in our k8s manifests. My team found the RabbitMQ Runtime Tuning guide, which helped us a lot. After OTP 23 came out, we noticed we didn’t need most of the tunings anymore, so we removed them.
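For anyone stuck on pre-23 releases: one workaround we’ve seen for the scheduler-count mismatch is pinning the scheduler count to the container’s CPU limit explicitly with +S. A sketch, assuming a 2-CPU limit (values illustrative):

## vm.args: +S Schedulers:SchedulersOnline, pinned to the container quota
+S 2:2

# Matching Kubernetes manifest fragment (illustrative)
resources:
  limits:
    cpu: "2"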