Is anyone aware of a guide that describes the optimal, or even just recommended, VM configuration options for specific deployments?
Scheduler, inet, memory etc. for cloud, containerized, bare metal, embedded etc
We had massive issues running Elixir in a container until we tweaked the VM settings. This is from our vm.args.eex:
## Enable kernel poll
+K true
## Async thread pool size
+A 128
## Increase number of concurrent ports/sockets
## Same as setting: -env ERL_MAX_PORTS 65536
+Q 65536
## Disable scheduler busy waiting; see:
## https://stressgrid.com/blog/beam_cpu_usage/
+sbwt none
+sbwtdcpu none
+sbwtdio none
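For anyone copying these: you can check that the running VM actually picked the flags up by querying system_info from an attached IEx session. A quick sketch (the three items below are the ones I believe correspond to +Q, +A, and the scheduler count):

# Confirm the flags took effect on the running node
:erlang.system_info(:port_limit)        # max simultaneous ports, set by +Q
:erlang.system_info(:thread_pool_size)  # async thread pool size, set by +A
:erlang.system_info(:schedulers_online) # schedulers actually running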
Thanks for this! Bookmarking it.
We’ve had an easy time running in containers, but I want to call out the sbwt options in particular: they significantly dropped our CPU usage (at least as CFS was measuring it).
I’m so curious about these other options, @scottmessinger; if you ever write something up about what led you to them, I’d love to read it.
It looks like +K was removed in OTP 21, and +A may have less impact now with the introduction of dirty schedulers. +Q appears to default to 65536, so maybe that’s not needed anymore?
We were seeing our Elixir/Phoenix servers crash after ~1000 simultaneous connections.
Also, we had to change the tcp_keepalive_time and file-max sysctls:
sysctl -w net.ipv4.tcp_keepalive_time=250 fs.file-max=1024000
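One thing worth flagging for anyone who finds this later: sysctl -w changes don’t survive a reboot. A sketch of persisting them, assuming a drop-in file (the 99-beam.conf name is just an example):

# /etc/sysctl.d/99-beam.conf (filename is an example)
net.ipv4.tcp_keepalive_time = 250
fs.file-max = 1024000

# Reload all sysctl configuration without rebooting:
sysctl --system

Also, the ~1000-connection ceiling sounds suspiciously like the default per-process open-file limit of 1024 (ulimit -n); fs.file-max raises the system-wide cap, but the per-process limit has to be raised separately.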
It looks like +K was removed in OTP 21, and +A may have less impact now with the introduction of dirty schedulers. +Q appears to default to 65536, so maybe that’s not needed anymore?
It would be wonderful if the VM settings I used weren’t needed anymore; figuring them out was miserable and confusing. There was also very little written about it, so there was a bunch of guessing and hoping on my part.
+sbwt none
+sbwtdcpu none
+sbwtdio none
I think the important parts were the ones above. Do you know much about those settings, @mattbaker? I’d love to know what I was changing; at the time, it felt like reciting a magical incantation.
These are the options I’m using as of last night:
+c true
+C multi_time_warp
+sub true
+swt very_low
+swtdio very_low
+sbwt none
+sbwtdcpu none
+sbwtdio none
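In case it helps other container users: these flags don’t have to live in vm.args; the runtime also reads them from the ERL_FLAGS environment variable at startup, which can be easier to vary per environment. A sketch (base image and values illustrative):

# Dockerfile fragment (base image illustrative)
FROM elixir:1.14
ENV ERL_FLAGS="+sbwt none +sbwtdcpu none +sbwtdio none"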
Is this exclusively for containers? Or universally?
For anyone coming to this thread in the future, here’s some of what @garazdawi said in the thread on EFS:
See his full response and the rest of the thread here:
It’s the VM putting a scheduler in busy wait so the CPU core doesn’t go to sleep. The problem in containerized environments, like AWS, is that it can eat into your CPU quotas.
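As I understand it, busy-wait time is not counted as active in the VM’s own scheduler wall time statistics, so comparing those against what the OS or cgroup reports can show how much of your apparent CPU usage is just busy waiting. A rough sketch of sampling utilization from IEx (the counters have to be enabled first):

# Sample per-scheduler utilization over one second
:erlang.system_flag(:scheduler_wall_time, true)
s0 = :erlang.statistics(:scheduler_wall_time) |> Enum.sort()
Process.sleep(1_000)
s1 = :erlang.statistics(:scheduler_wall_time) |> Enum.sort()

# Fraction of wall time each scheduler spent active during the sample
for {{id, a0, t0}, {id, a1, t1}} <- Enum.zip(s0, s1) do
  {id, Float.round((a1 - a0) / (t1 - t0), 3)}
end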
In my experience, the BEAM was not well optimized for containerized environments before OTP 23. For example, scheduler counts did not match the CPU limits in our k8s manifests. My team found the RabbitMQ Runtime Tuning guide, which helped us a lot. After OTP 23 came out, we noticed we didn’t need most of the tunings anymore, so we removed them.
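For anyone stuck on pre-23 releases: one workaround we’ve seen for the scheduler-count mismatch is pinning the scheduler count to the container’s CPU limit explicitly with +S. A sketch, assuming a 2-CPU limit (values illustrative):

## vm.args: +S Schedulers:SchedulersOnline, pinned to the container quota
+S 2:2

# Matching Kubernetes manifest fragment (illustrative)
resources:
  limits:
    cpu: "2"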