Elixir / Erlang Docker containers: RAM usage on different OSes / kernels

On my development VM (specs below), starting an Elixir / Erlang container takes up 2 GB of physical memory, while running iex locally takes only a few MB.
I only started an empty container, with no modules loaded and no code executed.
When more containers are started than the host has memory for, BEAM VMs already running in other containers crash so that the new instance can start.

[Image: docker running with htop and system info (dev-vm)]

[Image: observer_cli showing the memory usage]

While memory usage is highly alarming on its own, other docker hosts are able to start the container with normal / reasonable memory consumption.

[Image: production server running an empty elixir container with normal memory usage]

(Both instances are running the same command: docker run -it elixir)

Other users could reproduce the problematic memory usage on Ubuntu Desktop (kernel v5.15), but Ubuntu Server (kernel v6.4) is supposedly running just fine.
(Both running docker 24.0.2)

So far I tried:

  • running different versions of the image elixir:slim, elixir:alpine, …
  • limiting memory usage on docker → container is killed with exit code 137 (out of memory);
  • checking for swap → swap is disabled on both devices;
  • running different versions of docker;

Any ideas on this problem would be appreciated, since I’m unable to narrow it down to the BEAM / Docker / OS / kernel level.

Kind Regards, Jannus

I’d be curious how the hexpm/elixir images fare. IMO they’re much more useful than the elixir ones.

Is this true for a downgraded version of OTP, let’s say OTP-25?

Just a quick test:

docker run --name elixirhexpm --rm -it hexpm/elixir:1.15.4-erlang-26.0.2-debian-buster-20230612-slim

Memory usage is 5MB; then I ran iex in the container. Memory usage is 2GB.

hexpm/elixir:1.15.4-erlang-25.3.2.4-debian-buster-20230612-slim (so OTP 25) has the exact same issue, but funnily enough the memory usage is 1.5GB instead of 2.0GB.

I posted some screenshots as well in the (closed, duplicate) topic: Elixir docker container very high memory usage

I ran this on an ARM server instance and it uses less than 5 MB. Maybe there is a problem with the hardware?

I think with only 5MB you forgot to run iex :).

As said: I cannot reproduce it everywhere. My desktop has this issue, but my linux vm/vps server cannot reproduce it. So there is definitely something else (docker, containerd, linux kernel, ??) that affects this issue. I have only tested on amd64.

But @luechtdev obviously has the same issue on some of his linux systems.

I’m not enough of a Linux expert to know how to measure memory usage correctly; however, on my 8GB instance it is using 1.6% with IEx open, which should be roughly 128 MB.

I check the memory usage with docker stats.

Anyway, when limiting the memory to 500M (which should be plenty for just an iex session) the container immediately crashes:

From dmesg on the host:

[49050.983989] Tasks state (memory values in pages):
[49050.983990] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[49050.983991] [  19403]     0 19403     1405        0    49152       96             0 bash
[49050.983993] [  19431]     0 19431   821098   127079  2166784   127776             0 beam.smp
[49050.983995] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=docker-14bbfbecdada29e0e959b20b6d5e273609fcbb7d3440959d143a41b4ecebccdc.scope,mems_allowed=0,oom_memcg=/system.slice/docker-14bbfbecdada29e0e959b20b6d5e273609fcbb7d3440959d143a41b4ecebccdc.scope,task_memcg=/system.slice/docker-14bbfbecdada29e0e959b20b6d5e273609fcbb7d3440959d143a41b4ecebccdc.scope,task=beam.smp,pid=19431,uid=0
[49050.984007] Memory cgroup out of memory: Killed process 19431 (beam.smp) total-vm:3284392kB, anon-rss:508188kB, file-rss:128kB, shmem-rss:0kB, UID:0 pgtables:2116kB oom_score_adj:0
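
A note on reading the log above: the total_vm and rss columns of the task table are in pages, while the final kill line is in kB. A minimal sketch of the conversion, assuming 4 KiB pages (an assumption, although the numbers in this log are consistent with it). pages_to_kb is a hypothetical helper name:

```shell
#!/bin/sh
# Convert a page count from the OOM task table into kB.
# PAGE_KB=4 is an assumption (4 KiB pages); check with `getconf PAGESIZE`.
PAGE_KB=4

pages_to_kb() {
  echo $(( $1 * PAGE_KB ))
}

pages_to_kb 821098   # total_vm of beam.smp; compare with total-vm in the kill line
pages_to_kb 127079   # rss of beam.smp; compare with anon-rss + file-rss
```

With these inputs the helper prints 3284392 and 508316, which matches total-vm:3284392kB and anon-rss plus file-rss (508188 + 128). So roughly 496 MiB was resident when the 500M cgroup limit was hit, while about 3.1 GiB was only mapped.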

Here you go:


CONTAINER ID   NAME                          CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
8fda87bc816b   elixirhexpm                   0.00%     52.47MiB / 500MiB     10.49%    946B / 0B         0B / 0B           22

I created a new (arm) VM with Fedora 38 and can reproduce it there as well (Fedora 38 seems to be one of the distros with a new enough kernel/containerd/etc. to reproduce).

docker run --name elixirhexpm --rm -it hexpm/elixir:1.15.4-erlang-26.0.2-debian-buster-20230612-slim iex

CONTAINER ID   NAME          CPU %     MEM USAGE / LIMIT    MEM %     NET I/O       BLOCK I/O   PIDS
b9f0d3e19106   elixirhexpm   0.00%     2.04GiB / 3.712GiB   54.95%    1.36kB / 0B   0B / 0B     21
[root@fedora-4gb-fsn1-2 ~]# uname -a
Linux fedora-4gb-fsn1-2 6.2.15-300.fc38.aarch64 #1 SMP PREEMPT_DYNAMIC Thu May 11 16:54:06 UTC 2023 aarch64 GNU/Linux
[root@fedora-4gb-fsn1-2 ~]# docker version
Client: Docker Engine - Community
 Version:           24.0.5
 API version:       1.43
 Go version:        go1.20.6
 Git commit:        ced0996
 Built:             Fri Jul 21 20:37:12 2023
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.5
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.6
  Git commit:       a61e2b4
  Built:            Fri Jul 21 20:35:57 2023
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.21
  GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
 runc:
  Version:          1.1.7
  GitCommit:        v1.1.7-0-g860f061
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Yeah I’m running ubuntu:

Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-72-generic aarch64)

Yup, I can reproduce on Manjaro (kernel 6.3.*) with both Erlang 26 and 25:

# Erlang 26
CONTAINER ID   NAME          CPU %     MEM USAGE / LIMIT     MEM %     NET I/O         BLOCK I/O         PIDS
00fcdb049774   elixirhexpm   0.14%     2.043GiB / 31.18GiB   6.55%     8.23kB / 0B     0B / 0B           26
# Erlang 25
CONTAINER ID   NAME          CPU %     MEM USAGE / LIMIT     MEM %     NET I/O          BLOCK I/O         PIDS
84462b9c494b   elixirhexpm   0.15%     1.538GiB / 31.18GiB   4.93%     3.73kB / 0B      0B / 0B           26

However, on my Mac it isn’t happening:

# Erlang 26
CONTAINER ID   NAME          CPU %     MEM USAGE / LIMIT     MEM %     NET I/O       BLOCK I/O    PIDS
5ece3af3e231   elixirhexpm   0.00%     62.81MiB / 23.48GiB   0.26%     1.11kB / 0B   283kB / 0B   58
# Erlang 25
CONTAINER ID   NAME          CPU %     MEM USAGE / LIMIT     MEM %     NET I/O     BLOCK I/O   PIDS
b4d64a258d1d   elixirhexpm   0.08%     57.21MiB / 23.48GiB   0.24%     806B / 0B   0B / 0B     58

My Docker-fu is pretty rusty, but I would guess this has something to do with how memory is managed or pre-reserved, or something similar. I’m not certain, though.

Erlang on most Unixes allocates a lot of virtual memory at startup, but only uses a small amount of it. If your OS physically reserves all virtual memory (or limits memory usage by looking at virtual memory) you can change how much is reserved by Erlang through the +MIscs switch.

I wonder if there’s a way to make Docker report the right amount of memory.

I tried with the “old” ubuntu that reports a low (50MB) memory usage: I can start iex with --memory 100m in the docker command without issues.

On the newer systems (with a newer kernel/containerd version) I cannot start the container with iex with --memory 750m; it will be OOM killed.

The problem seems to be the default port limit visible under runtime_info/0.
In some cases (those with excess memory usage) the ERL_MAX_PORTS value is set to 134,217,727 (the maximum allowed value) instead of the default 1,024.

docker run -it -e ERL_MAX_PORTS=1024 elixir fixes the problem – in my case – but it still is just a workaround and doesn’t really explain this behavior.
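
One way to check whether the workaround took effect is to print the port limit the BEAM actually chose. A minimal sketch, assuming the official elixir image and a working Docker daemon. port_limit is a hypothetical helper name; :erlang.system_info(:port_limit) is the runtime call used to read the value:

```shell
#!/bin/sh
# Print the BEAM's port limit inside a throwaway container.
# Any arguments are passed through to `docker run`
# (for example: -e ERL_MAX_PORTS=1024).
port_limit() {
  docker run --rm "$@" elixir \
    elixir -e 'IO.inspect(:erlang.system_info(:port_limit))'
}

# Usage (requires Docker, so left commented out):
#   port_limit                         # default chosen by the runtime
#   port_limit -e ERL_MAX_PORTS=1024   # with the workaround from this post
```

Comparing the two numbers on an affected and an unaffected host should show whether the port limit is what differs between them.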

The default ERL_MAX_PORTS is taken from the sysconf value of OPEN_MAX. So if you have that set very high on your system, the port table will use a lot of memory.

You can get your value (on Linux) like this:

$ getconf OPEN_MAX
1024

This behaviour is described in the docs for erl +Q.
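
To compare a host with a container, it may help to print the same limit in both places. A minimal sketch, assuming a shell whose ulimit builtin supports -S and -H (bash and dash do). show_nofile is a hypothetical helper name:

```shell
#!/bin/sh
# Print the soft and hard limits on open file descriptors for this shell.
# The soft limit is what `getconf OPEN_MAX` normally reports.
show_nofile() {
  echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

show_nofile

# The same check inside a container (requires Docker, so left commented out):
#   docker run --rm elixir sh -c 'ulimit -Sn; ulimit -Hn'
```

A value of unlimited, or a very large number, inside the container but not on the host would point at the container runtime rather than at Erlang.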

Thank you very much for this clarification and the link to the Erlang documentation!

It didn’t immediately click for me that the default for ERL_MAX_PORTS follows the file descriptor limit (ulimit -n).

With this information we know how to fix the issue, but it did not explain why it only happens on some systems and only when running elixir/erlang in a container. Then I stumbled upon this pull request for containerd: https://github.com/containerd/containerd/pull/7566

From this discussion it’s clear that at some point/version the file descriptor limit for containerd changed from a real limit (1_048_576) to unlimited (something also changed with systemd). With unlimited, the mentioned +Q takes its maximum allowed value (134_217_727).

From that discussion it’s clear that having unlimited number of file descriptors is affecting other software as well.

In my opinion the Erlang numbers are reasonable enough (even with the maximum number it will still work fine, just use 1GB of memory), but I hope containerd will revert to a “normal” limit. As shown here, for most people it’s not immediately obvious what causes this excessive memory usage or how to resolve it. If your software needs (very) high limits of file descriptors, you’ll know and have the knowledge to manually override the settings/flags for it.
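
If the cause is the container’s file descriptor limit, then capping that limit for a single container is another possible workaround besides ERL_MAX_PORTS. A sketch, assuming Docker’s --ulimit nofile=<soft>:<hard> option. The value 1024 is only an illustrative choice taken from the getconf output earlier in this thread, and iex_capped_cmd is a hypothetical helper that prints the command rather than running it:

```shell
#!/bin/sh
# Illustrative limit; software that opens many sockets or files needs more.
NOFILE="${NOFILE:-1024}"

# Print a docker command that starts iex with a capped fd limit,
# so it can be reviewed before it is run.
iex_capped_cmd() {
  echo "docker run --rm -it --ulimit nofile=${NOFILE}:${NOFILE} elixir iex"
}

iex_capped_cmd

# To actually run it (requires Docker):
#   eval "$(iex_capped_cmd)"
```

Unlike ERL_MAX_PORTS, this lowers the limit for every process in the container, which may also help other software that sizes tables from the fd limit.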

Thank you all, I’ve been losing my mind trying to figure out why my app suddenly needed 2GB of RAM when it wasn’t even close before. Turns out my Docker version got upgraded and I was suddenly hitting this issue. Setting ERL_MAX_PORTS fixed it!
