I came across this article about memory bloat in ruby. https://www.joyfulbikeshedding.com/blog/2019-03-14-what-causes-ruby-memory-bloat.html
I was wondering, does Erlang have the same problem and it is less noticeable or does Erlang avoid this somehow?
EDIT: Adding a bit of summary of the article because it’s rather long.
Ruby memory bloat has historically been claimed to be due to memory fragmentation: there is free space available, but the objects that need to be allocated won’t fit in any of those slots, so more memory is allocated for them. So, suggestions have been to use a smarter allocator, like jemalloc, or to set MALLOC_ARENA_MAX=2 (the default is 8 times the number of virtual CPUs).
The author noticed through experimentation that memory fragmentation wasn’t the problem. Instead, the problem seemed related to having multiple concurrent allocators: each allocator tries not to step on the others’ toes by using its own distinct space. MALLOC_ARENA_MAX=2 addresses this by ensuring there are fewer concurrent allocators. However, the author found another way to use less memory: having the Ruby VM explicitly return free memory to the OS.
I hope others with a deeper understanding can provide more details, but at a high level, the main differences with the BEAM are:
Each process has its own separate heap. This means less data for any particular garbage collector to have to deal with. It also means for short lived processes, there may be no need for any GC at all because the whole memory allocated to the process is freed in a single block when it completes.
Persistent data structures. These immutable data structures share parts of themselves with new “copies” that are derived from them. This means new allocations are often smaller diffs, resulting in less danger of fragmentation.
Disclaimer: I didn’t read the article
One of the most common causes of memory bloat and memory leaks on the BEAM is binaries, in particular when you keep only part of a larger binary. This is usually paired with long-lived processes.
When your processes are short-lived, the BEAM will GC any allocated memory, including stray binaries.
The BEAM is also greedy. Once it allocates a chunk of memory, it rarely releases it back to the OS, since it manages the memory internally. As a result, a spike in memory usage will not drop down dramatically post-spike.
To fight fragmentation, the BEAM uses its own memory allocation framework instead of relying directly on the system allocator.
@garazdawi gave a talk about memory management at Erlang Factory SF 2014.
For a lot of details, see erts_alloc.
OTP 22 will attempt to return memory to the OS:
OTP-15075 Application(s): erts
The emulator will now mark free blocks in pooled carriers with madvise(2) + MADV_FREE (or similar), letting the OS reclaim the associated physical memory.
UPDATE: There is also a blog post about Memory instrumentation in OTP 21.
That is great to hear, thank you for the insight and the great links!
I always considered it one of Ruby’s downfalls that it doesn’t return memory to the OS. That way a heavy monthly job could bloat your Sidekiq memory usage by ~1 gigabyte or so until you restarted the process, and all kinds of people would wonder if you had a “memory leak”. No, it’s just bloat.
I understand that the runtime thinks “oh, I needed this much memory once, so it’s likely that I’ll need it again”, but a lot of the time that’s detrimental. I had never noticed it with the BEAM and was surprised to find out that it also doesn’t give memory back that often.
Happy to see that’ll change. Looking forward to OTP 22 a lot now!
Just to be clear: Erlang has always returned memory to the OS (using munmap or free). What will happen in OTP 22 is that mapped but currently unused memory will also be made available to the OS.