Newxan
August 16, 2023, 8:18am
Background
Like many others, we’re currently using Alpine images for our deployments. Specifically, at the moment we rely on `hexpm/elixir:1.15.4-erlang-26.0.2-alpine-3.17.4`.
While looking into an unrelated issue we’re currently having, I stumbled upon Overrun stack and heap OTP-26.0 · Issue #7292 · erlang/otp · GitHub, which contained this report:
**Describe the bug**
Application down few seconds after the run release in AWS … eks
**To Reproduce**
Unfortunately I have no idea how to reproduce.
**Affected versions**
26.0
**Additional context**
AWS EKS
erlang 26.0
alpine 3.18.0
Application logs
```
hend=0x00007f4f6f34c7f0
stop=0x00007f4f6f34c670
htop=0x00007f4f6f34c678
heap=0x00007f4f6f349600
beam/erl_gc.c, line 735: <0.3141.0>: Overrun stack and heap
```
This made me curious, and since I didn’t want to pollute that issue with unrelated comments, I thought I would ask my question here instead.
Looking around the forum and elsewhere, I can’t find any information stating that performance differs depending on whether your system uses GNU libc or musl, and since Alpine images are very popular and rely on musl by default, I felt it warranted a post.
Question
Is there any performance difference between using an image such as Alpine, which relies on musl, and using, for example, an Ubuntu image with glibc, for production deployments? And is any such difference JIT-related?
Thank you in advance.
Yes, there’s a significant loss in performance with musl. I made a PR for OTP 27 that will remove these differences; the comment describes things in a bit more detail:
erlang:master ← jhogberg:john/jit/refactor-unix-sigaltstack/OTP-18568
Erlang code compiled to x86 native code uses `RSP` as its stack pointer. This improves performance in several ways:
- It permits the use of the x86 `CALL` and `RET` instructions, which reduces code volume and improves branch prediction.
- It avoids stealing a callee-save register to act as a stack pointer.
Unix signal handlers are by default delivered onto the current stack, i.e. `RSP`. This is a problem since our native-code stacks are small and may not have room for the Unix signal handler.
There is a way to redirect signal handlers to an "alternate" signal stack by using the `SA_ONSTACK` flag with the `sigaction(2)` system call. Unfortunately this has to be specified explicitly for each signal, and it is impossible to enforce given the presence of libraries.
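To make that mechanism concrete, here is a minimal, self-contained C sketch (an illustration, not BEAM source) of redirecting a handler onto an alternate stack: reserve a stack with `sigaltstack(2)`, then install the handler with the `SA_ONSTACK` flag:

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void on_sigusr1(int sig) {
    (void)sig;
    /* Runs on the alternate stack rather than the (possibly tiny)
     * stack that RSP currently points at. write(2) is async-signal-safe. */
    const char msg[] = "handled on the alternate stack\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
}

int main(void) {
    stack_t ss;
    struct sigaction sa;

    /* Reserve a dedicated stack for signal delivery. */
    ss.ss_sp = malloc(SIGSTKSZ);
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) == -1) {
        perror("sigaltstack");
        return 1;
    }

    /* SA_ONSTACK is the crucial flag: without it the handler is
     * delivered on the current stack. */
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_sigusr1;
    sa.sa_flags = SA_ONSTACK;
    if (sigaction(SIGUSR1, &sa, NULL) == -1) {
        perror("sigaction");
        return 1;
    }

    raise(SIGUSR1);
    free(ss.ss_sp);
    return 0;
}
```

As the PR text notes, the flag has to be set on every individual `sigaction(2)` call, which is why a library installing its own handlers can silently undo it.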
We used to attempt to override the C library's signal handler setup procedure with our own that added the `SA_ONSTACK` flag, but it only worked with `GNU libc`, which is not always the libc in use. As many of our users liked to run Docker images based on `Alpine`, which uses `musl` instead, they got needlessly bad performance without knowing it.
Instead, we now explicitly add `SA_ONSTACK` to our own uses of `sigaction(2)` and ignore the library problem altogether because:
1. We don't care about this problem on non-scheduler threads: if a library wants to fiddle around with signals on its own threads then it doesn't affect us.
2. We don't care about this problem when executing on the runtime stack: if a NIF or driver uses signals in a creative manner locally during a call, then that's fine as long as they restore them before returning to Erlang code.
A NIF or driver that doesn't do this is misbehaving to begin with and we can't shield ourselves against that.
3. If a library that we're statically linked to messes around with signals in the initialization phase (think C++ constructors of static objects), all of it will happen before `main` runs and we'll set things straight in `sys_init_signal_stack`.
If a dynamically linked library does the same, the same restrictions as ordinary NIF/driver calls apply to the initialization phase and the library must restore the signals before returning.
If any threads are created in either of these phases, they're still not scheduler threads so we don't have to care then either.
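Point 3 above mentions setting things straight in `sys_init_signal_stack`. Purely as an illustration of that fix-up step (hypothetical helper names and signal list, not the actual OTP code), the idea is to re-read each handler the runtime cares about after static initializers have run, and re-install it with `SA_ONSTACK` added:

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical sketch: re-register an already-installed handler with
 * SA_ONSTACK set, leaving default/ignored dispositions untouched. */
static void fixup_signal(int sig) {
    struct sigaction sa;

    if (sigaction(sig, NULL, &sa) != 0)
        return; /* can't query it; leave it alone */

    if (sa.sa_handler == SIG_DFL || sa.sa_handler == SIG_IGN)
        return; /* no handler installed, nothing to fix */

    if (!(sa.sa_flags & SA_ONSTACK)) {
        sa.sa_flags |= SA_ONSTACK;
        sigaction(sig, &sa, NULL);
    }
}

/* Called once from the main thread, before any scheduler threads
 * (and thus any Erlang code) exist. */
void fixup_signals_before_schedulers(void) {
    static const int sigs[] = { SIGSEGV, SIGBUS, SIGFPE, SIGILL, SIGUSR1 };
    for (size_t i = 0; i < sizeof(sigs) / sizeof(sigs[0]); i++)
        fixup_signal(sigs[i]);
}
```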