We haven’t been able to start using Erlang 25 yet because I’ve been unable to get our base images built with it. Half of our developers use Macs with the new ARM processors, so we’ve switched to a process that builds our images with multi-arch support.
I’ve created a simplified version of that process as an example of how we build it using GitHub Actions, here:
You can see here where this problem happens with all supported versions (from Docker) when using Erlang 25:
I stopped it after 12.5 minutes.
On an Ubuntu machine I can also reproduce the hang:
docker buildx build --platform linux/arm64,linux/amd64 -t test-arm64-qemu-issue --build-arg ELIXIR_VERSION=1.13.4 --build-arg ERLANG_VERSION=25.0 --build-arg ALPINE_VERSION=3.16.0 .
Has anyone had any luck with this?
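One way to narrow this down is to build each platform on its own with a time limit, so a hang can be attributed to a single architecture instead of being stopped by hand. The following is only a sketch: the helper names run_with_limit and build_platform and the BUILD_TIMEOUT variable are made up for illustration, the 900-second default is arbitrary, and it assumes GNU timeout is installed. The build arguments are copied from the command above.

```shell
# Run a command with a wall-clock limit in seconds.
# GNU timeout exits with status 124 when the limit is reached.
run_with_limit() {
  limit="$1"
  shift
  timeout "$limit" "$@"
}

# Build a single platform so a hang can be pinned on one architecture.
# BUILD_TIMEOUT is a hypothetical variable for this sketch (default 900 s).
build_platform() {
  run_with_limit "${BUILD_TIMEOUT:-900}" \
    docker buildx build --platform "$1" -t test-arm64-qemu-issue \
    --build-arg ELIXIR_VERSION=1.13.4 \
    --build-arg ERLANG_VERSION=25.0 \
    --build-arg ALPINE_VERSION=3.16.0 .
}
```

Calling build_platform linux/amd64 and then build_platform linux/arm64 and comparing the exit codes should show which target stalls; an exit status of 124 means the time limit was hit.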
I don’t know if this is your problem, but because of bugs in qemu when emulating arm64 the JIT does not work there, so you need to run the arm64 docker images on a native arm64 machine, or disable the JIT (which is done when compiling Erlang).
It should manifest as a segfault and not as a hang, but maybe something is masking the segfault.
Edit: You can find some more information here: OTP 25.0-rc3 (release Candidate 3) is released - #25 by jhogberg - Erlang News - Erlang Programming Language Forum - Erlang Forums
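To make the two suggestions above a bit more concrete, here is a rough shell sketch. It assumes an erl binary on the PATH and an unpacked OTP source tree; the function names and the source-directory argument are placeholders for illustration. erlang:system_info(emu_flavor) is available from OTP 24 and reports jit or emu, and --disable-jit is the configure switch for building without the JIT.

```shell
# Print the emulator flavor of the Erlang runtime on PATH: "jit" or "emu".
# erlang:system_info(emu_flavor) exists from OTP 24 onward.
emu_flavor() {
  erl -noshell -eval 'io:format("~s~n", [erlang:system_info(emu_flavor)]), halt().'
}

# Build OTP from an unpacked source tree with the JIT turned off.
# The source directory is passed in by the caller (placeholder in this sketch).
build_otp_without_jit() {
  src_dir="$1"
  (cd "$src_dir" && ./configure --disable-jit && make)
}
```

Running emu_flavor inside the image that hangs or crashes would show whether the JIT is in play at all; if it prints emu, the JIT is not involved and the cause lies elsewhere.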
@garazdawi thank you for your work! I wanted to ask whether you have heard of any attempts at using the --platform flag in Dockerfiles (FROM --platform=linux/amd64 python:3.7-alpine, link to a suggestion). I’ll experiment and report back, but wondered if you knew of any successes using it. Thanks!
@garazdawi sorry, I had not understood the problem before. Turns out that you can still build those images on the arm platform and they will work as expected. The broken feature is trying to run images built on other non-arm architectures on the arm one. Much like beam releases that should be built on machines that will run them.
Thanks again for all that you do!
Did you ever resolve your problem?
Not quite, we’ve just stuck with 24 for the time being.
I had this problem on my M1. I have a solution that is not ideal but works, in case it helps someone:
- I have activated Rosetta
- I have added export DOCKER_DEFAULT_PLATFORM=linux/amd64 to my .zshrc file (a .env file should also work).
- For the rare images that don’t work well in emulated amd64, I use the arm64 platform. So as not to bother my colleagues by hard-coding arm64 in the Dockerfile, I use Docker’s BUILDPLATFORM variable in the FROM line:
FROM --platform=$BUILDPLATFORM hexpm/elixir:1.14.3-erlang-25.2.2-ubuntu-jammy-20221130
This is the only image where I need to do that. So all my other images (Postgres, Redis, etc) stay in amd64.
I hope this trick will not be required for too long.
The QEMU bug for the OTP 25 JIT compiler failing was supposedly fixed in qemu 8.1. So maybe the problem we’re seeing with docker cross-builds is a different problem?
I’m running a docker cross-build targeting linux/arm64 from a linux/amd64 host using qemu v8.1.2 and I see a similar problem to the OP: with OTP 24 it builds fine, but with OTP 25 or 26 it hangs forever at a RUN mix deps.get step. Note: for me, the mix local.rebar and mix local.hex steps complete successfully.
Are the only solutions still what @garazdawi recommends? 1) build arm64 images on a native arm64 host, or 2) use an Erlang build with the JIT disabled?
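For option 1, a build script can at least detect when it is about to cross-build under emulation. This is a minimal sketch, assuming uname -m reports x86_64 or aarch64/arm64 on the hosts in question; host_platform and needs_emulation are hypothetical helper names, not part of Docker.

```shell
# Map a machine name (default: `uname -m`) to a Docker platform string.
host_platform() {
  machine="${1:-$(uname -m)}"
  case "$machine" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *)             echo "unknown" ;;
  esac
}

# Succeed when building TARGET on this host would need emulation.
# Usage: needs_emulation TARGET [MACHINE]
needs_emulation() {
  target="$1"
  machine="${2:-}"
  [ "$(host_platform "$machine")" != "$target" ]
}
```

For example, needs_emulation linux/arm64 && echo "arm64 build would run under QEMU" prints the warning on an amd64 host and stays silent on a native arm64 one.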
I was having trouble building an amd64 docker image on my MacBook Pro (arm64).
I was about to give up until I tried https://orbstack.dev/ instead of Docker Desktop.
Now it’s working. Maybe orb is using a newer qemu.
There is a new open issue for QEMU: Segmentation fault when compiling elixir app on qemu aarch64 on x86_64 host (#1953) · Issues · QEMU / QEMU · GitLab. Maybe it will gain some traction, but they say it’s hard to diagnose and work on in a container.