Mix deps.get memory explosion when doing cross-platform Docker build

I’m trying to do a cross-platform build for a Phoenix app, so on my ARM Mac, I’m running

docker buildx build --platform linux/amd64 [more, less relevant options] .

The relevant part of the Dockerfile contains:

FROM hexpm/elixir:1.15.2-erlang-26.0.2-ubuntu-focal-20230126 AS build-stage

# install build dependencies
RUN apt-get update -y && apt-get install -y build-essential git \
    && apt-get clean && rm -f /var/lib/apt/lists/*_*

# prepare build dir
WORKDIR /app

# install hex + rebar
RUN mix local.hex --force && \
    mix local.rebar --force

# set build ENV
ENV MIX_ENV="prod"

# install mix dependencies
COPY mix.exs mix.lock ./
RUN mix deps.get --only $MIX_ENV

When it reaches the RUN mix deps.get line, it blows up after a brief moment with BEAM running out of memory:

 => ERROR [build-stage  6/17] RUN mix deps.get --only prod                                                                             3.5s
------
 > [build-stage  6/17] RUN mix deps.get --only prod:
3.040 no next heap size found: 2305842975251512930, offset 0
3.045
3.045 Crash dump is being written to: erl_crash.dump...done
3.483 qemu: uncaught target signal 6 (Aborted) - core dumped
3.493 Aborted
------
Dockerfile:22
--------------------
  20 |     # install mix dependencies
  21 |     COPY mix.exs mix.lock ./
  22 | >>> RUN mix deps.get --only $MIX_ENV
  23 |     RUN mkdir config
  24 |
--------------------

Things I’ve already tried:

  1. Omitting --platform linux/amd64 makes the memory explosion go away. But then the binaries will be built for AArch64, which won’t run on an x64 Linux server.
  2. Skipping --only prod makes no difference; getting the deps still triggers the memory explosion.

I understand this is not a common thing to do, but if someone has an idea on how to solve this, I’d be grateful.

Hi @mikl! I was reading a thread a while back that may help you:

Could you please try that out and report back? Thanks!

Wow, that’s quite the rabbit hole.

I’m very happy to report that adding ENV ERL_FLAGS="+JPperf true" to the build stage of my Dockerfile fixes this.

Thanks, @pdgonzalez872.

(To anyone else needing this workaround: if you consult the Erlang docs for that flag, it is indeed recommended as a workaround for QEMU-specific problems. It is not something you should have set on your server.)
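
For anyone who wants to see where the flag goes, here is a minimal sketch based on the Dockerfile from the original post. Placing the ENV line right after FROM is an assumption on my part; the point is only that it has to be set before the mix command that crashes.

```dockerfile
FROM hexpm/elixir:1.15.2-erlang-26.0.2-ubuntu-focal-20230126 AS build-stage

# Workaround for the BEAM crash under QEMU emulation (cross-platform build).
# Assumption: set it before the first mix command so that every BEAM
# started in this stage picks it up.
ENV ERL_FLAGS="+JPperf true"

# prepare build dir
WORKDIR /app

# install hex + rebar
RUN mix local.hex --force && \
    mix local.rebar --force

# set build ENV
ENV MIX_ENV="prod"

# install mix dependencies
COPY mix.exs mix.lock ./
RUN mix deps.get --only $MIX_ENV
```

If the final image is a separate stage with its own FROM line, an ENV set in build-stage is not carried over to it, so the flag stays out of the image that runs on the server.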

This is most probably due to the fact that QEMU has not been supported by OTP for quite a while now because of the JIT optimisation. For now we can call it a bleeding-edge feature that will most probably be fixed out of the box soon.

Excellent! ❤️

I added the +JPperf true flag when building the Elixir image for the arm64 architecture, and the image size increased significantly. I deleted the jit-*.dump and perf-*.map files in /tmp to reduce the size. This seems to happen only when building Elixir itself, not when building Elixir apps.
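
A rough sketch of that cleanup, in case it helps. The `make` command is only a placeholder for whatever step compiles Elixir in your image, not my exact build; the file patterns are the ones mentioned above.

```dockerfile
# Workaround flag for building under QEMU emulation.
ENV ERL_FLAGS="+JPperf true"

# Hypothetical build step: replace `make` with your actual build command.
# The leftover files are removed in the same RUN instruction, because files
# deleted in a later instruction still take up space in the earlier layer.
RUN make \
    && rm -f /tmp/jit-*.dump /tmp/perf-*.map
```

Using rm -f means the step does not fail when no such files were written, which matches what I saw when building regular Elixir apps.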