How to copy dependencies from outside Docker image build?

mikl · August 4, 2023, 1:37am

I have a multi-step GitLab CI pipeline, where I’m trying to avoid downloading Hex dependencies multiple times, but I’m struggling with the step where the Docker image is built.

In previous steps, I set MIX_HOME to a _mix folder inside the build structure, to be able to cache it with the GitLab CI cache and to pass it and the _build and deps folders around for use by separate lint and test steps. That works fine, but when I try to re-use said folders for the step that builds a Docker image, it doesn’t work as expected. The relevant parts of my Dockerfile:

FROM hexpm/elixir:1.15.4-erlang-26.0.2-debian-bookworm-20230612-slim AS build

# install build dependencies
RUN apt update -y && apt install -y build-essential curl git

# prepare build dir
WORKDIR /app

# set build ENV
ENV MIX_ENV=prod
ENV MIX_HOME=/app/_mix

# copy Mix dependencies from previous step
COPY _build _mix deps mix.exs mix.lock /app/
COPY config config

# install hex + rebar
RUN mix local.hex --force && \
  mix local.rebar --force

RUN mix deps.compile

COPY priv priv
COPY assets assets
COPY lib lib
RUN mix compile

However, despite copying _build, deps, and the MIX_HOME folder, when Docker reaches the mix compile step here, it errors:

Unchecked dependencies for environment prod:
* telemetry_metrics (Hex package)
  the dependency is not available, run "mix deps.get"
* phoenix_live_view (Hex package)
  the dependency is not available, run "mix deps.get"
[and so on for every package]

Shouldn’t it be finding the packages, since I’ve copied over the deps folder?

I’ve also tried RUN mix compile --no-deps-check instead, but that just throws different errors like module Ecto.Query is not loaded and could not be found. So Elixir is truly unable to find said modules.

It seems that copying the aforementioned folders into the Docker environment should work, so what am I missing here?

sergio-ocon · August 4, 2023, 7:34am

Have you tried more advanced Dockerfile syntax?

Slides 13 and on,
RUN --mount=type=cache
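
For reference, a cache mount might look roughly like the sketch below. It assumes the image is built with BuildKit and that Hex keeps its downloaded packages under HEX_HOME (set here to a hypothetical /hex path that is not in the original Dockerfile). The cache only helps if the same builder is reused between runs, which may not hold on ephemeral CI runners.

```dockerfile
# syntax=docker/dockerfile:1
FROM hexpm/elixir:1.15.4-erlang-26.0.2-debian-bookworm-20230612-slim AS build

RUN apt update -y && apt install -y build-essential curl git

WORKDIR /app
ENV MIX_ENV=prod
# Hypothetical location for Hex's package cache (assumption, not from the thread)
ENV HEX_HOME=/hex

RUN mix local.hex --force && \
  mix local.rebar --force

COPY mix.exs mix.lock ./
COPY config config

# The cache mount keeps downloaded packages between builds on the same builder;
# deps/ and _build/ are still written to the image layer as usual.
RUN --mount=type=cache,target=/hex \
  mix deps.get --only prod && \
  mix deps.compile
```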

D4no0 · August 4, 2023, 7:44am

This line is a bit suspicious:

COPY _build _mix deps mix.exs mix.lock /app/

Are you using GitLab’s CI cache? And if so, where is this cache populated in the first place?

I would suggest dropping the _build cache if you compiled these files anywhere else before. I also had this problem, and I ended up dropping the cache on “docker in docker” builds since I had some cryptic compile errors.
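
One possible reason the COPY line looks suspicious: when a source of COPY is a directory, Docker copies that directory's contents rather than the directory itself. Listing _build, _mix and deps together with /app/ as the destination would therefore spill their contents directly into /app/ instead of creating /app/deps and so on. A minimal sketch that gives each directory its own destination (untested against this project):

```dockerfile
# Plain files can share one COPY; they land in /app/ as-is
COPY mix.exs mix.lock /app/

# Each directory gets an explicit destination so the folder itself is preserved
COPY _build /app/_build
COPY _mix /app/_mix
COPY deps /app/deps
COPY config config
```

Even then, a _build compiled under a MIX_ENV other than prod would not be reused by the prod build, so mix deps.compile may still recompile everything.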

mikl · August 4, 2023, 8:30am

That might be faster in this case, but is there any reason why it shouldn’t work with plain old copy?

mikl · August 4, 2023, 8:38am

Yeah, I have a previous step looking like this:

.marsvin:
  image: hexpm/elixir:1.15.4-erlang-26.0.2-debian-bookworm-20230612-slim
  variables:
    MIX_ENV: "test"
    # To have hex and rebar included in cache/artifacts, put MIX_HOME inside
    # the build root.
    MIX_HOME: "${CI_PROJECT_DIR}/packages/marsvin/_mix"

marsvin_mix_install:
  extends: .marsvin
  stage: init
  needs: []
  dependencies: []
  cache:
    key: "marsvin_mix_install"
    paths:
      - packages/marsvin/_mix
      - packages/marsvin/_build
      - packages/marsvin/deps
  before_script:
    - cd packages/marsvin
    - mix local.hex --force --if-missing
    - mix local.rebar --force --if-missing
  script:
    - mix deps.get
    - mix deps.clean --unused
    - mix compile --warnings-as-errors
  artifacts:
    paths:
      - packages/marsvin/_mix
      - packages/marsvin/_build
      - packages/marsvin/deps
    expire_in: 7 days

The exported artifacts are then used in the build step, like this:

marsvin_build:
  extends: .marsvin
  stage: build
  needs:
    - "marsvin_mix_install"
    - "marsvin_mix_test"
  dependencies: ["marsvin_mix_install"]
  image: ${BUILDKIT_IMAGE}
  variables:
    APP_VERSION: "${APP_VERSION}"
    IMAGE_URL_INTERNAL: "${DOCKER_REGISTRY_INTERNAL}/${DOCKER_NS_INT}/marsvin"
  script:
    - cd packages/marsvin
    - ${BUILDKIT_SCRIPT}
      --tag=${IMAGE_URL_INTERNAL}:${IMAGE_VERSION}
      --cache-repo=${IMAGE_URL_INTERNAL}:cache
      --context=.
      --file=Dockerfile

Yeah, I only added that to try to solve the aforementioned problem; it does not appear to make a difference anyway.

sergio-ocon · August 4, 2023, 9:21am

This solves the external cache problem, but I don’t know why it is failing. Are you using the same user to run your container?

Why don’t you use a release for this? Then you can have a multistage Dockerfile and keep the running container lean: you don’t need Elixir there, just the libraries and the minimal requirements needed to run the release. Actually, the Dockerfile generated by mix phx.gen.release --docker uses that approach and works really well (and the final image is smaller and does not contain anything not needed, so it is more secure and faster to copy).

mikl · August 4, 2023, 2:28pm

True, it might be (much) easier to simply build the release outside of the Docker-in-Docker environment, and reduce the Dockerfile to just copy that and call it done. That neatly sidesteps the dependency problem. I’ll try that instead.

Edit: Building the release outside the Dockerfile works great. Thanks for the idea.
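
A rough sketch of what such a reduced Dockerfile could look like, assuming the CI job ran MIX_ENV=prod mix release beforehand on the same Debian base, so that the release sits in _build/prod/rel/marsvin. The release name marsvin, the runtime package list, and the base image tag are assumptions, not taken from the actual project.

```dockerfile
# Assumed runtime base: the Debian release the hexpm/elixir build image is based on
FROM debian:bookworm-20230612-slim

# Runtime libraries commonly needed by an Erlang release (assumed list)
RUN apt-get update -y && \
  apt-get install -y libstdc++6 openssl libncurses5 ca-certificates && \
  rm -rf /var/lib/apt/lists/*

ENV LANG=C.UTF-8

WORKDIR /app

# Release built outside Docker (hypothetical path and app name)
COPY _build/prod/rel/marsvin ./

CMD ["/app/bin/marsvin", "start"]
```

Since the release bundles the Erlang runtime, it has to be built on an OS and architecture compatible with the runtime image.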

w0rd-driven · August 4, 2023, 11:46pm

It’s probably not worth mentioning now but your cache key being static would mean every build that uses marsvin_mix_install would try to share the same cache. That’s every commit on every branch.

I use the variable $CI_COMMIT_REF_NAME which creates a cache for each branch. A good visual primer for understanding the various strategies is GitLab’s “A visual guide to GitLab CI/CD caching”. A single key may be fine if you don’t branch at all since I think later commits drop earlier jobs.
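
As a sketch, a per-branch key for the install job from earlier in the thread could look like this. The key prefix is just the existing static key reused, and only the cache section is shown:

```yaml
marsvin_mix_install:
  cache:
    # One cache per branch instead of a single cache shared by every pipeline
    key: "marsvin_mix_install-${CI_COMMIT_REF_NAME}"
    paths:
      - packages/marsvin/_mix
      - packages/marsvin/_build
      - packages/marsvin/deps
```

GitLab also provides $CI_COMMIT_REF_SLUG, which may be the safer choice when branch names contain slashes or other special characters.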

You can also do print debugging by adding RUN ls -al cache/directory to output contents in the logs. It’s hacky as hell and there may be better approaches like setting up a runner locally. You could also do stuff like that in the .gitlab-ci.yml script or before_script sections.
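
For the Dockerfile in this thread, such a debugging step could be sketched as follows. The paths are assumed from the WORKDIR /app in the original Dockerfile, and the line would go right after the COPY instructions:

```dockerfile
# Temporary debugging step: list what actually landed in the build context.
# "|| true" keeps the build going even if /app/deps does not exist.
RUN ls -al /app /app/deps || true
```

With BuildKit the output of RUN steps may be collapsed in the log; plain progress output (for example --progress=plain with docker build) usually makes it visible. Whether the build script used here accepts such a flag is not known.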

spaceCowboy · December 6, 2024, 7:28pm

Just reviewing your .gitlab-ci.yml. In marsvin_build you have both needs: and dependencies:. This might work just fine, but GitLab recommends not combining both in the same job (see the CI/CD YAML syntax reference), as one relies on the current stage and the other relies on the previous build stage. IDK if that changes anything, but thanks for sharing your scripts.
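
One way to express the same intent with needs: alone is its artifacts option. A sketch showing only the relevant part of the job, untested against this pipeline:

```yaml
marsvin_build:
  needs:
    # Download artifacts from the install job only
    - job: "marsvin_mix_install"
      artifacts: true
    # Wait for the tests to pass, but skip downloading their artifacts
    - job: "marsvin_mix_test"
      artifacts: false
```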