Docker build started failing today!?

I’ve worked on many Phoenix-based projects over the last few years, including the transition from mix phx.server (in production) to OTP releases, first with Distillery and then with mix release. Starting last night, for a reason I cannot explain, splitting mix deps.get and mix compile across different intermediate Docker layers no longer works:

COPY mix.* ./
RUN mix deps.get --only ${MIX_ENV}

COPY . .
RUN mix compile

The build fails with the following errors:

Unchecked dependencies for environment prod:
* parse_trans (Hex package)
  lock mismatch: the dependency is out of date. To fetch locked version run "mix deps.get"
* mimerl (Hex package)
  lock mismatch: the dependency is out of date. To fetch locked version run "mix deps.get"

…

* phoenix_ecto (Hex package)
  lock mismatch: the dependency is out of date. To fetch locked version run "mix deps.get"
** (Mix) Can't continue due to errors on dependencies
The command '/bin/sh -c mix compile' returned a non-zero code: 1

Moving RUN mix deps.get after COPY . . works, but it defeats the purpose of isolating dependencies in their own cached layer… I tried on both macOS and Ubuntu 18.04 with the same result :frowning:

We have been using this setup for as long as I can remember for every Ruby/Node/Elixir project we package as a Docker image. Take a look at the elixir-boilerplate project we maintain to start every web project at Mirego.

There was an update to Hex last night. Before the update there were some other errors around parsing the old lock format. I saw errors similar to yours after the update, but a rerun resolved them.

I’m pretty sure it has to do with the docker caching, so you may need to break the cache.

+1 on cache clearing. Spent some time battling similar errors on CI. Removing the cache fixed the problem.

I had to run mix local.hex to update Hex locally, then mix deps.get to update the lock file to the “fixed” format. After that, my Docker build started working again!

> mix local.hex
Found existing entry: /Users/gcauchon/.asdf/installs/elixir/1.9.4-otp-22/.mix/archives/hex-0.20.1
Are you sure you want to replace it with "https://repo.hex.pm/installs/1.9.0/hex-0.20.4.ez"? [Yn] y
* creating /Users/gcauchon/.asdf/installs/elixir/1.9.4-otp-22/.mix/archives/hex-0.20.4

For future reference, for anyone who comes across the same issue, here are the related issues:

  1. https://github.com/hexpm/hex/issues/744
  2. https://github.com/hexpm/hex/issues/748

If anyone is having Hex+Docker-related failures on CircleCI in particular, note that their layer caching is non-deterministic and can’t be intentionally cache-busted, so you’ll have to turn it off altogether for about a week:

https://circleci.com/docs/2.0/docker-layer-caching/#how-dlc-works

This happens when you have a manifest file that is newer than your mix.lock file. Running mix deps.get updates both the manifest and the lock file inside the image, but your Dockerfile’s COPY command then reintroduces the old lock file. I would recommend against doing COPY . . in your Dockerfile; instead, copy only what is needed for the next build step.

I will publish a new release of Hex that mitigates this issue.

Which manifest file is this?

I haven’t deployed anything yet, but the lock file issue usually happens to me in development (because the volume-mounted mix.lock is the old one). Running docker-compose run web mix deps.get fixes it almost instantly. That gets the newly built lock file from the running container back to the Docker host.

But I’m curious how that would happen in a CI or production environment because typically you wouldn’t be using a volume mount in those cases.

It’s an internal Hex manifest that we write to deps/$DEP/.hex with information about the fetched dependency. We diff the information in the manifest file against the lock file to determine whether the local dependency is outdated and needs to be fetched again to match the lock.

In this case we marked the dependency as outdated because the manifest had a new field :outer_checksum that was missing in the lock.

It happens when you do COPY . . after RUN mix deps.get, because it copies everything from the host to the container, including mix.lock which is potentially outdated compared to the manifest in the container.
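
To make the mismatch concrete, here is a rough sketch of the two behaviours: a strict comparison that treats a missing :outer_checksum as outdated, and a lenient one that only compares it when both sides have it. This is illustrative only; the ManifestCheck module, its function names and the map shapes are made up for the example and do not reflect Hex's actual manifest format or code.

```elixir
defmodule ManifestCheck do
  # Hypothetical module: NOT Hex's real implementation or data format.
  # Both arguments are plain maps with :version and :inner_checksum,
  # plus an optional :outer_checksum.

  # Strict: any differing field, including one missing on one side, is a mismatch.
  def outdated_strict?(manifest, lock) do
    Enum.any?([:version, :inner_checksum, :outer_checksum], fn key ->
      Map.get(manifest, key) != Map.get(lock, key)
    end)
  end

  # Lenient: version and inner checksum must match, but the outer checksum
  # is only compared when both sides actually have one.
  def outdated_lenient?(manifest, lock) do
    required_differs? =
      Enum.any?([:version, :inner_checksum], fn key ->
        Map.get(manifest, key) != Map.get(lock, key)
      end)

    required_differs? or outer_differs?(manifest[:outer_checksum], lock[:outer_checksum])
  end

  defp outer_differs?(nil, _lock_value), do: false
  defp outer_differs?(_manifest_value, nil), do: false
  defp outer_differs?(a, b), do: a != b
end

# Shortened, made-up checksum values for illustration.
manifest = %{version: "1.0.7", inner_checksum: "e79d", outer_checksum: "acaf"}
old_lock = %{version: "1.0.7", inner_checksum: "e79d"}

IO.inspect(ManifestCheck.outdated_strict?(manifest, old_lock), label: "strict")
IO.inspect(ManifestCheck.outdated_lenient?(manifest, old_lock), label: "lenient")
```

With an old lock entry copied over a newer manifest, the strict check reports the dependency as outdated while the lenient one does not.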

I’m trying this as I type these lines!

Thanks for the detailed explanations AND for shipping 0.20.5 already!

v0.20.5 (2020-02-05)

Enhancements

  • Add timestamps to entries in registry cache for easier debugging
  • Bump registry cache version to invalidate old caches
  • Warn if fetching registry without outer checksum

Bug fixes

  • Do not require that the registry supports outer checksums
  • Missing outer checksum is not a mismatch, this will fix “out of date” errors when the manifest is newer than the lockfile

Backward compatibility helps a lot, as we don’t have to update all our active projects right away, but we will do so ASAP…

:beers: Cheers to everyone involved in solving this issue today, the community is amazing…

Oh, now I see why I never encountered this issue or heard of that file.

One of the first things I did in my app configuration was to set this in my mix.exs:

      build_path: "/elixir/_build",
      deps_path: "/elixir/deps",

Now all of the dependencies are outside the scope of where you would COPY . . and you don’t need to worry about all of your deps leaking back to your volume in development too.
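
For context, those two options go in the keyword list returned by project/0. A minimal sketch, assuming a hypothetical app called :my_app; only build_path and deps_path come from the snippet above, and the rest is placeholder:

```elixir
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,                  # hypothetical app name
      version: "0.1.0",              # placeholder version
      build_path: "/elixir/_build",  # compiled artifacts live outside the project dir
      deps_path: "/elixir/deps",     # fetched deps (and their .hex manifests) too
      deps: deps()
    ]
  end

  # Placeholder: a real project lists its dependencies here.
  defp deps, do: []
end
```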

I do the same thing with Yarn too for node_modules/.

Am I right to think that this diff:

- "bcrypt_elixir": {:hex, :bcrypt_elixir, "1.0.7", "e79d84b666bfad0e461ed217860e1c7de6bc4a1ae919cdb39d473ca784817653", [:make, :mix], [{:elixir_make, "~> 0.4", [hex: :elixir_make, repo: "hexpm", optional: false]}], "hexpm"},
+ "bcrypt_elixir": {:hex, :bcrypt_elixir, "1.0.7", "e79d84b666bfad0e461ed217860e1c7de6bc4a1ae919cdb39d473ca784817653", [:make, :mix], [{:elixir_make, "~> 0.4", [hex: :elixir_make, repo: "hexpm", optional: false]}], "hexpm", "acafec0a00cc76c1d7cd73245b0015cdc7c6667faee45d26920e07c495354c03"},

means that the mix.lock file will now contain a checksum of all dependencies? @ericmj I am not criticising, simply trying to understand. I have to justify a rather big diff to our project’s mix.lock file and I am looking for the exact rationale to present.

We have started using a new checksum for Hex packages that plugs a possible vulnerability with the old checksum. Because of backwards compatibility we cannot replace the old checksum so instead we add a new field with the new checksum.

In the past we have only added new fields when the dependency changed (either upgrade or downgrade) but since this new checksum was added for security reasons we decided to always add it after dependency resolution.
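
To see which entries in a lock file still lack the new field, one option is to match on the tuple shape, since the diff quoted above shows the new checksum appended as an eighth element. A minimal sketch, assuming every Hex entry has one of the two shapes in that diff (older lock formats may differ); the LockAudit module is hypothetical and not part of Hex or Mix:

```elixir
defmodule LockAudit do
  # Hypothetical helper, not part of Hex or Mix.
  # Takes a lock map (the term stored in mix.lock) and returns the sorted
  # names of Hex entries that do not carry the trailing outer checksum.
  def missing_outer_checksum(lock) when is_map(lock) do
    lock
    |> Enum.filter(fn {_name, entry} -> old_format?(entry) end)
    |> Enum.map(fn {name, _entry} -> name end)
    |> Enum.sort()
  end

  # Shape without the new field:
  # {:hex, name, version, checksum, managers, deps, repo}
  defp old_format?({:hex, _name, _version, _checksum, _managers, _deps, _repo}), do: true
  # Everything else (8-element entries, git deps, ...) is not flagged.
  defp old_format?(_entry), do: false
end

# mix.lock contains a single Elixir map literal, so it can be evaluated directly.
if File.exists?("mix.lock") do
  {lock, _bindings} = Code.eval_file("mix.lock")
  IO.inspect(LockAudit.missing_outer_checksum(lock), label: "missing outer checksum")
end
```

An empty list means every Hex entry already has the extra checksum, which is what the large mix.lock diff is adding.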

Thanks a lot, this both improves my understanding and it gives me an official answer to help push through our local bureaucracy so a big mix.lock diff PR gets approved. :slight_smile:
