GitHub Action Cache Elixir always recompiles dependencies (Elixir 1.13.3)

Looking to speed up our build on GitHub Actions. Right now the slowest task is compiling dependencies, which takes over 3 minutes. This is surprising, because the _build folder should be cached, but it behaves as though it isn’t.

I’ve found that the command mix deps.compile does a full recompile locally too, even if I run it twice in a row. Not sure if this is a regression in Elixir 1.13 or expected behaviour?

I’ll walk through the relevant portions of the GitHub Actions config.

setup-beam

For starts, using erlef/setup-beam, which is fairly standard. I don’t know if we need to specify anything extra for rebar? This is a fairly standard Phoenix app.

env:
  MIX_ENV: test
steps:
  - uses: actions/checkout@v2
  - uses: erlef/setup-beam@v1
    with:
      otp-version: "24.2.1"
      elixir-version: "1.13.3"

cache

Then onto actions/cache which is using the standard example for Elixir, other than that we’ve added a v2- prefix at some point to bust an old cache.

- uses: actions/cache@v2
  id: cache
  with:
    path: |
      deps
      _build
    key: v2-${{ runner.os }}-mix-${{ hashFiles('**/mix.lock') }}
    restore-keys: |
      v2-${{ runner.os }}-mix-

download dependencies

Downloading dependencies, including some private ones, is a step that we can reliably skip if there is a cache-hit. If the lock file hasn’t changed, everything still works and these two steps take zero seconds, proving that the cache is working.

- name: Authenticating Hex
  if: steps.cache.outputs.cache-hit != 'true'
  run: mix hex.organization auth ${ORG_NAME} --key ${HEX_ORG_KEY}
  env:
    HEX_ORG_KEY: ${{ secrets.HEX_ORG_KEY }}
    ORG_NAME: ${{ secrets.ORG_NAME }}
- name: Install dependencies
  if: steps.cache.outputs.cache-hit != 'true'
  # NOTE: AppSignal has its hooks in other dependencies, so we tell it to compile first
  run: |
    mix do deps.get, deps.compile appsignal --include-children

We do have some special handling for AppSignal here (which should maybe be moved into a separate step).

code formatting

We found that we can check code formatting before compiling, which saves some cycles by “failing fast.”

- name: Check code formatting
  run: mix format --check-formatted

Compiling

The main event. I recently separated out the compiling of dependencies, just to see the time they take vs. our app. The dependencies take 3-4 minutes to compile, whereas our app takes about 30 seconds or so.

- name: Compile dependencies
  run: |
    mix deps.compile
- name: Compile
  run: |
    mix compile --warnings-as-errors

If I skip the deps.compile step then compile will compile all the deps.

If the mix.lock file hasn’t changed, and if the _build folder is cached, then I would hope that deps.compile would be very fast or could even be skipped.

But that isn’t the case right now. :cry:

Further steps

There are further steps for static code analysis and running the tests (which incidentally takes less time than compiling dependencies). I’m going to stop here because it’s the compilation that I’m interested in.

If you have any suggestions, please let me know. Also happy to open a bug with Elixir if you think it’s a regression.

4 Likes

I can reproduce mix deps.compile recompiling Erlang dependencies on Elixir v1.13. But it doesn’t recompile Elixir deps and it doesn’t recompile when running mix compile. Does this match the behaviour you see?

EDIT: to be clear, it doesn’t recompile Erlang dependencies. It invokes Rebar3 but it is mostly a no-op.

1 Like

If you can isolate this error into a smaller app with reproduction steps, I will be glad to take a look at it!

2 Likes

Hi José. Thanks for the rapid reply. I think I see what you mean.

Before I didn’t recognize the difference between Erlang dependencies and Elixir dependencies compiling. So as I understand now, this is running multiple times locally (Erlang-only):

❯ mix deps.compile
===> Analyzing applications...
===> Compiling parse_trans
===> Analyzing applications...
===> Compiling mimerl
===> Analyzing applications...
...

In CI it is different, in that there is a full recompile, which includes Elixir dependencies.

Run mix compile --warnings-as-errors
==> exactor
Compiling 10 files (.ex)
Generated exactor app
==> faker
Compiling 110 files (.ex)
Generated faker app
==> jumper
Compiling 1 file (.ex)
Generated jumper app
==> nimble_options
Compiling 3 files (.ex)
Generated nimble_options app
==> assertions
Compiling 6 files (.ex)
Generated assertions app
===> Analyzing applications...
===> Compiling parse_trans
...

What this makes me think is that deps and _build may not be the only folders that should be cached?

I found this closed issue that suggests caching .setup-beam.

It will take me a little while to see if I can reproduce the error in a smaller app, but that’s always a good exercise. I may try a few other things first. If you or anyone has any further suggestions, that would be greatly appreciated.

Thanks, Nathan.

1 Like

That .setup-beam stuff didn’t seem to help.

This is an example that includes a subset dependencies from the larger app. Included some github ones just in case that makes a difference.

The hope is that there is someway to avoid run the recompile of all the dependencies after every change:

You can also look at our setup in Livebook. I don’t think we are recompiling dependencies in there: GitHub - livebook-dev/livebook: Interactive and collaborative code notebooks for Elixir - made with Phoenix LiveView

I was curious so I tried to play with your repository example but I could not reproduce the issue.

However, I think I have found the cause: the cache is not created yet when the next workflow is trying to access it. It is more a Github Action issue.

Commit 4fa8fe4 pushed at 11 Feb 2022, 22:22 GMT:

Commit 206f14b pushed at 11 Feb 2022, 22:25 GMT:

Both are trying to get a cache with the key Linux-mix-ca19570116549225dad4bcc705d9f4fe0f00a3bca16e60d7e730c353018faeb2. But the first workflow has not finished to create the cache. Not the end of the message “another job may be creating this cache.”

Potential solution:

  1. Split caches between deps and _build
  2. Cache key only for deps should be end with ${{ hashFiles('**/mix.lock') }}
  3. Cache key only for _build should be end with ${{ hashFiles('**/mix.lock') }}-${{ hashFiles( '**/lib/**/*.{ex,eex}', '**/config/*.exs', '**/mix.exs' ) }}
  4. (optional) Adapt the glob patterns in hashFiles for your needs

Here an example of the final result:

      - name: Cache Elixir deps
        id: elixir-deps-cache
        uses: actions/cache@v2.1.7
        with:
          path: |
            **/deps
          key: |
            ${{ runner.os }}-elixir-deps-${{ hashFiles('**/mix.lock') }}
          restore-keys: |
            ${{ runner.os }}-elixir-deps-
      - name: Cache Elixir build
        id: elixir-build-cache
        uses: actions/cache@v2.1.7
        with:
          path: |
            **/_build
          key: |
            ${{ runner.os }}-elixir-build-${{ hashFiles('**/mix.lock') }}-${{ hashFiles( '**/lib/**/*.{ex,eex}', '**/config/*.exs', '**/mix.exs' ) }}
          restore-keys: |
            ${{ runner.os }}-elixir-build-${{ hashFiles('**/mix.lock') }}-
            ${{ runner.os }}-elixir-build-

I don’t remember where I have seen this, but the reason is that the cache from deps should depend on changes from mix.lock only. Whereas _build should depend on changes from “the rest” (application code, config, …). For _build cache, a new hash will be generated each time lib, config or mix.exs are changed, which will avoid potential conflicts between workflows when getting/creating a cache key.

2 Likes

@aifrak Thanks so much for taking the time to investigate, and pointing out the “another job may be creating this cache” error on the example repository.

I’m not seeing that error on my actual app right now, just “Cache hit occurred on the primary key v2-Linux-mix-{hash}, not saving cache.” That said, it does make sense to split up deps/_build caching, or use needs: test or maybe even combine those jobs.

While I wasn’t that concerned with caching the mix compile if I could get mix deps.compile to cache, I may give that a go, based on your example.

@josevalim You speak the truth. It looks like the “Check warnings” step in Livebook isn’t recompiling all the dependencies. I need to get to the bottom of why the discrepancy, and then report back.

Thanks all

After experimenting with the previously linked stand-alone repository, I made the following changes:

  • Combine Elixir & JS tests into one job. This reduces likelihood of contention on the cache, which can result in a “another job may be creating this cache” error (our JS tests need Elixir & Phoenix).
  • Remove AppSignal recompile fix :crossed_fingers:t2:
  • Don’t compile dependencies separately, as this always recompiles Erlang deps
  • Rev GitHub Actions cache to build a fresh cache

With the cache working properly, the compile step is taking around 20 seconds instead of 3.5 minutes or so.

Thanks again for the assistance and examples,
Nathan.

5 Likes

Thanks for your tip right there! A lot of tutorials mention mix deps.compile followed by mix compile, and I saw a lot of slow logs in one of my apps CI:

===> Analyzing applications...
===> Compiling parse_trans
===> Analyzing applications...
===> Compiling yamerl
===> Analyzing applications...
===> Compiling mimerl
===> Analyzing applications...
===> Compiling sleeplocks
===> Analyzing applications...

These are only Erlang apps, your post put me on the right track.

The trick was to only call mix compile (Accélération de la build by thbar · Pull Request #2277 · etalab/transport-site · GitHub).

Thanks!

Do you (or anyone else) know of any workaround for this? I really like to have my dependencies compilation output separated from my application’s compilation output and it would be nice to keep that property without requiring a recompilation of all the erlang dependencies.

1 Like

mix deps.compile will attempt to compile all dependencies. Rebar3 logs, but the truth is that all dependencies are being checked, which does not occur on mix compile. So unless someone contributes a mix deps.compile --only-stale or something of sorts, a mix deps.compile step will always do more work than necessary.

3 Likes

There is also mix loadpaths, which is actually what is called by mix compile. It is a public task although it does not show up on mix help. It won’t compile the current app but it will check and compile missing dependencies. Another option is to provide something like mix compile --only-deps.

5 Likes