Offline Mirror for mix deps?

I recently ran into this after our yarn install failed with network issues during CI/CD. I’m wondering if mix has anything like yarn’s offline mirror, or if there has been any discussion around one. I like the assurance of having the dependencies committed without all the noise of individual dependency files in the repo.

I know that if the deps are already downloaded in deps/ and you then run mix deps.get while offline, mix will attempt to contact hex.pm and then fall back to what is downloaded locally. So if it’s possible to cache deps/ in CI, that might help.

I’m curious if there’s a way to have mix skip contacting hex.pm entirely when it already has everything it needs downloaded. :thinking: That would save a couple of seconds on each CI run.
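For the CI case, here is a minimal sketch of an offline-first fetch. HEX_OFFLINE=1 is a documented Hex environment variable; the surrounding cache-restore logic and the fetch_deps helper are assumptions about how your CI script might be laid out, not a known-good recipe.

```shell
# Sketch: prefer an offline fetch when deps/ was restored from the CI cache.
# HEX_OFFLINE=1 is a real Hex env var; fetch_deps is a hypothetical helper.
fetch_deps() {
  if [ -d deps ]; then
    HEX_OFFLINE=1 mix deps.get   # use only what is already on disk
  else
    mix deps.get                 # cold cache: allow network access
  fi
}
```

With HEX_OFFLINE set, Hex should fail loudly instead of reaching for the network, which also makes accidental online fetches visible in CI logs.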

Deps can take a path option, like:

defp deps do
  [
    {:local_dependency, path: "path/to/local_dependency"}
  ]
end

Link to the docs, although I’m not sure it accepts an actual package; the repo would have to be cloned?

One thing I was considering was setting up a MiniRepo Docker image pre-loaded with the Hex deps the project needs. The build happens in docker-compose, so it makes sense in my case, but maybe not in all cases.

I use that sometimes for development work, but I don’t want the manual process of keeping something like that up to date. The nice thing about how yarn does it is that it saves the tarballs locally, with their checksums in the lockfile. That way you’re getting the exact same thing npm would give you, and you only have to track a smaller list of files. They could even be stored in Git LFS.
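To illustrate the checksum idea: verifying a vendored tarball against a recorded SHA-256 digest could look like the sketch below. The file names and the helper are hypothetical, and Hex’s actual checksum scheme and mix.lock format should be checked before relying on anything like this.

```shell
# Sketch: verify a vendored tarball against a recorded sha256, the same
# basic idea as yarn's offline mirror. verify_tarball is hypothetical.
verify_tarball() {
  file="$1"; expected="$2"
  actual=$(sha256sum "$file" | cut -d' ' -f1)
  [ "$actual" = "$expected" ] && echo "ok: $file" || echo "MISMATCH: $file"
}
```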

I don’t like storing the deps folder in the repo because it’s so many files that can change a lot during updates. I do like the assurance of having the exact files I expect stored in a way I can control, though. It’s a tough compromise, but I feel like yarn addressed it in a pretty elegant way.

Reading through the yarn article, it seems like Hex does all of this, except automatically and without needing to configure a mirror.

All Hex packages you download are cached in ~/.hex, which means that if your network is down and you need to fetch a package again, the cache will be used as a fallback. You can force the cache to be used by setting the environment variable HEX_OFFLINE=1. Additionally, all your dependencies are locked in the mix.lock file to ensure you always get the same versions that were initially fetched and locked.

If you need something else then please elaborate on what you want to do and how it is different from what Hex provides today.


If I run mix deps.get on a project, will it try to use this cache? Or will it always prefer to download “fresh” libs (even if it’s the same version)?

There are multiple layers of caching for the different stages of fetching dependencies. We will always try to use the cache if it’s available, but that does not necessarily mean we make no HTTP requests when everything is cached.

First of all, mix deps.get will not do anything if you have already fetched dependencies into the project’s deps/ directory and they match the lock. If dependencies are missing from deps/ or do not match the lockfile, they will be fetched.

Before fetching dependencies, we may need to perform dependency resolution, in case dependencies are not locked or are being updated with mix deps.update. To perform resolution we need to fetch package indexes from the registry. Package indexes are cached, but we will always make conditional HTTP requests [1] for them to ensure you have the latest version of the index; if you already have the latest version, they are not downloaded again.
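As a toy model of the conditional-request mechanism described above (this is an illustration of the HTTP semantics, not Hex’s actual client code): the client sends the ETag it has cached, and a 304 response means the cached copy is still current.

```shell
# Toy model of a conditional GET: the client sends If-None-Match with its
# cached ETag; a 304 response means the cached index is still current.
conditional_get() {
  cached_etag="$1"; server_etag="$2"
  if [ "$cached_etag" = "$server_etag" ]; then
    echo "304 Not Modified: reuse cached index"
  else
    echo "200 OK: download updated index"
  fi
}
```

This is why a warm cache still produces a small HTTP request per index: the request is needed to learn that nothing changed.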

When dependencies are resolved to specific versions, their package tarballs can be fetched. The tarballs are also cached, so if you previously fetched a tarball on the same machine and its checksum matches, no request will be made.
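The checksum-gated tarball cache can be sketched like this (the helper and file names are hypothetical; Hex’s real implementation lives in the hex client, not a shell script):

```shell
# Sketch of a checksum-gated cache: reuse the cached tarball only when it
# exists and its digest matches; otherwise fall back to a (simulated) fetch.
cached_or_fetch() {
  file="$1"; expected="$2"
  if [ -f "$file" ] && [ "$(sha256sum "$file" | cut -d' ' -f1)" = "$expected" ]; then
    echo "cache hit: $file"
  else
    echo "fetching: $file"
  fi
}
```

Gating on the checksum rather than mere file existence means a corrupted or tampered cache entry is re-fetched instead of silently used.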

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Conditional_requests


Thanks @ericmj, it seems like Hex handles this well already. However, it would be difficult to commit the cached tarballs since they reside in the user’s home directory. Is there any way to specify the location of the cache, so we can make sure it’s available in a CI/CD Docker process?


We don’t have a way to configure only the cache directory location.

Can you explain why you want this feature and what existing problem it solves? Personally, I don’t think you should commit files to version control to improve the reliability of CI/CD. Any concerns there should be handled by the CI/CD process itself, by making it more reliable or adding caching there.
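One related knob worth noting, with the caveat that you should confirm against the Hex docs for your version: the HEX_HOME environment variable relocates Hex’s entire home directory, cache included. That moves everything rather than only the cache, which is consistent with the answer above, but it can be enough to let a CI cache step pick the cache up. A sketch:

```shell
# HEX_HOME moves Hex's whole home directory (registry cache, package
# tarballs, config) to a path of your choosing, e.g. inside the CI
# workspace so the cache step can archive it. Whole home, not only cache.
export HEX_HOME="$PWD/.hex-home"
mkdir -p "$HEX_HOME"
# any subsequent `mix deps.get` in this shell now caches under $HEX_HOME
echo "Hex home: $HEX_HOME"
```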

More notes

  • You could just tarball the deps/ folder, for example, and use the HEX_OFFLINE env var. However, I’ve seen its contents change (e.g. new files being created under deps/) at compile time…
  • You can easily set up a Hex repo with mini_repo, but you have to list the packages to sync, or sync everything.
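The tarball approach from the first bullet could look roughly like this (paths and the cache handoff are assumptions, and the caveat above about deps/ gaining files at compile time still applies):

```shell
# Sketch: archive deps/ after a successful fetch and restore it on the
# next CI run; a follow-up `HEX_OFFLINE=1 mix deps.get` (not run here)
# would then verify it against mix.lock without contacting hex.pm.
mkdir -p deps                # stand-in for a previously fetched deps/ tree
tar czf deps-cache.tar.gz deps/
rm -rf deps/
tar xzf deps-cache.tar.gz    # restore on the next CI run
```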

This is a very informative and concise explanation. Thanks.

I was recently asked to mirror Hex for an Elixir developer on our offline network. We have a completely offline network (not merely firewalled) for development work. I was unable to find a satisfactory mirroring solution, so I ended up writing one. It is fast and works well for our purpose.
I thought it might be useful to share, so I have put it on GitHub: WaterJuice/download-hexpm, a Python script for downloading repo.hex.pm for offline mirroring.


Found this repo for mirroring hex.pm in offline / air-gapped environments.

I quite liked that it doesn’t take a dependency on Elixir directly (it just uses Python), as that seems friendlier for your local sysadmin.

Edit: I had missed this thread and originally top-posted it. It got merged into this thread, so apologies for the duplication with the previous comment.