Copying _build, deps and recompiling in different docker container (CI/CD)

tino415 · March 24, 2023, 7:48pm

Hello all,

Today I ran into an interesting problem in our CI/CD pipeline. We set the pipeline up so that one container runs mix deps.get, MIX_ENV=test mix deps.compile and MIX_ENV=test mix compile; we then copy the _build and deps folders to another container and run mix test there. The problem was that, after running migrations, the tests failed with this error:

** (Mix) Could not start application ranch: exited in: :ranch_app.start(:normal, [])
    ** (EXIT) an exception was raised:
        ** (UndefinedFunctionError) function :ranch_app.start/2 is undefined (module :ranch_app is not available)
            (ranch 1.8.0) :ranch_app.start(:normal, [])
            (kernel 8.3.2.3) application_master.erl:293: :application_master.start_it_old/4

This was resolved by running MIX_ENV=test mix deps.compile on the testing machine. Why is that, and is it somehow avoidable? On localhost I have never had to run MIX_ENV=test mix deps.compile or MIX_ENV=test mix compile to be able to run mix test, and deps.compile takes a long time, which noticeably slows down our pipeline.

BradS2S · March 25, 2023, 2:33am

So you might be conflating mix test, which just runs all the tests in whatever env you are in, with the test environment.

My bet would be that your other container has different config settings or a different env. When you take files compiled for one configuration and run them in another, you are going to have a bad time.

Look in your mix.exs file; I’m guessing the ranch dependency is set to not be included in the test env.

dimitarvp · March 25, 2023, 8:21am

Your description is a bit confusing. What’s your goal? Running mix test on a separate, dedicated, VM/machine?

tino415 · March 25, 2023, 2:18pm

It is actually Semaphore CI, and they run separate build steps in separate Docker containers. I think it is similar to running a multi-stage Dockerfile.

tino415 · March 25, 2023, 2:28pm

I tried to verify this and:

ranch is in copied deps folder
test is in copied _build
there is _build/test/lib/ranch/ebin/ranch.app
error persists even when running MIX_ENV=test mix test

Am I missing something?
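
One gap in the checks above: ranch.app is only the application resource file, while the error says the module :ranch_app cannot be loaded, which depends on ranch_app.beam being readable in the same ebin folder. A minimal shell sketch of that extra check follows. The function name check_module_beam is hypothetical, and the _build/<env>/lib/<app>/ebin layout is assumed from the path quoted in the list.

```shell
#!/bin/sh
# Hypothetical helper: report whether the compiled module for an app is
# reachable under _build, and whether its ebin directory is a symlink.
# Usage: check_module_beam <project_root> <mix_env> <app> <module>
check_module_beam() {
  root=$1; env=$2; app=$3; module=$4
  ebin="$root/_build/$env/lib/$app/ebin"

  # A symlinked ebin means the .beam files live somewhere else on disk.
  if [ -L "$ebin" ]; then
    echo "ebin is a symlink -> $(readlink "$ebin")"
  fi

  # -f follows symlinks, so this also fails when the link is dangling.
  if [ -f "$ebin/$module.beam" ]; then
    echo "ok: $module.beam found"
    return 0
  fi

  echo "missing: $ebin/$module.beam"
  return 1
}
```

Run in the test container as check_module_beam . test ranch ranch_app. A "missing" result would suggest the copy step dropped or broke something that the .app check alone does not reveal.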

BradS2S · March 25, 2023, 2:37pm

It isn’t best practice to copy the _build folder and move it. I’m not sure I follow you completely.

dimitarvp · March 26, 2023, 8:40am

Okay, I get that, but I don’t think straight-up copying files is how it’s done. With multi-stage builds you should be able to just base one image on another. I am not sure how that works when distributing work among different nodes, though; I haven’t tried it.

cjbottaro · March 26, 2023, 6:49pm

Our CI pipeline actually pushes the built Docker image to a registry.

So we have two jobs:

build -> test

The build job creates the image (compiling both prod and test) and pushes it to an image repo, then the test job pulls down the image and runs a container using a command like mix test. If the test job passes, then we’re all ready to deploy to prod since the image is already in a repository.

Complexity arises when you want to cache your deps, which drastically speeds up the build process.

tino415 · March 26, 2023, 8:27pm

I don’t have control over the Dockerfiles; I only have a limited API for copying files between build steps. Maybe I should not have compared it to a multi-stage build. Let’s say I have no control over Dockerfiles or anything Docker-related: I have a simple API for defining build steps, each of which runs in a different Docker container (I even have only limited control over which image is used), and an API for copying files between build steps.

tino415 · March 26, 2023, 8:30pm

Yes, but that is not usable in this case. We only build a Docker image on master, in the prod pipeline; on feature branches I prefer speed.

tino415 · March 26, 2023, 8:35pm

I work with what I have. Now it works; it could be 40 seconds faster, but it works. Still, I would like to know what is going on. Maybe there are other files that need to be copied to replicate a built project?
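
One thing a plain file copy can silently change is symlinks. Assuming some entries under _build are links into deps (an assumption, not something the CI logs here confirm), packing both folders into a single archive keeps those links intact, because tar stores symlinks as links unless told to dereference them. A minimal sketch; pack_build, unpack_build and the archive name are hypothetical, and whether the CI's own copy API preserves symlinks is not established in this thread.

```shell
#!/bin/sh
# Hypothetical helpers for moving compiled artifacts between build steps.

# pack_build <project_root> <archive>: archive deps and _build together.
pack_build() {
  # Without -h, tar records symlinks as symlinks instead of following
  # them, so links from _build into deps survive the round trip.
  tar -czf "$2" -C "$1" deps _build
}

# unpack_build <archive> <project_root>: restore both folders.
unpack_build() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

For example, pack_build . build.tgz in the compile step and unpack_build build.tgz . in the test step. Relative links only resolve if deps and _build end up side by side again, which is why the two folders are archived together.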

felix-starman · April 26, 2023, 7:44pm

I just ran into this with GitHub Actions and caching.

I believe it’s that _build/test/lib/ranch/ebin/ is a symlink to deps/ranch/ebin/, and during deps.compile the .beam files are actually compiled into the deps/ranch/ebin folder. If there is a discrepancy between architectures, Erlang versions, or anything else that would cause the .beam files in deps/ranch/ebin to be considered invalid in the final runtime container, it will fail to find the module.

Or, I guess, if you’re only moving _build and not deps too, that’d do it. In that case you probably want a release. If this isn’t for a release container, then you’ll need to copy both deps and _build.

Obviously my scenario is very different, but I hope that helps some?
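
The symlink theory above can be tested directly in the target container by listing links under _build whose targets no longer exist. A minimal sketch; the function name find_dangling_links is hypothetical, and it uses only standard find and test.

```shell
#!/bin/sh
# Hypothetical helper: print every symlink under <dir> whose target is missing.
# Usage: find_dangling_links _build
find_dangling_links() {
  # -type l selects symlinks; `test -e` follows the link, so it fails
  # exactly when the target does not exist, and only those are printed.
  find "$1" -type l ! -exec test -e {} \; -print
}
```

Empty output means every link resolves. Any printed path, such as an ebin entry, points at a folder that was not copied or was restored to a different location.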