Get git info in a mix release with docker

linusdm · December 14, 2022, 9:35pm

I’m trying to smuggle some git info (for example, last commit timestamp) into a release, built with docker. The dockerignore file contains this trick:

github.com/phoenixframework/phoenix/blob/master/priv/templates/phx.gen.release/dockerignore.eex#L19-L25

          # Ignore git, but keep git HEAD and refs to access current commit hash if needed:
          #
          # $ cat .git/HEAD | awk '{print ".git/"$2}' | xargs cat
          # d0b8727759e1e0e7aa3d41707d12376e373d5ecc
          .git
          !.git/HEAD
          !.git/refs

But this only allows getting the commit hash. Is there a good way to get more git info into a docker release? Thx!
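For reference, the one-liner in the template comment assumes that HEAD is a symbolic ref. A slightly more defensive sketch that also handles a detached HEAD could look like the following. The function name `head_sha` is made up for illustration, and it only works while the branch ref exists as a loose file: after `git gc` or `git pack-refs`, a branch ref can live only in .git/packed-refs, which the ignore rules above exclude.

```shell
# Resolve the current commit hash from a (possibly trimmed) .git directory
# without using the git CLI. head_sha is a hypothetical helper name.
#   $1 = path to the .git directory (defaults to ".git")
head_sha() {
  gitdir=${1:-.git}
  head_ref=$(cat "$gitdir/HEAD") || return 1
  case $head_ref in
    "ref: "*) cat "$gitdir/${head_ref#ref: }" ;;  # symbolic ref, e.g. "ref: refs/heads/main"
    *)        printf '%s\n' "$head_ref" ;;         # detached HEAD: the file holds the hash itself
  esac
}
```

Calling `head_sha .git` inside the image prints the hash, or fails if the ref file was not copied into the build context.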

hst337 · December 14, 2022, 9:38pm

What info do you need to get inside the docker release?

linusdm · December 14, 2022, 9:54pm

An indication of the last commit date. And possibly the last commit message.

hst337 · December 14, 2022, 9:58pm

append this

!.git/logs/HEAD
!.git/logs/refs/heads/<the_branch_you_want>

hst337 · December 14, 2022, 9:59pm

Anyway, this is not an Elixir question

And don’t forget to mark the solution

stefanchrobot · December 15, 2022, 12:24pm

Wouldn’t it be better to transport this over an ENV variable? Do something like this in your Dockerfile:

ENV GIT_SHA=<some command to grab the SHA>
ENV GIT_TIMESTAMP=<some command to grab the timestamp>

linusdm · December 15, 2022, 1:03pm

Thanks for that information about the .git/logs folder! I’m gonna try it out and report back how that turned out. I didn’t know you could be that selective when copying the .git folder, and still get information with the git CLI.

stefanchrobot:

Wouldn’t it be better to transport this over an ENV variable?

Passing in environment variables is also an option, but sounds a bit brittle. Or do you (@stefanchrobot) have good experience doing it that way?

I’m not building locally, but using the fly.io toolchain to deploy an application, so I have to take that into account.

stefanchrobot · December 15, 2022, 1:26pm

Is .git’s internal structure publicly documented? If not, I wouldn’t try to touch it since I would assume it may change in the future - instead I would just use the git commands to pull whatever is needed.

But as I wrote, I’d prefer to bake the info that I need into the Docker image, whether as an ENV variable or a dedicated file. On one team, each service had to have a GIT_SHA ENV var baked into the image, and it worked just fine.

linusdm · December 15, 2022, 5:59pm

Doing this does not leave the git repo intact. I get this error instead, when I run git log inside the container:

fatal: not a git repository (or any of the parent directories): .git

dimitarvp · December 15, 2022, 6:48pm

Can you expand on that? I’ve helped maintain a fleet of 50+ services spread over 800+ container pods and every single deployment used env vars. Nothing ever broke (once we had the infrastructure in place for it, of course, e.g. Consul / etcd et al.)

And yeah don’t try to exfiltrate stuff from Git internal files. You can very easily get the checksum of the latest commit and use that as a tag / key for your purposes.

linusdm · December 15, 2022, 7:10pm

Maybe I should zoom out a bit:

When doing continuous deployment of a monolithic Phoenix app, I think it’s very important to let the user know which version is currently deployed. Versions come and go, and often it’s barely visible what has changed between two versions. When a bug is logged, the user should also record the version that is currently used, so that it’s clear later how to reproduce, and whether there was a fix in the meantime. The version might also show up in whatever logging or bug-tracking service you’re collecting your logs/errors in.

A simple commit SHA will do the trick. But I also like to have a notion of the commit date, and maybe even the last commit message (maybe that’s a bad idea, I don’t know yet), so that I and the end users know how fresh the deploy is. It’s easier for the human mind to think about dates than to look at those pesky commit hashes (and SHAs don’t have a notion of ordering on a timeline). The last commit date seems to be the best stand-in for that freshness date (better than a build date, which is not reproducible).

So that’s the why.

About the env variables: I have this idea in my head that it should be easy to create a release, by just running MIX_ENV=prod mix release, or with docker build . when doing it inside a docker image. But that thought is probably flawed. I guess I just don’t want to give up on easily kicking off a command to create an image, locally.

hst337 · December 15, 2022, 7:53pm

And you shouldn’t run git log in the repository. Just open .git/logs/HEAD and that’s all
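To make the "just open the file" idea concrete, here is a minimal sketch of reading the newest entry of .git/logs/HEAD in shell. The helper name `reflog_latest` is invented for this example. It relies on each reflog line having the form `<old sha> <new sha> <name> <email> <unix time> <tz>`, then a tab, then the message. One caveat: the timestamp is when the ref was updated locally, so on a fresh CI clone it is the clone time (with a "clone: from ..." message), not the commit date.

```shell
# Print "<new sha> <unix timestamp>" for the newest entry of a reflog file.
# reflog_latest is a hypothetical helper; $1 is the reflog path, e.g. .git/logs/HEAD.
reflog_latest() {
  # New entries are appended, so the latest one is the last line.
  # Everything before the tab is metadata; the author name may contain spaces,
  # so the timestamp is taken as the second-to-last metadata field.
  tail -n 1 "$1" | awk -F'\t' '{ n = split($1, f, " "); print f[2], f[n-1] }'
}
```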

hst337 · December 15, 2022, 8:03pm

You can create a task which generates a Dockerfile and builds it. This is a pretty common practice

As far as I know, no real-world program carries part of its version control system along just to report its own version

Usually, the version is embedded in the program during the build

linusdm · December 15, 2022, 8:16pm

I now realise that I misunderstood you. Thanks for clarifying. I’m not inclined to parse the raw log files, that’s a bridge too far.

I’m looking at Plausible, which has a shell script to start the release process:

github.com/plausible/analytics/blob/88173342b9e969894879bfb2e8d203426f6a1b1c/rel/prepare_release.sh

COMMIT=$(git rev-parse HEAD)
VERSION="$1"

if [ "$VERSION" = "" ]
then
  echo "Please supply a version tag e.g \`./rel/prepare_release.sh v1.5.0\`"
  exit 1
fi


if [ "$GITHUB_WORKSPACE" != "" ]
then
  ls $GITHUB_WORKSPACE
  TARGET_FOLDER=$GITHUB_WORKSPACE/priv
else
  TARGET_FOLDER=$(pwd)/priv
fi

echo "{\"version\": \"$VERSION\", \"commit\": \"$COMMIT\"}" > $TARGET_FOLDER/version.json

This file has been truncated.

I’m more or less convinced that the git information has to come from outside the docker context/daemon. I’m just not sure yet what would be a good way to inject this information (a plain old shell script, or something more elaborate, like a task file). Suggestions are welcome.
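A plain shell script in the spirit of the Plausible one might be enough. The sketch below writes a small JSON file that the app can read at runtime; the function name `write_version_json` and the `commit_date` key are assumptions made for this example, not anything Plausible or Phoenix defines. The commit message is left out on purpose, because it would need proper JSON escaping.

```shell
# Write a small JSON file with build metadata (hypothetical helper).
#   $1 = target file, $2 = commit SHA, $3 = commit date (ISO 8601)
write_version_json() {
  printf '{"commit": "%s", "commit_date": "%s"}\n' "$2" "$3" > "$1"
}

# Usage before `docker build .` (assumes priv/ is part of the build context):
#   write_version_json priv/version.json \
#     "$(git rev-parse HEAD)" "$(git log -1 --format=%cI)"
```

The git commands run on the build host, where the full repository is available, so nothing from .git has to enter the Docker context.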

hst337 · December 15, 2022, 9:17pm

First of all,

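# New reflog entries are appended, so the latest one is the last line.
entry = ".git/logs/HEAD" |> File.read!() |> String.trim() |> String.split("\n") |> List.last()
# Entry layout: "<old sha> <new sha> <author> <time> <tz>\t<action>: <message>"
[meta, action] = String.split(entry, "\t", parts: 2)
[_old, hash | _] = String.split(meta)
message = action |> String.split(": ", parts: 2) |> List.last()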

Second, just use Docker build arguments and expose the value as an env variable inside your Dockerfile. There is no reason to copy the .git directory.

The env variable idea is rock solid and it is the simplest solution; just use it.

Third, please, for the sake of programming and me sleeping well at night, don’t bring in that Golang YAML abomination of a Makefile clone to embed a version in your software. This is one of the simplest and most common problems to solve, and you don’t need anything like Task for it. I mean, even very stupid tools like yes from coreutils are able to display their own version without any rocket science

stefanchrobot · December 15, 2022, 10:43pm

linusdm:

Maybe I should zoom out a bit:

When doing continuous deployment of a monolithic Phoenix app, I think it’s very important to let the user know which version is currently deployed. Versions come and go, and often it’s barely visible what has changed between two versions. When a bug is logged, the user should also record the version that is currently used, so that it’s clear later how to reproduce, and whether there was a fix in the meantime. The version might also show up in whatever logging or bug-tracking service you’re collecting your logs/errors in.

A simple commit SHA will do the trick. But I also like to have a notion of the commit date, and maybe even the last commit message (maybe that’s a bad idea, I don’t know yet), so that I and the end users know how fresh the deploy is. It’s easier for the human mind to think about dates than to look at those pesky commit hashes (and SHAs don’t have a notion of ordering on a timeline). The last commit date seems to be the best stand-in for that freshness date (better than a build date, which is not reproducible).

So that’s the why.

Makes perfect sense to me!

I think this is totally doable. One option would be to use your project’s version as defined in mix.exs. Then you can call :application.get_key(:myapp, :vsn) to fetch the version at runtime.

If you want to use the Git SHA and timestamp, you need to pass them to Docker’s build command. How exactly depends on how you’re building the Docker image. Something like this:

# checkout the repo
git ...
export GIT_SHA=<get the SHA>
docker build --build-arg GIT_SHA .
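One way the placeholders could be filled in, as a sketch: the helper names `git_sha`, `git_timestamp` and `git_subject` are made up here, while the git format specifiers (`%cI` for the committer date in strict ISO 8601, `%s` for the subject line) are standard. The Dockerfile lines in the comments are an assumption about how the image would consume the values.

```shell
# Helpers that ask git (rather than reading .git internals) for commit metadata.
# The optional first argument is the repository path (defaults to the current directory).
git_sha()       { git -C "${1:-.}" rev-parse HEAD; }
git_timestamp() { git -C "${1:-.}" log -1 --format=%cI; }  # committer date, strict ISO 8601
git_subject()   { git -C "${1:-.}" log -1 --format=%s; }   # first line of the commit message

# Usage on the build host (not run here; needs a Docker daemon):
#   export GIT_SHA="$(git_sha)" GIT_TIMESTAMP="$(git_timestamp)"
#   docker build --build-arg GIT_SHA --build-arg GIT_TIMESTAMP .
#
# The Dockerfile has to declare the arguments (in the stage that uses them), e.g.:
#   ARG GIT_SHA
#   ARG GIT_TIMESTAMP
#   ENV GIT_SHA=$GIT_SHA GIT_TIMESTAMP=$GIT_TIMESTAMP
```

Passing `--build-arg NAME` without a value makes Docker take the value from the environment of the calling shell, which is why the variables are exported first.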

I’m deploying to DigitalOcean’s App Platform and it provides the commit SHA in a build-time ENV var, so I can just do this in the UI settings:

GIT_SHA=${_self.COMMIT_HASH}

and this will basically bake the commit SHA into the Docker image.

chulkilee · December 16, 2022, 4:45am

Quick notes

You can pass build-time args, rather than env vars, to the Dockerfile, to avoid leaking that information as an env var unnecessarily.
You can use them to add labels to container images, which can also be helpful (no need for a complicated image tag convention).
It would be simpler to generate a build metadata file and pass it along with the source code. I used to have a file in X=Y format to keep that information (e.g. branch, commit, build job number, date, etc.)
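As a rough illustration of the X=Y metadata file, the two helpers below write and read such a file. The names `write_build_metadata` and `read_build_metadata`, and the keys used in the usage comment, are hypothetical. The label key `org.opencontainers.image.revision` comes from the OCI image annotation conventions; whether to use it is a design choice.

```shell
# Write KEY=VALUE pairs (one per line) to a metadata file.
#   $1 = target file, remaining arguments = KEY=VALUE pairs
write_build_metadata() {
  target=$1; shift
  printf '%s\n' "$@" > "$target"
}

# Look up a single key; prints everything after the first "=" on the matching line.
#   $1 = metadata file, $2 = key (a plain identifier such as COMMIT)
read_build_metadata() {
  sed -n "s/^$2=//p" "$1"
}

# Usage on the build host (not run here):
#   write_build_metadata build.env "COMMIT=$(git rev-parse HEAD)" "DATE=$(git log -1 --format=%cI)"
#   docker build --label "org.opencontainers.image.revision=$(read_build_metadata build.env COMMIT)" .
```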