I recently setup several GitLab CI/CD pipelines that use Docker. Having done this, I can confirm that the payoff for implementing CI/CD with Docker is indeed great. But the reality is that implementing CI/CD pipelines well is still challenging.
Half of the challenge (for me) was learning enough arcana about these two technologies to wield them effectively. The other half of the challenge was mapping my intuitive sense of how things work onto the abstractions and limitations of these tools.
In this article I’ll walk through some of what I figured out about creating releases with GitLab CI/CD and Docker. There’s currently less written about GitLab CI/CD than about Docker, so I’ll focus on GitLab and leave the Docker knowledge to others.
For the rest of this post I’ll assume you’ve got a working GitLab CI/CD pipeline, and now you want to programmatically push releases (with associated files) from that pipeline.
Pivoting from setting up a basic CI/CD pipeline to release mechanics
Getting Docker working inside GitLab CI/CD was a unique challenge.
After a week of experimentation, I had a working CI/CD pipeline and some battle-tested Dockerfiles.
But then I faced a different problem altogether: programmatically generating releases inside a GitLab CI/CD pipeline. As it turns out, this isn’t a straightforward problem to solve. And there is not much written about it online (yet).
How to get built binaries out of Docker image(s)
If you’re `docker build`ing images in your CI/CD pipeline script, you may need to get files out of the built Docker image.
There are many ways to skin this particular cat.
For example, you could do any of the following:
- `docker build` your Docker image as usual, then `docker run` that image with a volume attached, and `cp` to exfiltrate files from inside the container to the host
- do the `docker create`/`docker cp` trick to avoid the overhead of actually running the image just to copy some files out
- do all operations inside the Docker container (upload built binaries, etc.), without “exfiltrating” files to the host
- (and so on)
The point is that there’s no one right way to copy files out of a Docker image / container. You can pick the approach that works best for your project.
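As an illustrative sketch of the `docker create`/`docker cp` trick in a `.gitlab-ci.yml` job (the image name, binary path, and Docker versions here are placeholders, not from any real project):

```yaml
extract-binary:
  image: docker:20.10
  services:
    - docker:20.10-dind
  script:
    # Build as usual, then create a container from the image WITHOUT
    # running it, and copy the built binary out to the host.
    - docker build -t myapp-builder .
    - id=$(docker create myapp-builder)
    - docker cp "$id":/app/myapp ./myapp
    - docker rm -v "$id"
  artifacts:
    paths:
      - myapp
```

`docker create` instantiates the container’s filesystem without starting its entrypoint, which is why this avoids the overhead of a full `docker run`.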
Storing built binaries (and other files)
This might seem like a trivial issue, but I’ve seen this trip up several other folks on the journey to GitLab CI/CD enlightenment.
I initially assumed that GitLab provides some special “file store” feature to upload files generated by a CI/CD pipeline. But apparently this is one feature that GitLab does not provide (as of June 2020).
The problem here is twofold:
- GitLab releases do not support file uploads as release assets (as of June 2020 - but there is an open issue for this)
- There are several GitLab features that look like they fulfill this purpose, but in fact are not a good fit for this use case
The former limitation (#1) forces us to find a place to store build files / binaries. The latter complication (#2) means that there is some confusion as to whether existing GitLab features can act as a suitable file store.
The first near-miss is GitLab’s upload API. At first glance, it looks like a good way to store built files. But as of March 2020 this feature isn’t fully baked with all the functionality you’d want. For example, there’s no UI to view & manage all uploaded files. There are several open issues about this (and other missing features), so it may be a good way to store files in the future. I’ll mention in passing that go-semrel-gitlab actually supports using the GitLab upload API as a file store 4.
You might think (as I did) that the GitLab CI/CD job artifacts could fill the role of a “build file store”. But artifacts aren’t necessarily a good fit for this use case. For starters, job artifacts are ephemeral by design: they have a default expiration time after which they’re deleted 3. Beyond the lifetime issues, artifacts proliferate with every CI/CD pipeline execution, and the UI provided by GitLab to view & manage artifacts isn’t ideal for the purpose of working with build files.
Luckily, there is a good solution that’s readily available: simply copy your built binaries & other files to cloud storage like Amazon S3. The benefit of using S3 is that you can access uploaded files via a URL. This is good for adding assets to a release via GitLab’s release assets feature (which I’ll talk more about later). If you can’t or don’t want to use S3, you can use a local object store or a local Samba or NFS share (among others).
I want to emphasize that whatever file store you choose to store your build files, it must have some way to access those files via a URL - because that is ultimately how GitLab represents files associated with a release 5.
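For instance (the bucket name and paths below are hypothetical), an upload job using the AWS CLI might look like:

```yaml
upload-binaries:
  stage: deploy
  script:
    # Requires AWS credentials configured as CI/CD variables
    # (e.g. AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).
    - aws s3 cp "myapp-${CI_COMMIT_SHA}.zip" "s3://YOUR-BUCKET/releases/myapp-${CI_COMMIT_SHA}.zip"
```

Each uploaded object then has a stable URL (e.g. `https://YOUR-BUCKET.s3.amazonaws.com/releases/myapp-SHA.zip`) that you can hand to GitLab as a release asset link.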
Associate built binaries with the commit that built them
You can simply append a commit SHA hash to the binary filename.
For example, in your `.gitlab-ci.yml`, a filename can include the predefined GitLab CI environment variable `$CI_COMMIT_SHA`:

`your-binary-file-$CI_COMMIT_SHA.zip`
Or you can get fancy with semantic release strings, tags, and so on.
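As a tiny runnable sketch (the SHA below is a hard-coded stand-in for the value the GitLab runner injects):

```shell
# In a real pipeline the runner sets $CI_COMMIT_SHA for you; we hard-code
# a stand-in value here so the snippet also runs outside CI.
CI_COMMIT_SHA="0123abc0123abc0123abc0123abc0123abc01234"
ARTIFACT="your-binary-file-${CI_COMMIT_SHA}.zip"
echo "$ARTIFACT"
```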
Programmatically create a release from a GitLab CI/CD pipeline
You can manually call the GitLab Releases API to create a new release. Assuming that you’ve already tagged your commit & pushed to remote, you can create a release thusly:
```
curl --request POST \
  --header 'Content-Type: application/json' \
  --header "Private-Token: YOUR_PRIVATE_TOKEN" \
  --data '{"name": "YOUR_RELEASE_NAME", "tag_name": "YOUR_TAG_NAME", "description": "Release with the binary LINK_TO_YOUR_BINARY"}' \
  "https://YOUR_GITLAB_HOST/api/v4/projects/YOUR_PROJECT_ID/releases"
```
If this manual approach is a bit too much for you, you can instead use a tool like go-semrel-gitlab to automate & simplify the whole process.
`go-semrel-gitlab` has several nice features besides just creating GitLab releases.
Attach built binaries (and other files) to a release
The official GitLab term for ‘files attached to a release’ is “release assets”.
As of June 2020 1, GitLab release assets are expressed as URL links, rather than directly uploaded / embedded files. In other words, you can provide a URL to point at each asset but you cannot (currently) upload an asset file directly. This may be confusing and counter-intuitive if you assumed (as I did) that you could embed files as you can in GitHub’s release feature 2.
If you’re calling the Releases API, you can attach assets by specifying a URL for each asset in your API request (as we saw earlier).
As we covered earlier, if you’re using S3 you can simply provide a URL to the file you uploaded to S3.
For example, the JSON from the `curl` snippet above would become (replace the all-caps strings below):
```
"assets": {
  "links": [
    {
      "name": "FILE NAME",
      "url": "URL TO FILE (IN S3 OR WHEREVER)",
      "filepath": "PATH RELATIVE TO GITLAB PROJECT BASE URL",
      "link_type": "other"
    }
  ]
}
```
Again, Juhani Ränkimies’s go-semrel-gitlab can help here, with its `add-download-link` command.
Or you can use inetprocess/gitlab-release, and pass a list of files as arguments 8.
Versioning (without tears)
Historically, it was difficult to implement proper versioning. There’s a whole cluster of problems around representing consistent “version state” distributed across a project’s source code, repo commits, tags, and releases. Things only become more complex when you consider the mechanics of synchronizing your project’s “distributed” version state as time marches forward 7.
Luckily the versioning problem has been well-studied and good, simple solutions have emerged (modulo a tradeoff or two). Nowadays versioning is much easier, thanks to better models and new tools.
The simplest way I know of to solve the “consistent version state” problems is to not store version numbers in source code, but instead “inject” versioning into downstream artifacts (binaries, releases, Git tags, etc.) at build time via your CI/CD pipeline. This is the approach made possible by tools like Semantic Release, as explained by Remy Sharp’s Versioning: The Chicken and Egg and Kent Dodds’ Automating Releases with semantic-release.
This simple “build-time version injection” solution sidesteps the version-state-synchronization problem by removing version state from the codebase altogether. In this model, “version state” only exists in built files, releases, and Git tags – not in committed source code.
The tradeoff for this simplicity is that you can’t (directly) get the version number in local development builds, because you’ll actually remove (or ‘reset’) the version number in, e.g., your `package.json` and other standard files. There are workarounds if you really need the version number for local development 6.
A way to keep version state in your repo (also without tears)
If you must store version state in your repo, all is not lost. There’s a way to do that while keeping the benefits of the simple approach we just covered.
Basically you do something like the following, in your CI/CD pipeline script:
- generate version numbers programmatically in your CI/CD pipeline script (e.g., via `go-semrel-gitlab`)
- embed the generated version numbers in source files
- commit with a `[skip ci]` commit message ← this is the important part!
- push the commit
The `[skip ci]` is the magic that makes this work: it tells GitLab not to run the pipeline for that commit.
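Here’s a minimal, runnable sketch of the commit step. It uses a throwaway repo and a hard-coded version so it runs anywhere; in a real pipeline you’d operate on the checked-out project and push back to origin:

```shell
# Throwaway repo stands in for the CI checkout; VERSION is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "ci-bot@example.com"
git config user.name "CI Bot"

VERSION="1.2.3"                      # in CI, generated by your release tool
echo "$VERSION" > VERSION            # embed the version in a source file
git add VERSION
git commit -q -m "chore(release): ${VERSION} [skip ci]"
# In a real pipeline you'd now push, e.g.:
#   git push origin "HEAD:${CI_COMMIT_REF_NAME}"
git log -1 --pretty=%s
```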
If you squint and look sideways at this, this approach is analogous to an idempotent, single-source-of-truth functional pipeline. At least that’s how I conceptualize it.
More about versioning
So we’ve got a simple approach to “version state”. But how do we decide when to bump versions, and how do we do that in a GitLab CI/CD pipeline?
You could continue bumping version numbers manually, and tag Git commits to tell the build pipeline to create a new release. But it’s much easier to use Semantic Release, go-semrel-gitlab or any of their ilk.
To leverage the full power of those tools, you’ll need to adopt the conventions of Conventional Commits and format your commit messages in a standardized way. When you do this, bumping your version numbers can be fully automated so you never have to think about it. And you get proper semantic version numbers for (almost) free!
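To give a feel for the mapping, here is a toy sketch of how Conventional Commit subjects translate into semver bumps. This is not how semantic-release actually parses commits (the real tools also inspect commit bodies and footers); it just illustrates the convention:

```shell
# Toy mapping from Conventional Commit subjects to semver bump levels.
bump_for() {
  case "$1" in
    *"BREAKING CHANGE"* | *'!:'*) echo "major" ;;  # breaking change
    feat*)                        echo "minor" ;;  # new feature
    fix*)                         echo "patch" ;;  # bug fix
    *)                            echo "none"  ;;  # chore, docs, etc.
  esac
}

bump_for "feat: add changelog generation"   # → minor
bump_for "fix: correct release asset URL"   # → patch
```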
Other bits
One feature I haven’t explored yet is pre-scheduling releases to occur automatically, e.g., at fixed intervals. If you ever need this, GitLab CI/CD supports it via scheduled pipelines.
Sketch of a full solution
If we distill everything we’ve covered so far, we arrive at the approach I currently use to create releases with GitLab CI/CD:
- Store built files in an external file store which supports access via URL, like S3 (or if you must, GitLab’s project uploads store)
- Append the Git commit hash to built files’ names to associate them back to the commit that generated them - or embed the commit hash in a `.commit-sha` file
- Programmatically create a GitLab release from your CI/CD pipeline script, and attach links to release assets
- Use a tool like Semantic Release / go-semrel-gitlab in your CI/CD pipeline script to automate bumping version numbers, injecting version numbers into built files, generating `CHANGELOG.md`, etc.
- Don’t store version numbers in source code, but instead inject them into build files, Git tags, releases
- Let Semantic Release, go-semrel-gitlab, et al. take care of bumping version numbers - don’t update version numbers “by hand”
- Adopt Conventional Commits and format commit messages to enable the semantic versioning tools to do their thing
In a nutshell, this approach boils down to following conventions and using helpful tools.
Summary
After figuring out all these bits, it’s straightforward to create releases from a GitLab CI/CD pipeline. GitLab provides many tools to help make this easier. And we can reach beyond GitLab to other tools to creatively work around features that GitLab lacks.
Further reading
- How to store releases/binaries in GitLab?
- GitLab: Automatic releases with CI/CD Pipelines
- Semantic Versioning and Release Automation on GitLab
- Automatic Semantic Versioning in GitLab CI
- Automate versioning and changelog with release-it on GitLab CI/CD
- GitLab upload API
- GitLab releases API
There are several release tools that work with GitLab:
- semantic-release is the granddaddy of release tools, it inspired many of the tools below
- go-semrel-gitlab is a Go-based derivative of Semantic Release built specifically for GitLab
- inetprocess/gitlab-release is a Docker image wrapping a Python script that creates a GitLab revision, with optional release assets. It’s similar to go-semrel-gitlab.
- release-it
- standard-version
- GitVersion
Footnotes
1. I mention the date b/c GitLab changes so rapidly that this statement may be inaccurate in the near future. For example, there’s already an issue for adding binary file support to releases. ↩
2. As is supported by GitHub’s release feature. ↩
3. While this expiration can be configured site-wide to disable deleting artifacts, GitLab’s job artifacts abstraction has other drawbacks that make it less than ideal as a “file store” for release assets. ↩
4. The `go-semrel-gitlab` add-download command actually uses GitLab’s project upload API, so it’s not entirely crazy to store built files this way. ↩
5. As of June 2020. ↩
6. Remy Sharp’s Versioning: The Chicken and Egg talks about his version promise code solution, for example. ↩
7. If you squint your eyes, that cluster of problems kind of resembles distributed state synchronization - which gives a sense of how tricky it can be to solve well. ↩
8. Note that inetprocess/gitlab-release uses GitLab’s upload API, so the files will be stored in the repo uploads. ↩