CI/CD as a library

sasajuric · January 14, 2021, 6:42pm

I’d like to announce a new library I’ve just released, called ci, which aims to provide CI/CD features as a library. It’s currently pretty basic, but I plan to expand it with support for docker, integration with GitHub, etc. The general roadmap is included in the readme. Any feedback is welcome.

tcoopman · January 15, 2021, 8:31am

Hi @sasajuric

Thanks for this library. Could you give an example of an intended use case for it? It looks cool to me, but I don’t immediately see why I would want to use it (maybe because I don’t have a need for it).

sasajuric · January 15, 2021, 9:12am

The library intends to complement, or in some cases completely replace, the existing CI systems (GH actions, Travis, Circle, Jenkins, …). In its current version, it only does the complementing part, which admittedly isn’t much, but I think it’s still something worth having.

You can see the concrete example in the way the library itself does CI:

The immediate benefits compared to e.g. GH actions:

1. Implement most of the flow in Elixir
2. Easily run all the CI steps locally
3. Write tests for the CI flow (example)
4. Due to support for parallel actions, you can get all the errors in a single CI roundtrip. In this example, if the code is not properly formatted, and some test is failing, and there are dialyzer warnings, all these errors will be reported in a single CI run.
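
To make the last point concrete, here is a minimal, language-agnostic sketch in Python. It is not the ci library's API (which is Elixir); the function names `run_check` and `run_all` are made up for illustration. It runs several checks concurrently and reports every failure at once instead of stopping at the first one.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess
import sys


def run_check(name, argv):
    """Run one check as an OS command and return (name, ok, combined output)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return name, proc.returncode == 0, proc.stdout + proc.stderr


def run_all(checks):
    """Run all checks concurrently; return [(name, output)] for each failed check."""
    with ThreadPoolExecutor() as pool:
        # map preserves the input order, so the report order is stable.
        results = list(pool.map(lambda check: run_check(*check), checks))
    return [(name, output) for name, ok, output in results if not ok]


if __name__ == "__main__":
    # Example commands for an Elixir project (mix dialyzer assumes dialyxir is installed).
    failures = run_all([
        ("format", ["mix", "format", "--check-formatted"]),
        ("test", ["mix", "test"]),
        ("dialyzer", ["mix", "dialyzer"]),
    ])
    for name, output in failures:
        print(f"{name} failed:\n{output}")
    sys.exit(1 if failures else 0)
```

Because no check waits for another, a formatting error, a failing test, and a dialyzer warning would all show up in the same run.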

The roadmap vaguely hints at future goals: the support for managing docker containers (so we can run dockerized builds), and integration with SCMs (e.g. GitHub, GitLab, …), which would allow us to easily create our own CI/CD platform in Elixir.

Finally, an internal but very important goal is to find properly focused abstractions. For example, in the current version we have OsCmd, Job, and Job.Pipeline, all of which can be used beyond just the CI domain. I’d like to keep this approach with the upcoming features too. Ideally, this library would be a set of small helpers, and it just so happens that you can combine them to implement your own CI.

hauleth · January 15, 2021, 12:10pm

Most of the CI services I know (with the exception of Travis CI) support such behaviour out of the box, without the need for such a library.

That is really handy and I am wondering why other services haven’t provided such tools already to run suites locally.

I see it as a big downside. Let configuration be configuration, not code, and if so, make it non-Turing-complete. Then step 3 would be mostly unneeded.

tcoopman · January 15, 2021, 12:16pm

A bit off topic, but it seems that the Phoenix builds switched to https://earthly.dev/ which also has a way of running CI locally (https://twitter.com/josevalim/status/1346404430275612683). So I agree, running your CI locally is a feature that is so useful!

sasajuric · January 15, 2021, 1:10pm

Yeah, some do, some don’t, but my impression is that even those that do require more clumsy/complex yaml to describe the desired pipeline. See e.g. build stages in Travis.

Except it’s not configuration, it’s literally imperative code. In the simplest case we want to execute some steps in sequence. In more complex cases we want to parallelize some steps. Sometimes we want to have a conditional around some steps (e.g. send an e-mail on errors, deploy the system if there are no errors). Other times we might want to run a loop. Modern CIs mostly support such scenarios in some way, but the solutions seem somewhat clumsy to me. For example, with Travis build stages, you end up including if as data.

To be clear, the ability to represent the pipeline as data can sometimes be very useful, allowing generic pipeline actions such as: restart from step x, execute only step y, generate visual flow diagram, better progress report, etc. This can all be very handy in more complex pipelines, or in larger companies that want to standardize their flow across projects.

That said, a straightforward imperative approach would be a better fit in pretty much all the cases I’ve personally experienced. Furthermore, it’s worth pointing out that you can always build declarative on top of imperative, while the opposite requires some complex/clumsy improvisations and sometimes might not even be possible.
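
As a rough illustration of "declarative on top of imperative", the following Python sketch interprets a pipeline described as plain data by calling ordinary functions. Everything here is hypothetical (the `run_pipeline` name, the `when` key, the `start_at` option); it is not how the ci library or any existing CI service models pipelines.

```python
def run_pipeline(spec, actions, start_at=None):
    """Interpret a pipeline given as data.

    spec: list of dicts such as {"name": "deploy", "when": "all_ok"}
    actions: maps each step name to a zero-argument function returning True on success
    start_at: optional step name; earlier steps are skipped ("restart from step x")
    Returns a list of (name, status) with status "ok", "failed", or "skipped".
    """
    report = []
    started = start_at is None
    for step in spec:
        name = step["name"]
        started = started or name == start_at
        if not started:
            report.append((name, "skipped"))
            continue
        # A step marked "all_ok" runs only if nothing before it has failed.
        if step.get("when") == "all_ok" and any(s == "failed" for _, s in report):
            report.append((name, "skipped"))
            continue
        report.append((name, "ok" if actions[name]() else "failed"))
    return report


spec = [{"name": "compile"}, {"name": "test"}, {"name": "deploy", "when": "all_ok"}]
actions = {"compile": lambda: True, "test": lambda: False, "deploy": lambda: True}
print(run_pipeline(spec, actions))
```

The interpreter itself is a plain loop with conditionals, which is the point: generic features such as restarting from a given step come almost for free once the imperative layer exists.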

Finally, testability is IMO always a good benefit. It’s worth testing a more complex pipeline regardless of whether it’s described declaratively or imperatively. Otherwise, how can we be sure that e.g. code is deployed only after all the checks have passed and the PR is approved? Without tests, if we mess something up, this will fail only in production, and it will fail silently.
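
A small Python sketch of what such a test could look like, assuming the flow receives its side effects as arguments so that a test can swap in fakes. The names `ci_flow`, `run_checks`, `pr_approved`, `deploy`, and `notify` are invented for this example and are not taken from the ci library.

```python
def ci_flow(run_checks, pr_approved, deploy, notify):
    """Deploy only when every check passes and the PR is approved."""
    errors = run_checks()
    if errors:
        notify(errors)
        return "failed"
    if not pr_approved():
        return "waiting_for_approval"
    deploy()
    return "deployed"


def test_no_deploy_when_checks_fail():
    deployed, notified = [], []
    result = ci_flow(
        run_checks=lambda: ["1 test failed"],
        pr_approved=lambda: True,
        deploy=lambda: deployed.append(True),
        notify=notified.append,
    )
    assert result == "failed"
    assert deployed == []  # deploy was never called
    assert notified == [["1 test failed"]]


test_no_deploy_when_checks_fail()
```

The same pattern covers the other branches, e.g. passing checks with an unapproved PR must also leave `deployed` empty.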

sasajuric · January 15, 2021, 1:21pm

Here’s another fun thing we were able to do in our imperative CI at Aircloak. We had a monorepo consisting of multiple separate projects. Because we controlled all the steps from imperative code, we were able to do a git diff between the current commit and the target branch, and from that figure out which projects actually need to be retested. The build time reduction was about 5x in most cases (most PRs were focused on one or two projects). This can probably be done with declarative too, e.g. by wrapping each step in a script that figures out if the command should be executed, but I have to say it feels pretty clumsy to me.
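
The idea can be sketched roughly as follows in Python. This is not the Aircloak code; it assumes each project lives in its own top-level folder, and the function names and the rule "anything outside a known project retests everything" are assumptions made for the example.

```python
import subprocess


def changed_files(target_branch):
    """List files changed on the current branch since it diverged from target_branch."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{target_branch}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


def projects_to_test(paths, projects):
    """Return the sorted project folders touched by the changed paths.

    A change outside any known project folder (e.g. a shared root file)
    conservatively selects every project.
    """
    known = set(projects)
    touched = set()
    for path in paths:
        top = path.split("/", 1)[0]
        if "/" not in path or top not in known:
            return sorted(known)
        touched.add(top)
    return sorted(touched)


if __name__ == "__main__":
    # Hypothetical project names; run from inside the repository.
    print(projects_to_test(changed_files("origin/master"), ["api", "web", "docs"]))
```

Keeping the path-to-project mapping separate from the git call means the selection logic can be tested with plain lists, without a repository.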

bottlenecked · January 15, 2021, 3:25pm

I’m not sure if it is stated explicitly or not, but keeping things ‘local’ (as in part of the same platform/codebase/deployable artifact) is probably in line with @sasajuric’s feelings on operational complexity. From his blog post on site_encrypt:

This is an example of what I call “integrated operations”. Instead of being spread across a bunch of yamls, inis, jsons, and bash scripts, somehow all glued together at the OS-level, most of the operations is done in development, i.e. the same place where the rest of the system is implemented, using the same language. Such approach significantly reduces the technical complexity of the system

dimitarvp · January 15, 2021, 3:47pm

I completely agree that YAML is a terrible platform for expressing logic yet the world keeps insisting on it. Insanity. Especially in one contract I had in 2020 where we had Kubernetes setup with Helm charts and Tilt [basically Python] scripts, with secret management sprinkled everywhere. It was nightmarish.

I do wonder if Elixir is a good scripting language for configuration – but it’s a really good start and I’ll try your library in the future. But who knows – maybe Lua? A sub-set of any imperative (non-FP) language?

That being said, adding the capability to test the CI/CD pipeline locally would make me an instant convert.

@tcoopman thanks for mentioning that Earthly allows local testing. I kind of looked at it once and was like “meh”. If only I knew!

sasajuric · January 15, 2021, 4:17pm

Yeah. I think that with added docker support (which is the next thing I’d like to tackle), this project would be similar to Earthly, with the main difference that you’d use Elixir to describe the flow.

My position is that any language other than the main language used for development adds extra complexity. It means something new has to be learned, and that we can’t rely on the tools & libraries that we have in our main language.

That complexity may bring some other benefits to the table, and so it might sometimes make sense. For example, perhaps in a larger team the operators are not backend developers, and then you may want to use some simpler scripting language, such as Lua. In some other cases you may want to go pure declarative (which, as I said, is straightforward to do on top of imperative).

But if we’re talking about smaller teams and/or small-to-medium projects, I think that using Elixir is just fine, and that it can take you quite a long way. People on the team have one thing less to learn, and that’s always a good thing in my book

dimitarvp · January 15, 2021, 4:22pm

Absolutely! You framed it perfectly. But being paranoid as I am, I can’t help but wonder if an Elixir configuration language won’t become yet another leaky abstraction.

In any case, I am grateful for your work. Looking forward to trying the library when I need it.

derek-zhou · January 15, 2021, 4:25pm

I think Earthly requires docker. For a self-hosted, fully customized CI flow I’d like to have the freedom of not using docker.

sasajuric · January 15, 2021, 4:27pm

Yeah, this library will definitely support both. After all, it’s just a set of smaller abstractions which you can combine however you want

dimitarvp · January 15, 2021, 4:28pm

…oh by the way, do seriously consider supporting Hashicorp’s Nomad as well. I keep hearing that some orgs prefer it to k8s and I already heard a good amount of operations people praising it.

Have you used it?

hauleth · January 15, 2021, 4:36pm

Dhall. It already has support for GitLab CI.

I was thinking about creating CI that would utilise systemd to run jobs. It could be fun.

I have, and I really liked it. It is so much simpler than k8s that it is weird that it is not used more. However, I still think that in 90% of deployments plain systemd would suffice.

sasajuric · January 15, 2021, 4:36pm

No. My secret wish is to have something like that available as an Elixir library.

dimitarvp · January 15, 2021, 4:49pm

Isn’t that the story of like 90% of IT?..

I would agree but then again cloud vendor lock-in pulls really hard in the other direction and the customers’ executives are buying it (figuratively and literally).

sasajuric · January 15, 2021, 4:57pm

It’s not just yaml, though I think that yaml is particularly horrible

IMO, yaml, xml, json, ini & co are meant to state facts. CI flow is not a collection of facts, but an imperative flow (run foo, then maybe bar, then baz&qux, …). Admittedly, in larger pipelines you might want to turn it into a collection of high-level facts, so you can e.g. tell the forest from the trees (or for any other purposes). But I think that in such a case it’s probably better to roll your own DSL tailored to your particular needs. The available yamls & co are trying to solve everyone’s problems, which makes them overly generic and at the same time possibly insufficient.

hauleth · January 15, 2021, 5:00pm

I was thinking about an Elixir project that would work like CoreOS’ fleetctl, so it would be only a management layer on top of another system that would handle running the processes. And as systemd offers a D-Bus API for that, it would be a good choice, as it supports running processes in isolation and even running containers via systemd-nspawn. So you would get a lot of great stuff without much work, and would only have to tackle the management on top of all of that.

dimitarvp · January 15, 2021, 5:01pm

I am thinking that it’s trying to look like a declarative collection of steps and some instructions on how to execute them but yeah, it’s very leaky and it doesn’t serve that function well.

I haven’t checked out Dhall just yet but I have found myself agreeing with former colleagues who used their own LISP DSLs to describe their deployment and testing pipelines. They read almost like English and were very enjoyable to use.

But most orgs don’t do that, they want their people to hit the ground running in the first 3 days. Which is a sensible business goal but humans being humans they are of course overdoing it.