How much overlap is there between Docker and Elixir (BEAM VM)?

zsck · March 7, 2018, 5:56pm

Hello, world!

I have had a question come to mind that I have been hoping some of the enthusiastic folks here in the Elixir community might be able to help me to answer.

Let me preface this by saying that I am a beginner with regards to both Docker and Elixir, and have not worked with Erlang in the past.

As I have been studying Docker recently, I’ve been impressed at how much it is able to do with regards to managing and facilitating communication between images. If we were to think of a running image as a process, then Docker seems to provide a lot of the kinds of concurrent programming superpowers that the BEAM VM does. In particular, I’ve noticed Docker’s ability to run multiple instances of services, easily be configured to monitor for crashes and take care of automatic restarts, and so on. This, it appears to me, seems to overlap greatly with the kinds of things we can do quite easily in Elixir through processes and with things like supervisors.

So my question boils down to this: How much overlap is there really between Docker and the BEAM VM? If I were to try to make a compelling argument to my team that Elixir is a technology worth adopting, considering that no one on my team knows of it and that Docker is already being put to use to varying degrees, what might you say makes it really stand out?

AstonJ · March 7, 2018, 6:22pm

I think they are very different things, Zack. I look at Docker like a sort of VPS: a container in which you run some software. Elixir, on the other hand, IS the software.

Many people use Docker for ease of deployment and commissioning production setups. In fact, many people run Elixir software inside Docker containers.

zsck · March 7, 2018, 6:41pm

I’m aware of these extremely basic differences. My question is specifically about features that BEAM and Docker provide with regards to distribution, replication, process management, monitoring, and so on.

OvermindDL1 · March 7, 2018, 8:09pm

The BEAM VM allows introspection and tracing at a very detailed level compared to what Docker can do, which can only get to the OS process level. Docker doesn’t really distribute or replicate, though tools built on it might do that.
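To make the introspection point concrete, here is a minimal sketch of tracing and inspecting a single BEAM process from the outside. The process and the messages are made up for illustration; `:erlang.trace/3` and `Process.info/2` are the standard calls, and this is only one of several ways to do it.

```elixir
# A throwaway process that only reacts to :stop; anything else stays in its mailbox.
pid =
  spawn(fn ->
    receive do
      :stop -> :ok
    end
  end)

# Ask the VM to report every message this one process receives.
# The calling process becomes the tracer.
:erlang.trace(pid, true, [:receive])

send(pid, :hello)

traced =
  receive do
    {:trace, ^pid, :receive, msg} -> msg
  after
    1_000 -> :no_trace_event
  end

IO.inspect(traced, label: "traced message")

# Inspect the live process without any cooperation from its code.
info = Process.info(pid, [:message_queue_len, :memory, :current_function])
IO.inspect(info, label: "process info")

send(pid, :stop)
```

Docker can tell you that a container's OS process is up and how much CPU and memory it uses, but it has no view of what happens inside that process.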

shanesveller · March 7, 2018, 8:52pm

When I use the unadorned word “Docker” below, I’m generally referring to one or all of the Docker daemon, the Docker CLI, or perhaps the Docker image format, and no other additional/adjacent components. If that is not the way OP is using the word Docker, I apologize now.

> with regards to distribution, replication, process management, monitoring, and so on.

Strictly speaking, Docker itself as an individual tool doesn’t address these aspects of operating a complex and/or distributed system unless you augment it with still more tools. Kubernetes, Nomad, Swarm, Mesos/DCOS, etc. cover a lot of these functionality gaps with varying approaches.

Distribution: Docker itself is not multi-host aware, and you cannot make it so without layering an orchestration tool on top, such as the aforementioned Kubernetes/Nomad/Swarm. The additional requirements for Swarm are included in most Docker OS packages, but that does not mean that its functionality comes for free or by default. There is no built-in way for container A on host 1 to directly communicate with container B on host 2 with no additional components involved. This concept is supported within the BEAM runtime.

Replication: In order to make sure that X copies of image ABC are running at all times, you need another component. Docker Compose can do some rudimentary scaling of a given container definition to more than one actual running container on a single host, but the Docker daemon itself doesn’t see them as having any direct relationship other than incidental similarities. Docker Compose is not multi-host aware without exporting an app bundle for use with Swarm, or translating to a similar manifest for Kubernetes using something like kompose. AFAIK, this is basically subject to DIY on the BEAM based on how you build your supervision trees, etc. but is not infeasible.
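As a rough illustration of the DIY approach on the BEAM, the sketch below keeps three copies of the same worker running under one supervisor on a single node. `ReplicaDemo.Worker` and the `{:replica, n}` ids are hypothetical names invented for this example, and an `Agent` stands in for a real worker.

```elixir
defmodule ReplicaDemo.Worker do
  # Hypothetical worker; an Agent holding its replica number stands in for real work.
  use Agent

  def start_link(n), do: Agent.start_link(fn -> n end)
end

replicas = 3

# Same module three times; each child spec needs a unique id.
children =
  for n <- 1..replicas do
    Supervisor.child_spec({ReplicaDemo.Worker, n}, id: {:replica, n})
  end

{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)
before_kill = Supervisor.count_children(sup)

# Kill one replica and give the supervisor a moment to replace it.
[{_id, victim, _type, _modules} | _] = Supervisor.which_children(sup)
Process.exit(victim, :kill)
Process.sleep(100)

after_kill = Supervisor.count_children(sup)
IO.inspect({before_kill.active, after_kill.active}, label: "active before/after")
```

This only covers one node. Spreading replicas across hosts is still something you design yourself or delegate to a library or an orchestrator.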

Process Management: This is the one possible exception that Docker itself might provide, depending on whether you meant management of process groups composed of containers or process groups within single containers. Docker added a minimal --init feature flag for docker run and derived commands, which helps manage child processes within a single container. I believe it uses the semantics of, or directly leverages the code of, tini. It definitely cannot compete with systemd, upstart, runit, supervisord etc. for features or for complexity. Moreover, I would say this functionality does not provide a useful degree of feature parity with process supervision trees in BEAM.

For multiple-container process groups, Docker Compose comes close with the YAML manifest plus a little bit of networking and DNS glue. However, all inter-container communication must go over a shared filesystem or network calls, even if they’re over localhost, crossing various process and memory boundaries. Docker Compose itself is responsible for the creation and “enforcement” of that multi-container environment through its various sub-commands, and the Docker daemon doesn’t have inherent knowledge of those over-arching concerns.

Monitoring: Other than the option to restart exited containers always or under failure conditions, I don’t consider Docker itself to have had meaningful monitoring or supervision behavior for quite a lot of its life as a project. The late introduction of the HEALTHCHECK directive for Dockerfiles did add some better capabilities of self-healing behavior compared to early releases. In Docker, you can’t natively express the idea that when one container exits abruptly, none/one/some/all of its logical siblings should also get restarted, but this is a first-class capability in BEAM.
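A minimal sketch of that sibling-restart idea, assuming two made-up children registered as `:cache` and `:db` (the module `SiblingDemo.Service` is hypothetical too). With `strategy: :one_for_all`, a crash in either child restarts both; `:one_for_one` restarts only the crashed child, and `:rest_for_one` restarts the crashed child plus those started after it.

```elixir
defmodule SiblingDemo.Service do
  # Hypothetical service; an Agent registered under a name stands in for real work.
  use Agent

  def start_link(name), do: Agent.start_link(fn -> name end, name: name)
end

children = [
  Supervisor.child_spec({SiblingDemo.Service, :cache}, id: :cache),
  Supervisor.child_spec({SiblingDemo.Service, :db}, id: :db)
]

# :one_for_all means: if any child dies, terminate and restart all of them.
{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_all)

cache_before = Process.whereis(:cache)
db_before = Process.whereis(:db)

# Simulate an abrupt crash of :db only.
Process.exit(db_before, :kill)
Process.sleep(100)

cache_after = Process.whereis(:cache)
db_after = Process.whereis(:db)

IO.puts("db restarted?    #{db_before != db_after}")
IO.puts("cache restarted? #{cache_before != cache_after}")
```

The "none" case corresponds to giving a child `restart: :temporary` in its child spec, so the supervisor never restarts it.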

> Docker seems to provide a lot of the kinds of concurrent programming superpowers that the BEAM VM does.

Anything that Docker seems to be able to do with regards to concurrency is probably incidental, and is mostly enabled by the operating system itself, not by Docker. The running containers are just represented as separate OS processes (or process trees) on the host and are subject to the same kernel CPU scheduling as any other program you might run.

I would encourage most newcomers to think of Docker as providing three main value propositions:

- an extremely portable, straightforward, unsophisticated packaging format for your deliverables
- a way to (imperfectly) isolate those deliverables from each other while sharing an underlying host
- a vehicle for abstracting away the differences between different runtime environments and languages

On that last point, Docker doesn’t provide a great abstraction by itself, but by way of example, intentional design of disparate Docker images to have similar “APIs” in the form of their ENTRYPOINT, CMD, etc. can present a unified front in a polyglot organization with applications authored in several different runtime languages.

To come at these points from another angle, operators of software packaged via Docker often only need to care about some limited logistical details:

- what image and tag to run
- what CPU/memory resources to allocate
- what simple arguments to supply
- what network ports to expose
- where to expose those ports to

The execution model of the program itself can be a black box. There’s no need for the operator to have much if any knowledge of the implementation language or other “internal” details.

I wouldn’t say any of these three value propositions have direct analogues in the BEAM, and some of them don’t seem to make sense for it anyway.

sasajuric · March 7, 2018, 10:28pm

My impression is that the ecosystem revolving around Docker is indeed used to get features similar to those we get on top of BEAM, such as fault-tolerance and scalability.

To address this, I’ll borrow the example mentioned in another topic:

Imagine that instead of WhicheverLang/Docker/Kubernetes/… you can just do everything in Elixir, running just one OS process per production node. That is vastly simpler than this cocktail of technologies. It will simplify all the phases of software production, such as development, testing, deployment, monitoring of the production system, debugging, and let’s not forget about onboarding of new developers, who have to learn just one technology to work on any part of the system. That’s a bunch of wins across the board, and for me it’s a big reason why I prefer BEAM-facing languages to anything else.

Granted, these other 3rd party technologies are usually more feature rich and advanced, and we often don’t have complete counterparts in the BEAM ecosystem. That said, in my experience, when my needs are simple, I can most often implement a proper solution directly in Elixir, with far less need to step outside of my main technology, and that ultimately leads to a more technically homogeneous solution, which in turn brings the benefits I mentioned above.

Thus, for me, having first-class concepts for fault-tolerance and scalability is a must in any runtime I’d consider to power my backend. These properties are needed in any kind of production server-side system, and I want my main technology to be able to help me get there. Without that, we end up improvising on top of inappropriate foundations, which is definitely possible, but will inevitably increase technical baggage and complexity.

mkunikow · March 7, 2018, 10:44pm

OK, but in the end you will need to host the application somewhere. How do you scale up and down without an orchestration framework such as a k8s engine, for example if you want to run Elixir on bare metal?

sasajuric · March 7, 2018, 11:05pm

Using k8s or anything else for autoscaling is perfectly fine. I’m not necessarily arguing against it, and I also quite like Docker, which we use at my company to bundle our system (and also for our CI tests).

Of course, not everyone needs autoscaling, and therefore not everyone needs k8s. For example, my first production system worked great for a whole year on a small instance. It grew organically, and once we got close to our max capacity, we scaled it manually.

My main point was that it’s my impression that the ecosystem around Docker is often used to get the same features you get natively on top of BEAM. I personally prefer using BEAM without all those other complexities for that, and I feel that for simple to moderately complex scenarios I can get a much simpler solution with just BEAM. For more complex cases, I might reach for 3rd party technologies, and I’m certainly not averse to using external products. But since not all scenarios in the system are so extremely complex, in general the technical solution built on top of BEAM is in my opinion much simpler compared to most other languages, especially the ones which don’t offer proper actor-like lightweight concurrency and first-class fault tolerance.

ibgib · March 8, 2018, 12:33pm

I have gone down the road of both Elixir and the Docker ecosystem, which includes Swarm (though I have not implemented Swarm yet, only Docker Compose with multiple services, using Docker Machine for deployment). Here are my thoughts on the dynamics:

Docker, and in particular Docker Swarm, is for macroscopic “processes”
- Has durable processes, like BEAM supervision trees, for restoring or adding nodes in a known state but relatively expensive to bring up.
- Think 10s, 100s, or maybe 1000s of processes.
- Has better isolation WRT inter-nodal security
  - “Harder” to escape a container than a connected BEAM process to influence other processes.
Elixir is for microscopic processes
- Extremely lightweight, super cheap concurrent processes, even lighter than threads.
- Global inter-process communication.
- Think 1000s, 10000s, or 1000000s+ of processes.
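To give a feel for the "microscopic" end, here is a small sketch that spawns 10,000 BEAM processes inside one OS process and exchanges a message with each of them. The count and the ping/pong message shapes are arbitrary choices for the example.

```elixir
parent = self()
n = 10_000

# Each process waits for a ping and replies with its own index.
pids =
  for i <- 1..n do
    spawn(fn ->
      receive do
        {:ping, from} -> send(from, {:pong, i})
      end
    end)
  end

IO.puts("processes alive in this VM: #{length(Process.list())}")

# Message passing looks the same whether there are ten processes or ten thousand.
Enum.each(pids, fn pid -> send(pid, {:ping, parent}) end)

sum =
  Enum.reduce(1..n, 0, fn _, acc ->
    receive do
      {:pong, i} -> acc + i
    end
  end)

IO.puts("sum of replies: #{sum}")
```

Starting the same number of containers on one host would be a very different exercise, which is the macroscopic versus microscopic distinction in the list above.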

dokuzbir · March 8, 2018, 12:55pm

How about performance?

mkunikow · March 10, 2018, 10:48pm

But I assume you could run a container with one BEAM process that would spin up 10000s+ of lightweight processes inside it? You could hit a limit on how many resources this container is allowed to use.
But Docker Swarm and K8s are just orchestration frameworks.

I think the main problem with running a language VM such as the JVM or BEAM inside a container is that the virtual machine may not know about the limits assigned to the container and may assume it has access to all of the machine’s resources, which is not true. For example

ibgib · March 11, 2018, 6:18pm

Exactly. Well beyond the 10000s up in the millions! (That is also how the current ibGib deployment works on a single server, though not used at that scale.)

Hmm, I have alarm bells go off in my head when I use “just” (also “always”). The orchestration tools function very similarly to Elixir supervision trees (and more). I think of the Swarm worker nodes as “processes” and the manager nodes as “supervisor processes”. Again, this is more on the macroscopic level.

What I’m saying is as yet unproven, though, since ibGib lives only on a single node. I’m going to be working on getting others in on the action wrt the distributed aspect, so expect news on that front in the near future.

BarelyFunctional · March 13, 2018, 2:43am

Is it possible for Nerves, or a modified version of it, to run on a VPS/bare metal server?

If so, that may provide the solution @zsck and others have been looking for.

Or would this community feel that this is coupling OS (personally I’d prefer it to run on FreeBSD) and the Erlang VM too much?

If developing on macOS, to my knowledge, Docker still needs to be run in a VM, so running an equivalent of Nerves, say for a Phoenix app, would still allow a similar workflow to Docker: the same dev and production environment.

Thoughts?