I've read several threads in the forum about Elixir and Docker, like "Elixir applications in Docker containers?", and I opened this thread to discuss a specific case that still seems unsolved to me: updating a (distributed?) stateful Elixir app running on Kubernetes. At least, I haven't seen a well-known good practice for handling this.
I think that the issue of connecting Elixir nodes, running in different containers, in a Kubernetes cluster is pretty much solved. @bitwalker's libcluster library, with the Cluster.Strategy.Kubernetes.DNS strategy, makes it really easy to cluster different Elixir containers.
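For reference, the clustering setup I mean looks roughly like this. It is only a sketch: the topology key `k8s`, the service name `myapp-headless` and the application name `myapp` are placeholders I made up, and as far as I understand the strategy expects a headless Kubernetes service and nodes named `myapp@<pod-ip>`.

```elixir
# config/runtime.exs (sketch; all names are hypothetical placeholders)
import Config

config :libcluster,
  topologies: [
    k8s: [
      strategy: Cluster.Strategy.Kubernetes.DNS,
      config: [
        # Headless service whose DNS record resolves to the pod IPs
        service: "myapp-headless",
        # Node basename, so nodes are expected to be named myapp@<pod-ip>
        application_name: "myapp",
        # How often (in ms) to poll DNS for new or removed pods
        polling_interval: 10_000
      ]
    ]
  ]
```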
But what about preserving the processes' state? We can't do hot code swapping with containers/Kubernetes, so we need a sort of blue/green deployment: kill the old containers (which hold the state) and spawn new containers with the new image.
@dazuma, in his recent talk "Docker and OTP Friends or Foes", suggests using horde (https://github.com/derekkraan/horde) and CRDTs to push the state to another live container while the old containers are terminating. I've tried this approach and it seems a bit too fragile to me, but maybe I'm doing something wrong. The graceful termination period in Kubernetes is limited, and I don't have any guarantee that the state has been replicated correctly to another healthy Elixir node/container before the old one is killed. @dazuma, do you have something public I can look at to easily replicate what you did in the video?
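To make the termination problem concrete, this is roughly the shape of what I tried, stripped of horde and the CRDT part. It is a minimal sketch, not what the talk shows: `HandoffSketch.Worker` and the `:handoff` option are hypothetical names, and the handoff function stands in for whatever actually ships the state to another node.

```elixir
defmodule HandoffSketch.Worker do
  use GenServer

  # opts must contain :handoff, a hypothetical 1-arity function that
  # ships the state to wherever it should survive (another node, a CRDT, ...).
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  def put(pid, key, value), do: GenServer.call(pid, {:put, key, value})

  @impl true
  def init(opts) do
    # Trap exits so terminate/2 runs when the supervisor shuts this process down.
    Process.flag(:trap_exit, true)
    {:ok, %{data: %{}, handoff: Keyword.fetch!(opts, :handoff)}}
  end

  @impl true
  def handle_call({:put, key, value}, _from, state) do
    {:reply, :ok, %{state | data: Map.put(state.data, key, value)}}
  end

  @impl true
  def terminate(_reason, state) do
    # Last chance to push the state out before the process dies.
    state.handoff.(state.data)
  end
end
```

The fragile part is that `terminate/2` has to finish inside two deadlines at once: the child's shutdown timeout in the supervisor and the pod's termination grace period. If either expires first, the process is killed and the handoff is simply cut short.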
@dazuma also mentions another way to tackle this: doing hot code swapping within the container, without updating the container image itself. Does gigalixir really do this? This goes against container best practices… BUT honestly… it could be much less complicated (and maybe more solid) than state replication during termination. Still, when we need to upgrade the containers to new images (for example, for a new Elixir version), we have to kill the containers, losing the state.
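For anyone who hasn't used hot upgrades: the state survives because the process keeps running, and the release handler calls `code_change/3` so the new code can migrate the old state. A minimal sketch, assuming a hypothetical counter whose previous version kept a bare integer as state:

```elixir
defmodule HotSwapSketch.Counter do
  use GenServer

  @impl true
  def init(count), do: {:ok, %{count: count, step: 1}}

  @impl true
  def handle_call(:increment, _from, state) do
    state = %{state | count: state.count + state.step}
    {:reply, state.count, state}
  end

  # Invoked during a release upgrade. Assumption: the old version stored
  # a bare integer, so it is wrapped into the new map shape here.
  @impl true
  def code_change(_old_vsn, old_state, _extra) when is_integer(old_state) do
    {:ok, %{count: old_state, step: 1}}
  end

  # State already in the new shape: keep it unchanged.
  def code_change(_old_vsn, state, _extra), do: {:ok, state}
end
```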
Any other pattern we could use? What about stashing the state in something like Redis and having the new containers recover it?
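The stash-and-recover idea would look something like this. It is only a sketch: `StashSketch.Store` is an in-memory Agent I'm using as a stand-in for Redis so the example runs on its own, and all module names are hypothetical. In a real setup the store would have to live outside the pods being replaced.

```elixir
defmodule StashSketch.Store do
  # In-memory stand-in for an external store such as Redis (assumption).
  use Agent

  def start_link(_opts \\ []), do: Agent.start_link(fn -> %{} end, name: __MODULE__)
  def put(key, value), do: Agent.update(__MODULE__, &Map.put(&1, key, value))
  def get(key), do: Agent.get(__MODULE__, &Map.get(&1, key))
end

defmodule StashSketch.Counter do
  use GenServer

  def start_link(id), do: GenServer.start_link(__MODULE__, id)
  def increment(pid), do: GenServer.call(pid, :increment)

  @impl true
  def init(id) do
    # Recover the stashed value if a previous instance left one behind.
    count = StashSketch.Store.get(id) || 0
    {:ok, %{id: id, count: count}}
  end

  @impl true
  def handle_call(:increment, _from, state) do
    state = %{state | count: state.count + 1}
    # Write through on every change instead of waiting for termination,
    # so nothing depends on the grace period.
    :ok = StashSketch.Store.put(state.id, state.count)
    {:reply, state.count, state}
  end
end
```

Writing on every change costs a round trip to the store per update, but it avoids relying on a clean shutdown: a new container that starts a counter with the same id picks up where the old one left off.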