Pogo is a distributed supervisor for clustered Elixir applications.
It uses battle-tested distributed named process groups (:pg) under the hood to maintain cluster-wide state and coordinate work between local supervisors running on different nodes.
Features of distributed supervisor:
- automatically chooses a node to locally supervise child process
- a child process running in the cluster can be started or stopped using any local supervisor
- ensures a child is started only once in the cluster (as long as its child spec is unique)
- redistributes children when cluster topology changes
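To make the list above a bit more concrete, here is a minimal usage sketch. It assumes `pogo` is already a dependency and that `Pogo.DynamicSupervisor` accepts `name` and `scope` options and exposes `start_child/2`, `terminate_child/2` and `which_children/2`, as I understand them from the README; please check the docs for your version. `MyApp.Worker`, `MyApp.DistributedSupervisor` and `:my_app` are made-up names for illustration.

```elixir
defmodule MyApp.Worker do
  # Hypothetical worker, used only for illustration.
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}
end

defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # One local supervisor per node; nodes sharing the same scope
      # are assumed to cooperate as a single distributed supervisor.
      {Pogo.DynamicSupervisor, [name: MyApp.DistributedSupervisor, scope: :my_app]}
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end

defmodule MyApp.Workers do
  # Thin wrappers; each can be called on any node in the cluster.

  def start(id) do
    # The child spec id must be unique cluster-wide for the
    # "started only once" guarantee to hold.
    spec = %{id: id, start: {MyApp.Worker, :start_link, [id]}}
    Pogo.DynamicSupervisor.start_child(MyApp.DistributedSupervisor, spec)
  end

  def stop(id) do
    Pogo.DynamicSupervisor.terminate_child(MyApp.DistributedSupervisor, id)
  end

  def list do
    # :global is assumed to list children across all nodes.
    Pogo.DynamicSupervisor.which_children(MyApp.DistributedSupervisor, :global)
  end
end
```

With this in place, `MyApp.Workers.start(:worker_1)` called on one node and `MyApp.Workers.stop(:worker_1)` called on another would refer to the same child, wherever it ended up running.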
In that respect it's similar to Horde or Swarm, but it doesn't provide a distributed registry (Horde and Swarm do). The internals are different, though: Horde uses δ-CRDTs and Swarm uses Interval Tree Clocks for synchronization. Pogo's local supervisors don't exchange messages to synchronize state. Instead they rely on Erlang's process groups: they observe group memberships and adjust their local state accordingly.
For anyone interested, Pogo's inner workings are described in an introductory blog post.
To provide some context, the library was developed at Telnyx as an alternative to Horde, as we couldn't overcome some problems that, TBH, could have been peculiar to our environment (a 20+ node cluster with quite dynamic membership). Not that it went smoothly, but since version 0.3 it's been quite stable.
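As a rough illustration of the building block mentioned here, the sketch below uses plain `:pg` (OTP 23+) on a single node. It is not Pogo's actual code: the scope name, the group name and the `PgSketch` module are made up, and the idea of encoding "this pid runs this child" as group membership is my assumption about the general approach.

```elixir
defmodule PgSketch do
  # Hypothetical scope and group names, chosen only for this sketch.
  @scope :pg_sketch
  @group {:running, :worker_1}

  def run do
    # In a real cluster every node would start the same scope.
    case :pg.start_link(@scope) do
      {:ok, _pid} -> :ok
      {:error, {:already_started, _pid}} -> :ok
    end

    # A stand-in for a supervised child process.
    worker =
      spawn(fn ->
        receive do
          :stop -> :ok
        end
      end)

    # Joining a group makes the membership visible on every
    # connected node that runs the same scope.
    :ok = :pg.join(@scope, @group, worker)
    joined = :pg.get_members(@scope, @group)

    # Leaving removes the entry again; any node can read the
    # current membership without messaging a peer supervisor.
    :ok = :pg.leave(@scope, @group, worker)
    left = :pg.get_members(@scope, @group)

    send(worker, :stop)
    {worker, joined, left}
  end
end
```

The point is only that group membership acts as the shared view of the cluster: a supervisor can read it with `:pg.get_members/2` and decide locally what to start or stop.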
Does it mean that in case of a network split there will be two instances of the same child_spec: one in the part of the network that is not connected to the leader, and one restarted because of the topology change? Or will the part of the cluster that is disconnected from the leader just kill all its local children?
In case of a network split there will be two instances of each child process, as there is no concept of a leader or majority in Pogo. Once connectivity is restored, the extraneous instances get terminated.
Could you please tell me the status of the lib? Is it still actively maintained?
Has it been thoroughly tested in production apps?
I'm looking for distributed libs to run GenServers globally across cluster nodes.
I was going to use Horde, but unfortunately the maintainer does not fix issues anymore. (I don't blame him for it.)
Swarm seems to have been abandoned.
Just follow your gut and pick whichever library feels right to you. Whatever you choose, I think you will have to at some point read the source code to understand what is happening.
I have used :pogo and :horde a lot. :pogo is easy to understand. The source code was only a couple of files when I was using it. :horde was difficult for me to understand, and in a production project we kept seeing it just "losing" visibility of nodes randomly. That was probably our own fault, out of ignorance. However, I have found :process_hub to be reliable and easier to understand. The source code is clear enough that I am able to feel "in control" because I understand, in broad strokes, what is happening.
You did not ask, but here is info on how I run distributed nodes locally with Docker: "Using :dns_cluster with docker-compose locally (it can be done)". When I wrote this I was using :pogo because I was trying to replace :horde. The same approach works for :process_hub.
This is just my subjective opinion based on my experience. For me, the most important factor when choosing a lib is how responsive and active the authors of the library are. I don't care if there are 1000 GitHub issues so long as the maintainers are responding to people.