We currently have a monolith where I work, on Kubernetes. We are using libcluster to get the nodes to mesh together, and up until now, there has only been one type of node that has every possible dependency and library. It isn’t really sustainable, so I’d like to break up some functionality.
The way libcluster works is it gets a list of IPs where Erlang nodes could be, and then connects them together. Since there could be multiple Erlang VMs on one machine, it uses a basename@hostname combo for its long names, where the hostname determines which machine it is on, and the basename determines which VM on that machine you want.
In Kubernetes every container is just going to have one node on it, so libcluster made a design decision that every Erlang node should have the same basename. That’s fine, but then how do I tell the difference between node types in my cluster?
For example, if I have some api nodes, and I have some analytics nodes, and one of the api nodes wants to contact a random analytics node to crunch some data and wait for the results, how do I determine which nodes are “analytics” nodes, when all of the nodes names look like :foo@whatever ?
Up until now I would have just named them api@whatever and analytics@whatever, done a Node.list, and filtered for the analytics nodes, but I don’t see a lot of people doing that so it makes me wonder if there is a better way.
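If you did go the naming route, the filtering step could look something like this (a sketch; the `NodeLookup` module and the `"analytics"` basename are made up for illustration, and the naming convention is something you’d enforce yourself, not something libcluster gives you):

```elixir
defmodule NodeLookup do
  @moduledoc """
  Finds connected nodes by basename, assuming nodes are named
  like :"analytics@host".
  """

  @doc "Returns all connected nodes whose basename matches `type`."
  def nodes_of_type(type) when is_binary(type) do
    Enum.filter(Node.list(), fn node ->
      node
      |> Atom.to_string()
      |> String.starts_with?(type <> "@")
    end)
  end
end
```

Then `NodeLookup.nodes_of_type("analytics")` gives you the candidates to pick a random node from.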
I think a lot of k8s people use a service registry like etcd.
I would be interested in another way, however, as I’m choosing not to go down the k8s route (for now … using Google’s instance groups), but I’m at a place where I need to deregister a node when it receives a SIGTERM. Perhaps Swarm or Syn.
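For the SIGTERM case, one pattern (sketched here against syn 2.x’s `:syn.join/2` and `:syn.leave/2`; the `NodeRegistrar` module name is hypothetical) is a small process that joins the group on boot and leaves it in `terminate/2`. On OTP 20+ the BEAM turns SIGTERM into an ordinary `init:stop()` shutdown, so supervisors run terminate callbacks on the way down:

```elixir
defmodule NodeRegistrar do
  use GenServer

  def start_link(group), do: GenServer.start_link(__MODULE__, group, name: __MODULE__)

  @impl true
  def init(group) do
    # Trap exits so terminate/2 runs during a supervised shutdown.
    Process.flag(:trap_exit, true)
    :ok = :syn.join(group, self())
    {:ok, group}
  end

  @impl true
  def terminate(_reason, group) do
    # Proactively leave the group instead of waiting for peers to
    # notice the node going down.
    :syn.leave(group, self())
    :ok
  end
end
```

syn will also reap dead pids on its own, so this mostly buys you a faster, cleaner deregistration during rolling restarts.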
Is there a clear requirement informed by some other aspect of the design that calls for all of the heterogeneous BEAM nodes to be meshed together in one large pool? Can they be meshed with just their direct peers instead, or in many cases not need to mesh at all?
Most of the time we’d use Pod labels and a ClusterIP or headless Service that selects over those labels to do sibling discovery, which is a model libcluster directly supports. In many environments you don’t have any direct connectivity to the etcd that powers the cluster, for security and safety reasons, and some don’t actually use etcd under the hood. IIRC Amazon EKS is actually persisting to DynamoDB or something to that effect.
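For reference, the headless-Service model maps onto libcluster’s `Cluster.Strategy.Kubernetes.DNS` strategy; a minimal topology looks roughly like this (the service and application names are placeholders for your own):

```elixir
# config/runtime.exs
import Config

config :libcluster,
  topologies: [
    analytics: [
      strategy: Cluster.Strategy.Kubernetes.DNS,
      config: [
        # Name of the headless Service selecting the analytics Pods.
        service: "analytics-headless",
        # Basename shared by every node, per the discussion above.
        application_name: "myapp"
      ]
    ]
  ]
```

Each node type can point at its own Service, which gets you the “mesh only with direct peers” shape without touching etcd at all.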
Honestly they could be meshed only with direct peers. Most of these will only ever interact with one other node type, although it is likely that the api nodes will have to interact with the majority at one point or another.
My purpose in this is mostly to avoid having to write rest api glue between services that could just directly make rpc calls to each other.
I’ve tried to use swarm in the past, but it did not give me usable results. I actually just tried out syn. I decided to use its process groups to register the top-level supervisor of each app into the groups :api and :analytics, so when they start up, their top-level pids are registered. Then I can call
:syn.get_members(:api) |> Enum.map(&node/1) to get a list of nodes in the system of that type.
I think that works, and I can’t see any reason why it wouldn’t. Does that seem like a good solution?
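For anyone else reading, the registration side of that could look like this (a sketch against syn 2.x’s `:syn.join/2` and `:syn.get_members/1`; the module name is made up):

```elixir
defmodule Api.Supervisor do
  use Supervisor

  def start_link(arg), do: Supervisor.start_link(__MODULE__, arg, name: __MODULE__)

  @impl true
  def init(_arg) do
    # Register this node's top-level supervisor in the :api group.
    :ok = :syn.join(:api, self())
    Supervisor.init([], strategy: :one_for_one)
  end
end

# Anywhere in the cluster, list the nodes currently running an api app:
#   :syn.get_members(:api) |> Enum.map(&node/1) |> Enum.uniq()
```

The `Enum.uniq/1` only matters if more than one pid per node ever joins the group; with one top-level supervisor per node it’s a no-op.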
Are you certain about the libcluster node naming? I’m using libcluster on IoT devices and the node name follows the sname/name given at node startup. Not sure how/if there’s an easy way to pass an sname/name to the BEAM on k8s from some metadata, though I think when Node.start is called it can be passed a name? It’s not too hard to create custom strategy modules for libcluster, so you could try that too if you don’t want to use swarm/syn or something else.
Definitely seems reasonable to me!
I’m assuming your IoT devices are using a strategy like gossip or something, which causes each node to send its node name to others periodically, allowing them to connect back.
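(For anyone following along, the Gossip strategy is configured roughly like this; the port, multicast address, and secret below are illustrative values, not from this thread:)

```elixir
import Config

config :libcluster,
  topologies: [
    local_gossip: [
      strategy: Cluster.Strategy.Gossip,
      config: [
        # UDP port and multicast group the heartbeats go out on.
        port: 45_892,
        if_addr: "0.0.0.0",
        multicast_addr: "230.1.1.251",
        # Shared secret so only your own nodes accept the heartbeats.
        secret: "change-me"
      ]
    ]
  ]
```

Each node multicasts its own full node name in the heartbeat, which is why the name chosen at startup is all that matters with this strategy.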
I think I might go with the syn option. Now if only there were a way to make it log less verbosely.
Yes, it’s gossip. That’s useful to know, as I didn’t know the different libcluster modules don’t send the node name.