Mesh - Capability-based routing for Processes on the BEAM

Can you articulate your intended usage scenario with a concrete example? Do you want “node affinity” based on performance concerns (e.g., a hot cache), on physical attributes of the nodes (e.g., big-memory vs. small-memory machines), or on security/regulation concerns (e.g., to make private enclaves within a public cloud)?


Oh man, that was rather sad.
I appreciate all constructive feedback, and as I said in the post, the idea was to share it very early in order to get feedback, without any pretense of it being perfect or academic.
I think you made a number of assumptions, both about the library itself and about me, whom you don’t know. I have 25 years of experience, many of them spent working with distributed systems, and I have read many books and implemented many systems in production. I also contribute to many libraries, several of them related to distributed systems, not only in the Elixir ecosystem but in others as well, and I always try to be polite to everyone I interact with.
You mentioned Horde; I am one of the contributors/maintainers of that project, so I know it very well. Anyway, thank you for the feedback; I appreciate it very much and will take it into account whenever possible.

Imagine you have a cluster of applications that aren’t homogeneous, that is, they are different applications (games, chat applications, and others), but for whatever reason you want them to be part of the same Erlang cluster. You want to logically group these applications within that cluster and be able to invoke processes by a label (as mentioned before, not by a PID). You expect a process to be started, and possibly monitored, when such a request arrives, and you expect subsequent requests to go to the same node as long as there is a live process on that node. That’s basically it. We don’t want strong consistency or replication of user process state, nothing like that; it’s eventually consistent, and it’s just service discovery. You can implement your own selection strategy if you want.
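To make that concrete, here is a minimal sketch of the idea, using invented names rather than Mesh’s actual API: a label-based router that starts a worker on first use, monitors it, and keeps routing calls for that label to the node that already hosts a live worker, with a pluggable node-selection strategy.

```elixir
# Hypothetical sketch only (not Mesh's API): route calls by label, start a
# worker on first use, and keep routing to the same node while it is alive.
defmodule LabelRouter do
  use GenServer

  defmodule Worker do
    use GenServer

    @impl true
    def init(label), do: {:ok, label}

    @impl true
    def handle_call(msg, _from, label),
      do: {:reply, {:handled, label, msg, node()}, label}
  end

  def start_link(opts \\ []),
    do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  # Route a call by label. `select` is the pluggable selection strategy used
  # to pick a node when no worker for the label exists yet.
  def call(label, msg, select \\ &hd/1) do
    pid = GenServer.call(__MODULE__, {:resolve, label, select})
    GenServer.call(pid, msg)
  end

  @impl true
  def init(_opts), do: {:ok, %{labels: %{}, monitors: %{}}}

  @impl true
  def handle_call({:resolve, label, select}, _from, state) do
    case Map.fetch(state.labels, label) do
      {:ok, pid} ->
        # A live worker already exists: keep routing to the same node.
        {:reply, pid, state}

      :error ->
        # No worker yet: pick a node, start one there, and monitor it so the
        # mapping is dropped (eventually) when the worker dies.
        node = select.([Node.self() | Node.list()])
        {:ok, pid} = :rpc.call(node, GenServer, :start, [Worker, label, []])
        ref = Process.monitor(pid)

        state = %{
          state
          | labels: Map.put(state.labels, label, pid),
            monitors: Map.put(state.monitors, ref, label)
        }

        {:reply, pid, state}
    end
  end

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, _reason}, state) do
    {label, monitors} = Map.pop(state.monitors, ref)
    {:noreply, %{state | labels: Map.delete(state.labels, label), monitors: monitors}}
  end
end
```

A call would then look like `LabelRouter.call({:chat_room, "lobby"}, {:post, "hi"})`, and swapping the `select` function is where a region- or capability-aware placement strategy would plug in.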

Of course I can imagine all that; I even gave you three examples so you could tell me which one is closest to your application, and you did not answer my question. My point is that the need for affinity could be mandatory, advisory, or anything in between. I do not think your library can cover all of those bases, at least not yet, right?

When I “shop” for a library, I look for 3 things, in this order:

  1. The concrete problem the creator intended to solve
  2. The commitment from the creator to solve it, even without any external contributions
  3. The vision of the creator to expand the application field, with external contributions

You have given 3, but not 1 and 2.

In a sidecar-based system, clients written in other programming languages register with a hostname. These containers have routines that contain entities, and those entities have names. They all share the same cluster, and the Elixir sidecar executes the calls between the various containers via Erlang distribution. For this to work, the sidecar in Elixir cannot choose a random node in the cluster, because there is an affinity between the sidecar and its host application, and a call would fail if it went to a sidecar that has no affinity with the destination sidecar. So we need:
1 - To know which hosts have which entities
2 - When a client invokes an entity, to route the call to exactly the sidecar whose host has the destination entity.

That’s exactly what the library was created for, and it meets these requirements; this is how it is used in practice. That said, I can see the library being useful for other use cases as well. For example, I could have applications separated by region and route a call based on the client’s region, or I could have specific workloads for specific tasks and route based on that. There are many possible use cases.
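To illustrate the two requirements above (again with invented names, not the real project’s API): the routing state boils down to “which sidecar node fronts which entities”, and invoking an entity means calling exactly that node over Erlang distribution.

```elixir
# Hypothetical illustration: map each entity name to the node of the sidecar
# whose host application owns it, and always route invocations to that node.
defmodule EntityDirectory do
  # Build a routing table from registrations (sidecar node => entity names).
  # In a real system this state would be propagated with eventual consistency.
  def routing_table(registrations) do
    for {sidecar_node, entities} <- registrations,
        entity <- entities,
        into: %{},
        do: {entity, sidecar_node}
  end

  # Invoke an entity by calling a (hypothetical) locally registered process
  # on the owning sidecar's node via distributed Erlang.
  def invoke(table, entity, request) do
    case Map.fetch(table, entity) do
      {:ok, sidecar_node} ->
        GenServer.call({:sidecar_proxy, sidecar_node}, {:invoke, entity, request})

      :error ->
        {:error, {:unknown_entity, entity}}
    end
  end
end
```

With registrations like `%{:"sidecar@host-a" => ["actor-42"], :"sidecar@host-b" => ["chat-7"]}`, a call to `"actor-42"` always lands on `:"sidecar@host-a"`, never on a randomly chosen node.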

Sorry, I am not familiar with the term “sidecar”. Do you mean a background job? A serverless function? A system optimized to run hour-long simulation jobs needs a very different architecture from a system that executes serverless functions finishing in a few milliseconds.

What you described feels like a directory service, which may not even need to be distributed in nature. LDAP has been around for decades; when properly configured, it is pretty fast too.

I think it’s normal not to be familiar with every term in the world of technology. I didn’t mean a background job, and it certainly has nothing to do with LDAP.

A sidecar is a container that runs alongside another container within a pod. This sidecar container can provide any kind of extra functionality that the main container depends on but, for some reason, does not implement itself.

Here’s a random example from the internet of what a sidecar is:

Here’s something closer to what I described as a use case: https://www.youtube.com/watch?v=b86DIo0_UoU

And here’s the project that will use Mesh after we stabilize the Mesh API:


tbh y’all need to chill, this thread is kinda unhinged

It seems to me like you know what you’re doing and wrote a library that makes sense for your case, but when describing it in the readme you (understandably) tried to put it in very general terms. But because it’s very general, everyone is just going to project their own experiences onto it instead.

In a similar situation I found that explicit examples help. The sidecar thing makes sense to me; consider adding it to the readme.
