How to do 'Sticky Processes'?

During his keynote at ElixirConf.EU, @chrismccord briefly touched on the different ways to set up systems that need to be scalable while still tracking data accurately.
Now, there are multiple ways to approach this problem, each with its own set of trade-offs, but one possibility is an approach similar to ‘sticky sessions’:

When a user connects to a (web) application and interacts with it, the application can make sure the user can only do what they are allowed to (like making no more than a certain number of requests per hour, or spending no more money than they have in their account). But when we try to scale this up to multiple nodes, this suddenly breaks down: a user might now connect to multiple nodes at the same time and perform requests faster than these nodes can synchronize.

So one possibility is to set up what is known as a ‘sticky session’: whenever a user signs in (or possibly even earlier), we decide which node this user’s traffic will always be routed to. This is of course built in such a way that the users interacting with the application are spread somewhat evenly across the nodes.

This is a really nice idea, but it is usually implemented in a load-balancer layer in front of the application (whether written in Elixir or anything else). There is a nice article that goes into detail about setting up such a system using HAProxy as the load balancer.

However, there are two drawbacks to this approach:

  1. You introduce an external dependency on the load balancer that might not be required. (Do note that when your application grows large enough, you might well want a load-balancing layer in front of your cluster of outward-facing nodes anyway, to make sure traffic is properly re-routed when nodes fail, since DNS doesn’t do that for you.)
  2. You’re only able to balance requests based on sessions, but maybe your application logic keeps its distributable state at a different logical layer. An example: In a system where users play together in a game (say: poker), it would make sense to distribute these games, rather than the individual users.

So that’s why I’m currently looking for a way to do this kind of thing properly and reliably right inside the BEAM. As for nomenclature: since we might distribute different (stateful) things than just sessions, I think ‘Sticky Process’ might be a good name for this.


So what I’ve thought about so far:

  1. We now have wonderful tools like Swarm to specify what node we want to run what type of process on, and how to redistribute it (potentially keeping its state!) when its node goes down. I think we can build a ‘which one of this group of processes do I want to call, based on whatever you pass in’ wrapper on top of this, although I am not entirely sure yet how it would look/work; see the sketch after this list. The other main thing I am unsure about here is how the supervision would look/work (because we want a ‘pool’ of processes where the members of this pool all live on different nodes).
  2. When a node goes down, we could choose to redistribute its sticky processes’ state over the other nodes’ sticky processes, but it is probably a lot less complex to just move each of that node’s sticky processes as a whole to a different node until it comes back. And if we want to make sure the workload stays evenly distributed, we might run multiple sticky processes of the same type on each node: say we have nodes A, B, C and D, each with three sticky processes; when A goes down, A1 moves to B, A2 to C and A3 to D.
  3. Obviously we want to use Swarm’s quorum option (i.e. ‘majority wins’) so we have strong consistency; if eventual consistency is good enough for your application, then e.g. CRDTs might be a better option than something like sticky processes.
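
As a first stab at point 1, here is a minimal sketch of such a process using Swarm’s documented handoff messages. `MyApp.StickyWorker` and its state shape are made up, and in a real application you would typically have Swarm start the process through a DynamicSupervisor rather than a bare `start_link`:

```elixir
defmodule MyApp.StickyWorker do
  use GenServer

  # Ask Swarm to start this worker somewhere in the cluster and track it.
  # (In a real app the MFA would usually point at a DynamicSupervisor.)
  def start_sticky(name) do
    Swarm.register_name({__MODULE__, name}, __MODULE__, :start_link, [name])
  end

  def start_link(name), do: GenServer.start_link(__MODULE__, name)

  @impl true
  def init(name), do: {:ok, %{name: name, data: %{}}}

  # Swarm calls this before moving the process off a dying or rebalanced
  # node; replying {:resume, state} hands the state to the new incarnation.
  @impl true
  def handle_call({:swarm, :begin_handoff}, _from, state) do
    {:reply, {:resume, state}, state}
  end

  # The freshly started process on the new node receives the old state here.
  @impl true
  def handle_cast({:swarm, :end_handoff, old_state}, _state) do
    {:noreply, old_state}
  end

  # After a netsplit heals, two copies may exist; merge or pick one.
  @impl true
  def handle_cast({:swarm, :resolve_conflict, _other_state}, state) do
    {:noreply, state}
  end

  # Swarm asks the losing duplicate to shut itself down.
  @impl true
  def handle_info({:swarm, :die}, state) do
    {:stop, :shutdown, state}
  end
end
```

For point 3, Swarm’s README documents a `Swarm.Distribution.StaticQuorumRing` strategy (configured with `static_quorum_size`) that refuses registrations in a minority partition, which, as far as I understand, is the ‘majority wins’ behaviour meant above.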

Now, I’m also wondering whether there are edge cases that have not been considered yet, and I would also love to hear if there are people already doing something like this in their applications. :)

~Wiebe-Marten/Qqwy

3 Likes

This might be quite difficult with sticky sessions, since some of your “hard-core” users might end up on a single node, whereas other nodes might stay idle. Like with Justin Bieber and Twitter: imagine you are Twitter and several celebrities end up on a single node, how would you spread that kind of load?

EDIT: However, now that I’ve thought about it more, I was probably confusing your approach with consistent hashing. With your approach you would still be able to redistribute the load after an inefficient assignment of a process to a node, whereas with consistent hashing it’s “written in stone”.
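
To make the distinction concrete, here is a tiny made-up sketch of hash-based placement. It is the simplified modulo variant rather than a real hash ring, but it shows the point: the owning node is computed from the key rather than stored anywhere, so a hot key cannot be reassigned without changing the scheme for everyone.

```elixir
defmodule HashPlacement do
  # Every caller derives the same node from the key and the node list;
  # there is no table to update, so an unlucky placement is "written in
  # stone" until the node list itself changes.
  def node_for(key, nodes) when is_list(nodes) and nodes != [] do
    nodes
    |> Enum.sort()
    |> Enum.at(:erlang.phash2(key, length(nodes)))
  end
end

# iex> HashPlacement.node_for("justin_bieber", [:"a@host1", :"b@host2", :"c@host3"])
```

With a registry, by contrast, the assignment is data that can be rewritten, which is what allows the redistribution.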

An example: In a system where users play together in a game (say: poker), it would make sense to distribute these games, rather than the individual users.

I wonder how Discord servers work. Probably the same way, judging by their libraries and blog posts.

The other main thing I am unsure about here is how the supervision would look/work (because we want a ‘pool’ of processes where the members of this pool all live on different nodes).

I think this would be too difficult to be useful. We can’t really know (there are ways, but they are too difficult and probably require specialized hardware) whether the process is timing out or the packet was dropped. Erlang-style supervision should be constrained to a single node, I think, since it seems to have been designed with that in mind.


Have you looked at erleans (GitHub: erleans/erleans, “Erlang Orleans”)? It might have some of the properties you want.

2 Likes

One way to do this is at the DNS level. When the initial request comes in to www.example.com, we assign it to a pool of servers, www1.example.com, www2.example.com, … and redirect.
Subsequent requests from the same user will go to the same subdomain, which keeps the traffic sticky to one server. Another option is customer-specific subdomains, e.g. foo.example.com, which also handles the case of some customers having more traffic than others. We can grow and shrink the pool of physical machines by updating the DNS, e.g. we start with 100 “shards” and assign them to 20 physical machines.
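
A minimal sketch of that shard arithmetic (all names and counts are made up): the user-to-shard mapping is fixed, and only the shard-to-machine mapping, i.e. the DNS records, changes when the pool grows or shrinks.

```elixir
defmodule Shards do
  # 100 stable shards, served by however many machines DNS currently lists.
  @shard_count 100

  # A user always hashes to the same shard number...
  def shard_for(user_id), do: :erlang.phash2(user_id, @shard_count)

  # ...and is always redirected to the same subdomain; which physical
  # machine e.g. www42.example.com points at is decided purely in DNS.
  def host_for(user_id), do: "www#{shard_for(user_id)}.example.com"
end
```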

Another option is the “distributed load balancer” approach, i.e. every front-end machine can take traffic from any user, but we proxy back to the user’s home machine. We can set a cookie and use it to direct traffic on subsequent requests.
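
As an illustration, here is a hypothetical Plug that does the cookie-based pinning with a redirect instead of a reverse proxy (the host pool, the cookie name, and the random first assignment are all assumptions):

```elixir
defmodule StickyHome do
  import Plug.Conn

  # Made-up pool of outward-facing hosts.
  @pool ["www1.example.com", "www2.example.com", "www3.example.com"]

  def init(opts), do: opts

  def call(conn, _opts) do
    conn = fetch_cookies(conn)
    home = conn.cookies["home_host"]

    cond do
      home == nil ->
        # First visit: pick a home host and remember it in a cookie.
        home = Enum.random(@pool)
        conn |> put_resp_cookie("home_host", home) |> redirect_to(home)

      home != conn.host ->
        # Landed on the wrong node: bounce the client to its home host.
        redirect_to(conn, home)

      true ->
        # Already on the right node: continue the pipeline.
        conn
    end
  end

  defp redirect_to(conn, host) do
    conn
    |> put_resp_header("location", "https://#{host}#{conn.request_path}")
    |> send_resp(302, "")
    |> halt()
  end
end
```

A real setup would more likely proxy the request onward so the pinning stays invisible to the client, but the cookie logic is the same.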

None of this is Elixir specific, but there are interesting tools to share info in the cluster.

2 Likes

Indeed. Instead of using consistent hashing, which process lives on which node can simply be looked up in the local process registry, which means that processes can be moved around.
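
For example, with Swarm the lookup is just a registry read (the `{:game, game_id}` name is a made-up convention):

```elixir
# Swarm replicates its registry to every node, so this lookup is a local
# read; if the process is later moved, the registry entry moves with it.
case Swarm.whereis_name({:game, game_id}) do
  :undefined -> {:error, :not_started}
  pid -> GenServer.call(pid, :status)
end
```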

Yes, this seems to be the case, sort of. I read a very good blog article about the problems they encountered while attempting to scale to serving many users. Indeed, a Discord server is actually a single GenServer, but when tens of thousands of people connect to the same one, their socket processes might actually be managed by different nodes. To limit inter-node chatter they introduced the Manifold library.
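
The batching is visible right in Manifold’s API; assuming `socket_pids` holds the (possibly remote) socket processes, one call replaces one send per pid:

```elixir
# Manifold partitions the pids by the node they live on and ships a single
# batched message to each node, which then fans it out to its local pids.
Manifold.send(socket_pids, {:new_message, payload})
```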

Actually, the ‘Sticky Process’ idea is not really new; it is what everyone has to do when building a stateful system (a game, a chatroom, etc.): in these cases, we have one single GenServer per ‘unit of concurrency’. What makes it more difficult, though, is what happens when you move to a multi-node system (whether for scalability or for fault-tolerance):

  • How do we make sure the sticky process (preferably including its current state) lives on when the node it was hosted on dies?
  • How do we redistribute work properly when a node dies? And what happens when the node comes back?
  • It is quite possible that we might want to persist-and-terminate a sticky process when it hasn’t been used for a while. (And how do we bring it back up later? See the sketch after this list.)
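
For the last point, here is a minimal sketch of the persist-and-terminate pattern using OTP’s built-in idle timeout; `MyApp.Storage` and the ten-minute window are placeholders:

```elixir
defmodule MyApp.GameServer do
  use GenServer

  @idle_after :timer.minutes(10)

  def start_link(game_id) do
    GenServer.start_link(__MODULE__, game_id,
      name: {:via, :swarm, {:game, game_id}}
    )
  end

  @impl true
  def init(game_id) do
    # Bring a previously persisted game back up, or start a fresh one.
    state = MyApp.Storage.load(game_id) || %{id: game_id, players: []}
    {:ok, state, @idle_after}
  end

  @impl true
  def handle_call(:players, _from, state) do
    # Returning the timeout again resets the idle clock on every call.
    {:reply, state.players, state, @idle_after}
  end

  @impl true
  def handle_info(:timeout, state) do
    # Idle for too long: persist and stop; the next start goes through
    # init/1 again and restores the saved state.
    MyApp.Storage.save(state.id, state)
    {:stop, :normal, state}
  end
end
```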

Erleans definitely looks like it is actually attempting to handle this (or at least the last two points), although its documentation is pretty lackluster at the moment, so I am not sure how I would get started with using it.

2 Likes

Erleans definitely looks like it is actually attempting to handle this (or at least the last two points), although its documentation is pretty lackluster at the moment, so I am not sure how I would get started with using it.

Maybe the Microsoft Orleans documentation would provide more information. It would at least explain their vocabulary: grains, silos, etc. Erleans tries to adapt Orleans to Erlang/OTP, as far as I understand.

2 Likes