Hey everyone,
So I’ve been thinking about this problem for a while now, and I can’t work out the best way to solve it.
I basically have resources identified by IDs (Snowflakes), each of which will run in its own GenServer. There are multiple types of resources (users and groups), which live in different Elixir codebases but will still be connected within one Erlang cluster.
Each group will run in its own GenServer, but dynamically: a group GenServer will only run while users in that specific group are online. For example, when a user GenServer starts, it’ll grab the groups that user is in from a DB, then query some registry for the PID of each group’s GenServer and call each one to connect to it (the group GenServer will just store a list of the connected PIDs). If the registry finds that a group isn’t already online, it’ll need to start that GenServer and then send the PID back as a reply to the service that called the registry. Remember that this all needs to work in a distributed Erlang cluster.
Then, when an event happens within that group (e.g. a message is sent), the group GenServer will fan the event out to the user GenServers, which will send it down a websocket to the users.
Now, here’s where my problem comes into play. Messages can be sent from external services written in different languages. For example, the API, which is written in JS on Node, will need to tell the Elixir group GenServer for a given ID that a message was sent to that group, probably via RabbitMQ somehow. The thing is, the API won’t be able to find the PID of that group because it has no logical mapping to it.
I know there are hash rings, which are stateless, but I’m not sure how to write a hash function correctly. And even then, wouldn’t that mean that some nodes could end up running more groups than others without knowing, since only the ID would be hashed?
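To make the hash ring idea concrete, here is a rough sketch of how I understand it. It's in Python purely for brevity, and every name in it (`stable_hash`, `HashRing`, `node_for`, the node names) is made up for illustration, not taken from any library. It assumes a ready-made hash (SHA-256) rather than a hand-written one, and it places each node on the ring many times ("virtual nodes"), which as far as I can tell is the usual way to keep the split between nodes reasonably even.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Hash a string to a 64-bit integer that is identical in every process."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hash ring; each node is placed on the ring `vnodes` times."""

    def __init__(self, nodes, vnodes=128):
        if not nodes:
            raise ValueError("a hash ring needs at least one node")
        # Sorted list of (point on the ring, node name) pairs.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, resource_id) -> str:
        """Return the node owning the first ring point after the ID's hash."""
        point = stable_hash(str(resource_id))
        # Wrap around to the start of the ring when we run off the end.
        index = bisect.bisect_right(self._points, point) % len(self._ring)
        return self._ring[index][1]


# Hypothetical node names; any caller with the same node list gets the same answer.
ring = HashRing(["groups@host-a", "groups@host-b", "groups@host-c"])
owner = ring.node_for(175928847299117063)
```

If this is right, anything that knows the node list (including the Node API) could work out which node owns an ID without a lookup table, but it would only identify the node, not the PID, and the ring would have to be rebuilt whenever nodes join or leave.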
At the moment, my only solution to this is to create some sort of dictionary service in Elixir which stores an ETS table mapping every single ID to the node and PID linked to that ID. It can then listen for RabbitMQ messages from the API addressed to a specific resource ID and forward each one by looking up the PID/node in the ETS table and sending it a message. The user service can also use this dictionary service when initializing: it can look up group PIDs by querying the service, and if a group GenServer isn’t started, the dictionary service can start it and respond with the corresponding PID.
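Here is a minimal sketch of the behaviour I have in mind for the dictionary service, again in Python just to pin down the logic. A plain dict stands in for the ETS table, and `Directory`, `start_fn` and `send_fn` are hypothetical names for illustration only. The real thing would be an Elixir process and would also need to remove entries when a group GenServer stops.

```python
import threading


class Directory:
    """Maps resource ID -> (node, handle); a stand-in for the ETS table."""

    def __init__(self, start_fn):
        # start_fn(resource_id) -> (node, handle) is a hypothetical callback
        # that starts the process for an ID somewhere in the cluster.
        self._start_fn = start_fn
        self._table = {}
        self._lock = threading.Lock()

    def lookup(self, resource_id):
        """Return the (node, handle) entry for an ID, or None if not running."""
        with self._lock:
            return self._table.get(resource_id)

    def lookup_or_start(self, resource_id):
        """Return the entry for an ID, starting its process first if needed."""
        with self._lock:
            entry = self._table.get(resource_id)
            if entry is None:
                entry = self._start_fn(resource_id)
                self._table[resource_id] = entry
            return entry

    def remove(self, resource_id):
        """Forget an ID, e.g. once its process has stopped."""
        with self._lock:
            self._table.pop(resource_id, None)

    def route(self, resource_id, message, send_fn):
        """Forward a message to a running resource; False if it is offline."""
        entry = self.lookup(resource_id)
        if entry is None:
            return False
        send_fn(entry, message)
        return True
```

Holding the lock while starting a process means two callers asking for the same offline group can't both start it, which is roughly what I'd expect from a single GenServer handling calls one at a time. It also makes this one service a bottleneck and a single point of failure, which is part of why I'm asking.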
Sorry if this is super long, it’s really hard to explain properly what I’m trying to do. Please let me know if you have any questions, as some things might be unclear. I’d love to know what you think the best solution would be, and if I’m thinking about hash rings correctly.
Thanks for the help in advance!