Does running multiple nodes on the same machine make sense?

sezaru · September 14, 2019, 11:39pm

Hello,

Does it make sense to create a BEAM multi-node architecture if I plan to run it on one machine only?

To elaborate a bit more… In my system I have a big, complex supervision tree with lots of processes, and I need to recreate this entire tree for each external source I support.

This means that if I call my sup tree SupA, and I support sources Source1, Source2 and Source3, I will have to duplicate SupA 3 times, one for each source.

I already have that implemented: I pass the source name when creating the tree, so each tree (and every process and supervisor inside SupA) gets a unique name.

So, for example, for Source1 I have something like SupA.Source1.

Giving every process and supervisor a unique name, and keeping track of those names so I call the right one, is kind of a pain (I do use Registry in some places to make it a little nicer). So I was wondering whether, instead of running everything in a single node, I could create one node per source, each with its own BEAM.

Does this make sense? What are the pros and cons of such an approach? Is it even a good idea?

If it is, how should I handle node initialization? Should I just write a script that starts each node with the correct parameters, or is there something nicer that handles that for me?

Thanks!

keathley · September 14, 2019, 11:55pm

You certainly can run multiple instances of the BEAM on a single machine. But my preference would be to run a single instance and use tools like DynamicSupervisor and Registry to start the different subtrees as needed. My primary concern with running multiple BEAMs would be performance, since they'd all be sharing the same CPU. I also think you'd be taking on a fair amount of operational complexity for not a lot of gain. But that might be me projecting, since I'm better at managing Elixir code and supervisors than I am at ops work.
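As a rough illustration of the single-instance approach, here is a minimal sketch of one supervision subtree per source, started on demand under a DynamicSupervisor and addressed through a Registry. All module names (`Sources`, `Sources.Tree`, `Sources.Worker`, `Sources.Registry`, `Sources.DynamicSupervisor`) are hypothetical stand-ins for SupA and its children, not anything from the original system.

```elixir
defmodule Sources.Worker do
  # Hypothetical worker standing in for one process inside SupA.
  use GenServer

  def start_link(source) do
    GenServer.start_link(__MODULE__, source, name: via(source))
  end

  # The Registry key is {module, source}, so no per-source atom names are needed.
  def via(source), do: {:via, Registry, {Sources.Registry, {__MODULE__, source}}}

  def source(source), do: GenServer.call(via(source), :source)

  @impl true
  def init(source), do: {:ok, source}

  @impl true
  def handle_call(:source, _from, source), do: {:reply, source, source}
end

defmodule Sources.Tree do
  # Hypothetical per-source supervisor standing in for SupA.
  use Supervisor

  def start_link(source) do
    Supervisor.start_link(__MODULE__, source, name: via(source))
  end

  def via(source), do: {:via, Registry, {Sources.Registry, {__MODULE__, source}}}

  @impl true
  def init(source) do
    # Every child receives the source and derives its own Registry key from it.
    Supervisor.init([{Sources.Worker, source}], strategy: :one_for_one)
  end
end

defmodule Sources do
  # Starts the shared infrastructure; the Registry comes first because
  # everything started later registers in it.
  def start do
    children = [
      {Registry, keys: :unique, name: Sources.Registry},
      {DynamicSupervisor, name: Sources.DynamicSupervisor, strategy: :one_for_one}
    ]

    Supervisor.start_link(children, strategy: :rest_for_one)
  end

  # Starts one complete subtree for the given source.
  def start_source(source) do
    DynamicSupervisor.start_child(Sources.DynamicSupervisor, {Sources.Tree, source})
  end
end
```

After `Sources.start()`, calling `Sources.start_source("Source1")` and `Sources.start_source("Source2")` gives two independent subtrees, and `Sources.Worker.source("Source1")` reaches the worker that belongs to Source1. The naming logic lives in one `via/1` function per module instead of being spread across call sites.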

tty · September 15, 2019, 9:19pm

Besides the CPU / memory constraints there is no harm in organizing your platform in this manner. We typically split out a separate node for external services which can be a tad flaky (cough, IBM MQ, cough). One other advantage is that you can migrate the busier node to a separate server at a later date.

My preference is to have a master node whose controller process spawns the slave nodes. The same controller process can also use monitor_node to detect and restart errant slaves.
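A minimal sketch of such a controller, under several assumptions: it uses Erlang's `:peer` module (available from OTP 25, where it supersedes the older `:slave` module) together with `Node.monitor/2`, and the controlling node must itself be distributed, for example started with `iex --sname controller`. The `SourceNodes.Controller` module and the way source names map to node names are hypothetical, and loading application code onto each peer is left out.

```elixir
defmodule SourceNodes.Controller do
  # Hypothetical controller: starts one peer node per source and restarts
  # a peer whenever a :nodedown message arrives for it.
  use GenServer

  def start_link(sources) do
    GenServer.start_link(__MODULE__, sources, name: __MODULE__)
  end

  # Returns the current map of node name => source.
  def nodes, do: GenServer.call(__MODULE__, :nodes)

  @impl true
  def init(sources) do
    {:ok, Map.new(sources, &start_peer/1)}
  end

  @impl true
  def handle_call(:nodes, _from, state), do: {:reply, state, state}

  @impl true
  def handle_info({:nodedown, node}, state) do
    case Map.pop(state, node) do
      {nil, state} ->
        # Not one of our peers; ignore it.
        {:noreply, state}

      {source, state} ->
        # Naive immediate restart; a real system would want backoff and limits.
        {new_node, source} = start_peer(source)
        {:noreply, Map.put(state, new_node, source)}
    end
  end

  defp start_peer(source) do
    # Assumption: the downcased source name is a valid node name.
    name = source |> String.downcase() |> String.to_charlist()
    {:ok, _peer, node} = :peer.start(%{name: name})
    # Ask the runtime to send {:nodedown, node} to this process.
    true = Node.monitor(node, true)
    {node, source}
  end
end
```

With the controlling node running in distributed mode, `SourceNodes.Controller.start_link(["Source1", "Source2"])` would start two peers on the same host and `SourceNodes.Controller.nodes()` would list them. Because the peers are started with `:peer.start/1` rather than `:peer.start_link/1`, a crashing peer does not take the controller down with it, and the restart decision stays in `handle_info/2`.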

For your specific problem, could you have one Application per source? It is also unnecessary to give every process a unique name unless you specifically want to send messages to it by name. It is quite common to rely solely on pids.
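To illustrate the pid-only style, here is a minimal sketch in which neither the supervisor nor its worker is registered under a name. The caller keeps the supervisor pid returned by `start_link/1` and looks the worker up by child id. The `Unnamed.Tree` and `Unnamed.Worker` modules are hypothetical stand-ins for SupA and one of its children.

```elixir
defmodule Unnamed.Worker do
  # Hypothetical worker; note there is no :name option anywhere.
  use GenServer

  def start_link(source), do: GenServer.start_link(__MODULE__, source)

  def source(pid), do: GenServer.call(pid, :source)

  @impl true
  def init(source), do: {:ok, source}

  @impl true
  def handle_call(:source, _from, source), do: {:reply, source, source}
end

defmodule Unnamed.Tree do
  # Hypothetical per-source supervisor, also unnamed.
  use Supervisor

  def start_link(source), do: Supervisor.start_link(__MODULE__, source)

  # Finds the worker's current pid by its child id (the module name by default).
  # Supervisor.which_children/1 returns {id, child, type, modules} tuples.
  def worker(sup) do
    {_id, pid, _type, _modules} =
      sup |> Supervisor.which_children() |> List.keyfind(Unnamed.Worker, 0)

    pid
  end

  @impl true
  def init(source) do
    Supervisor.init([{Unnamed.Worker, source}], strategy: :one_for_one)
  end
end
```

For example, after `{:ok, sup1} = Unnamed.Tree.start_link("Source1")` and `{:ok, sup2} = Unnamed.Tree.start_link("Source2")`, the expression `sup1 |> Unnamed.Tree.worker() |> Unnamed.Worker.source()` reaches the Source1 worker without any name being involved. A worker's pid changes when it is restarted, so in this sketch the pid is looked up on each call rather than cached, and the child entry can briefly be `:restarting` instead of a pid.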