Given the number of services mentioned (3000), I’d likely take my chances with a single node and code reloading.
Nope, it’s something else. I feel that distributed Erlang should be used only to run the same code, i.e. multiple instances of the same “thing” connected into a cluster. Adding different types of systems (with different OTP apps and a different process structure) to the same cluster might cause various problems with the distributed parts of the code (e.g. pg2, Phoenix PubSub or Phoenix Tracker).