Hi,
I am currently testing a Phoenix application running on a system with multiple connected nodes. There are some observations I'd like to confirm and some questions I'd like to ask that I could not find documented anywhere.
Consider the case where requests are load balanced between multiple Phoenix applications on different (BEAM) nodes via an external load balancer (e.g. Docker Swarm ingress load balancing or HAProxy):
1. When using the websocket transport, channel communication is performed over one persistent HTTP connection, which the load balancer routes to one of the Phoenix instances. The life span of a "channel session" (which begins with `join` and ends with `terminate`) is identical to the life span of that connection. Correct? Does Phoenix persist any channel-related state that spans multiple such connections?
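To make sure we are talking about the same thing, here is a minimal sketch of the channel life cycle I mean (module name and topic are made up for illustration):

```elixir
defmodule MyAppWeb.RoomChannel do
  use Phoenix.Channel

  # Called once when the client joins the topic over the underlying
  # transport connection (websocket or long polling).
  def join("room:" <> _room_id, _params, socket) do
    {:ok, socket}
  end

  # Called when the channel process shuts down, e.g. because the
  # transport connection closed. My question is whether any state
  # survives beyond this point.
  def terminate(_reason, _socket) do
    :ok
  end
end
```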
2. When using long polling, the channel communication is divided into many requests. If the load balancer routes these to different Phoenix instances, I see errors in my JS client. Once the Phoenix instances run on connected nodes, I have so far been unable to reproduce these problems. Does Phoenix delegate channel traffic to the correct process across node boundaries? So far it looks like it does, but I could not find this documented anywhere.
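My working assumption, which I'd like confirmed, is that this works because Phoenix.PubSub distributes messages over Erlang distribution to all connected nodes, so a broadcast on one node reaches channel processes on another. Sketched (the PubSub server name is an assumption from the usual generated project setup):

```elixir
# Broadcasting on a topic reaches subscribers on every connected node,
# because Phoenix.PubSub relays messages across the cluster.
Phoenix.PubSub.broadcast(MyApp.PubSub, "room:lobby", {:new_msg, %{body: "hi"}})

# Equivalent from application code using the endpoint:
# MyAppWeb.Endpoint.broadcast("room:lobby", "new_msg", %{body: "hi"})
```

Is the long-poll transport locating its session process through this same mechanism?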
3. Assuming 2. is the case, what happens when one of the nodes disconnects because it is stopped (e.g. during a rolling-update deploy)? Is there anything my application needs to do to ensure connections terminate gracefully (particularly in the long-polling case) so that the client can establish a new connection?
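For context, the only approach I have found myself is forcing clients to disconnect before stopping a node, using the socket `id/1` mechanism from the Phoenix.Socket docs, so the JS client reconnects and gets routed to a surviving instance. A sketch, assuming a single shared id topic purely for illustration (normally this would be per user):

```elixir
defmodule MyAppWeb.UserSocket do
  use Phoenix.Socket

  channel "room:*", MyAppWeb.RoomChannel

  def connect(_params, socket, _connect_info), do: {:ok, socket}

  # Identifies sockets so they can be disconnected via a broadcast.
  def id(_socket), do: "user_socket:all"
end

# Before stopping the node, kick all clients so they reconnect:
# MyAppWeb.Endpoint.broadcast("user_socket:all", "disconnect", %{})
```

Is something like this necessary, or does Phoenix already handle this on node shutdown?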
Any insights on these topics or links to resources covering these questions would be greatly appreciated.