Using Phoenix socket / LiveView in a horizontally scaled application

Hi,

I am developing a chat-bot using Phoenix and LiveView. My question: as the application gets more users, I will have to scale it up by adding more machines (using Kubernetes, AWS Elastic Beanstalk, AWS ECS, etc.). Would Phoenix handle this out of the box?

Basically, there will be a load balancer routing incoming requests to different machines in a cluster (all running the same Phoenix application). Would LiveView and sockets behave normally? Or does LiveView assume it is running on a single server machine?

How do I handle that while scaling up horizontally?

Thank you.

There is no expectation that you run a single machine. But there is an expectation that you run Phoenix.PubSub connected across all nodes (through one of its available adapters) if you want to make use of cluster-wide PubSub. LV doesn’t technically need that, but given you’re talking about chat, you might be using PubSub for message delivery between LV processes.
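To make that concrete, here is a minimal sketch of the usual setup: a Phoenix.PubSub server started in the supervision tree on every node, and a LiveView that subscribes to a room topic and broadcasts new messages. MyApp.PubSub, MyAppWeb.ChatLive and the "chat:<room_id>" topic are made-up names for the example.

```elixir
# lib/my_app/application.ex — one Phoenix.PubSub instance, started on every node;
# with the default adapter it spans the whole cluster once the nodes are connected.
children = [
  {Phoenix.PubSub, name: MyApp.PubSub},
  MyAppWeb.Endpoint
]

# A LiveView sketch: subscribe on the connected mount, broadcast on new messages.
defmodule MyAppWeb.ChatLive do
  use Phoenix.LiveView

  def mount(%{"room_id" => room_id}, _session, socket) do
    if connected?(socket), do: Phoenix.PubSub.subscribe(MyApp.PubSub, "chat:" <> room_id)
    {:ok, assign(socket, room_id: room_id, messages: [])}
  end

  def handle_event("send", %{"body" => body}, socket) do
    # Reaches the LV processes of everyone in the room, on every node in the cluster.
    Phoenix.PubSub.broadcast(MyApp.PubSub, "chat:" <> socket.assigns.room_id, {:new_message, body})
    {:noreply, socket}
  end

  def handle_info({:new_message, body}, socket) do
    {:noreply, update(socket, :messages, &[body | &1])}
  end

  def render(assigns) do
    ~H"""
    <ul><%= for msg <- @messages do %><li><%= msg %></li><% end %></ul>
    """
  end
end
```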

Thanks. If the chat data (which is always sent to ChatGPT), i.e. the chain, is stored in the socket assigns, then I think Phoenix.PubSub is not needed?

That is correct. If each LiveView is isolated, meaning it doesn’t communicate with any other process and doesn’t save any data on the local machine, then you can scale horizontally indefinitely. Every client will get an LV process started on an available node in the cluster, which will be “sticky” in the sense that it has an open websocket. On every refresh it will start a new LV process, perhaps on a different node. You don’t even need BEAM distribution if all you do is save data, as needed, to a central repository such as a shared database. The only thing you might have issues with is making sure your load balancer supports websockets, and long-lived connections for long polling, for those clients that fall back.
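Something along these lines, as a rough sketch of that isolated shape — the module name, the ChatGPT client, and the Repo/schema are all hypothetical placeholders standing in for your own code and a shared database:

```elixir
defmodule MyAppWeb.BotChatLive do
  use Phoenix.LiveView

  # All state lives in socket.assigns of this one process; nothing is written to
  # the local machine, so any node behind the load balancer can serve any client.
  def mount(_params, _session, socket) do
    {:ok, assign(socket, chain: [])}
  end

  def handle_event("send", %{"prompt" => prompt}, socket) do
    chain = socket.assigns.chain ++ [%{role: "user", content: prompt}]

    # Hypothetical helpers: call the external API and persist to a shared
    # database (e.g. Postgres via Ecto) rather than to local disk or ETS.
    reply = MyApp.ChatGPT.complete(chain)
    MyApp.Repo.insert!(%MyApp.Message{role: "assistant", content: reply})

    {:noreply, assign(socket, chain: chain ++ [%{role: "assistant", content: reply}])}
  end

  def render(assigns) do
    ~H"""
    <%= for msg <- @chain do %><p><%= msg.role %>: <%= msg.content %></p><% end %>
    """
  end
end
```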


If clients fall back to long polling, you either need sticky sessions, or PubSub needs to work across nodes so LV can move state between nodes when the client is eventually routed to a different node on a new long-poll request.
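On the Phoenix side, both transports are enabled on the socket in the endpoint. The snippet below follows what recent Phoenix/LiveView versions generate, so treat it as a sketch and adjust for your versions; the JS client also needs its long-poll fallback enabled.

```elixir
# lib/my_app_web/endpoint.ex — sketch assuming a recent Phoenix/LiveView and the
# generated @session_options attribute; clients behind proxies that block
# WebSockets fall back to the long-poll transport.
socket "/live", Phoenix.LiveView.Socket,
  websocket: [connect_info: [session: @session_options]],
  longpoll: [connect_info: [session: @session_options]]
```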


Does the same apply and hold true when using Channels, where each channel is merely server <==> specific-user communication?

Yup, LiveView is built on Phoenix Channels, so it behaves the same way with either.
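For example, a minimal per-user channel (hypothetical names) has the same properties: the channel process lives on whichever node the socket connected to and only talks to its own client.

```elixir
defmodule MyAppWeb.ChatChannel do
  use Phoenix.Channel

  def join("chat:" <> user_id, _params, socket) do
    # Per-user topic: this process only serves its own client, so it doesn't
    # matter which node in the cluster it was started on.
    {:ok, assign(socket, :user_id, user_id)}
  end

  def handle_in("send", %{"body" => body}, socket) do
    push(socket, "reply", %{body: "echo: " <> body})
    {:noreply, socket}
  end
end
```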


If the websocket or long poll request is closed, a new request will be made to the load balancer, which might choose a new web server, but then a whole new Channel/LiveView process is created from scratch, so sticky sessions or state migration shouldn’t be needed. (Unless you’re doing something really really weird, like saving state locally on the machine.)

In the case of LiveView, mount/3 might actually be called once in normal HTTP (disconnected) mode; then, when phoenix.js takes over and creates a new websocket/long-poll connection, a whole new machine could be selected, and mount/3 will run a second time on a different machine. As long as you aren’t storing anything locally between those two requests, it’s fine that they happen on completely different machines. Each request should be atomic.
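A small sketch of what that means in practice — MyApp.Chat.list_messages/1 is a hypothetical context function backed by a shared database:

```elixir
def mount(%{"room_id" => room_id}, _session, socket) do
  # First call: disconnected HTTP render. Second call: connected socket,
  # potentially on a different node — so nothing may rely on node-local state.
  messages =
    if connected?(socket) do
      MyApp.Chat.list_messages(room_id)
    else
      []
    end

  {:ok, assign(socket, room_id: room_id, messages: messages)}
end
```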


That’s not the case for long polling, though. With long polling you get a separate HTTP request for each message being sent on a channel (which applies to LV as well, given it uses channels). So it’s not one request/connection per lifecycle of a channel process, but many. Hence you either need clustering, to be able to reach the process from anywhere in the cluster, or sticky sessions, to avoid being routed to a different node.
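Clustering itself is outside what Phoenix sets up for you. One common option (an assumption on my part, not the only way) is libcluster, e.g. with its Kubernetes DNS strategy; the service and app names below are hypothetical:

```elixir
# mix.exs: {:libcluster, "~> 3.3"}

# config/runtime.exs — hypothetical headless service and application names
config :libcluster,
  topologies: [
    k8s: [
      strategy: Cluster.Strategy.Kubernetes.DNS,
      config: [service: "my-app-headless", application_name: "my_app"]
    ]
  ]

# lib/my_app/application.ex — start the cluster supervisor alongside PubSub,
# so all nodes connect and PubSub/channel traffic can cross node boundaries.
children = [
  {Cluster.Supervisor,
   [Application.get_env(:libcluster, :topologies, []), [name: MyApp.ClusterSupervisor]]},
  {Phoenix.PubSub, name: MyApp.PubSub},
  MyAppWeb.Endpoint
]
```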
