We’re attempting to make our deployment process a little less… disruptive, so we’d like to find a way of manually draining all open websockets.
Background
We’re running on AWS autoscaling groups with an Application Load Balancer. When we deploy, we register a new instance with the load balancer target group, then deregister the old one. Unfortunately, target group deregistration is ignorant of websockets, so it waits the 300s for all open connections to complete (which of course the websockets do not do), and then forcefully closes all of the connections. This causes a bit of a stampede (yes, yes, this is partly a client issue, let’s just assume the clients are bad actors) and we would much prefer to do this more gently.
Question
Is there any “official” way of traversing the open websockets and closing them down? Aside from spelunking through the supervision tree, that is.
Presence.track can store key/values, so you could store the hostname and user ID? Then on a quit signal you do a lookup on the hostname and you have a list of connected sockets for that node. You could also use Phoenix.PubSub (phoenix_pubsub v2.0.0) to keep the broadcast local to the node.
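A rough sketch of that idea, not something tested in production. The module names (MyAppWeb.Presence, MyAppWeb.Endpoint, MyAppWeb.RoomChannel, MyAppWeb.Drain), the "room:lobby" topic and the user_id assign are all hypothetical, and it assumes your UserSocket.id/1 returns "user_socket:#{user_id}" so that a "disconnect" broadcast on that topic closes the socket:

```elixir
defmodule MyAppWeb.RoomChannel do
  use Phoenix.Channel

  def join("room:lobby", _params, socket) do
    # Track after join, since tracking needs the channel process to be joined.
    send(self(), :after_join)
    {:ok, socket}
  end

  def handle_info(:after_join, socket) do
    # Store the node name in the presence metadata so it can be filtered on later.
    {:ok, _ref} =
      MyAppWeb.Presence.track(socket, socket.assigns.user_id, %{
        node: Atom.to_string(node())
      })

    {:noreply, socket}
  end
end

defmodule MyAppWeb.Drain do
  # Hypothetical helper: call this on the node that is about to shut down.
  def disconnect_local(topic \\ "room:lobby") do
    local = Atom.to_string(node())

    for {user_id, %{metas: metas}} <- MyAppWeb.Presence.list(topic),
        Enum.any?(metas, &(&1[:node] == local)) do
      # local_broadcast only reaches subscribers on this node.
      # Assumes UserSocket.id/1 returns "user_socket:#{user_id}".
      MyAppWeb.Endpoint.local_broadcast("user_socket:#{user_id}", "disconnect", %{})
    end

    :ok
  end
end
```

As written this disconnects everyone on the node at once, so to avoid the stampede you would still want to chunk the list and pause between chunks.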
Slightly off topic from connection draining: why are you using an ALB when they are not guaranteed to maintain persistent connections? NLB is recommended for persistent connections. Do you see any connection issues with the ALB, or are you just not at the scale where the ALB would scale out/in in the background and drop connections on you?
For folks coming across this recently: Phoenix sockets now have built-in connection draining (since 1.7.2). You can configure it with the :drainer option to socket/3; see the Phoenix.Endpoint docs (Phoenix v1.7.14).
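For reference, a minimal sketch of what that looks like in an endpoint. The app and module names (:my_app, MyAppWeb.Endpoint, MyAppWeb.UserSocket) are placeholders and the numbers are only illustrative, so tune them to your own connection counts:

```elixir
defmodule MyAppWeb.Endpoint do
  use Phoenix.Endpoint, otp_app: :my_app

  socket "/socket", MyAppWeb.UserSocket,
    websocket: true,
    longpoll: false,
    drainer: [
      # How many clients to notify in each batch.
      batch_size: 1_000,
      # Milliseconds given for each batch to terminate.
      batch_interval: 2_000,
      # Maximum milliseconds allowed to drain all batches.
      shutdown: 30_000
    ]
end
```

Draining runs on application shutdown, so the old instance has to be stopped gracefully, and the :shutdown value should fit inside whatever window your deploy gives the node before it is killed.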