How do you optimize, manage, and track websockets and users per server? Anything we must do? Numerous questions

mikejm · October 12, 2024, 6:17am

I am trying to understand the concept of managing many websocket connections (and thus connected users). I have numerous questions about this process, so I will split this up with what I understand and what I am wondering.

(1) Websockets & Elixir in General
In this thread, we have the explanation for how to get so many connections on one system:

Each network connection is described by tuple of 4 values:

Source Address
Source Port
Target Address
Target Port

So only if these 4 values match, we cannot open new connection. This should make it clear how we can have 2M (or more) simultaneous connections to single service listening on single port.

But I do not think this is adequate to explain what is going on. Let’s say in a conventional scenario you set up a Cowboy/Bandit websocket system like shown here at ws://localhost:4000/websocket

If we have set “target port” to always be 4000 then we have eliminated that as a point of variance. By “target address” I presume we mean the address of the websocket which is also not changing, so that is out too.

Thus our differentiator for users becomes “source address” and “source port”. But how is this further differentiated or stored in the web socket or server? How is that adequate as well?

I have read this link, and I understand there is also a random sec-websocket-key included in the handshake. So I presume this is actually the primary factor that lets us make many websocket connections.

But is there any protection against random collisions? What about after the handshake? It is stated there this key is discarded after handshake.

So should sec-websocket-key be considered the “5th factor” (primary reason we are not limited) for how many websockets we can set up?

Either way, how does the system then continuously differentiate websocket sessions, for example ones coming from the same computer?

I can open multiple web browser tabs and connect to the same localhost server and have separate connections where the messages aren’t scrambled. After the sec-websocket-key is exchanged, it is not sent on every data exchange, and I have set no token info, so in any simple terms, how does it still continuously distinguish data from one user vs. another even from the same computer?

(2) Tracking User Connections

Related to this, does the basic Cowboy/Bandit system also have some background manner of distinctly identifying each user that accesses ws://localhost:4000/websocket? If so how?

Do we have access to those differentiating details? ie. Is there some tracking of all the connections or key-value database of all connected clients? In this example they create a registry for each SocketHandler to presumably keep track of each connected user.

I believe we are inevitably creating one object (with internal state) for every distinct websocket connection made as well. In this example, that would be EchoServer or in this case SocketHandler.

These objects are derived from cowboy_websocket in the Medium example and in the Plug examples a “WebSock compliant module” as per this ref. I presume these are just two ways of accomplishing the same thing and they are both “WebSock compliant modules”.

The WebSock behavior then has state within it, like a GenServer. And I believe that, regardless of the voodoo asked about in point (1), we can presume each of these represents just one user connection and that will be the case from start to finish.

Thus I initially thought, reading about Elixir, that I would need a “GenServer” per user connection, but with WebSockets it appears I already have this in the “WebSock compliant module” and can just use that to manage each user session, i.e. handle user connection state and requests, store their authentication token, etc. Given this, we don’t need to keep validating every request through the websocket or continuously checking the token, because we presume it is always them (unless perhaps we expect to be timing them or their token out). Is this roughly correct?

(3) Manually Tracking Users
The main point of making a Registry as they did in the Medium article seems to be to quickly check which users are connected and which aren’t, for example if there is an update to share with a potentially connected user. That is what they use it for there. (Ie. Non-Phoenix system)
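
A minimal sketch of that "is this user connected?" lookup, using only the standard library. The names `ConnRegistry`, `"user-1"` and `"user-2"` are made up for illustration, and a plain spawned process stands in for the per-connection WebSocket process. In a WebSock handler the `Registry.register/3` call would typically sit in `init/1`, assuming that callback runs in the connection process.

```elixir
# Hypothetical registry name; one entry per connected user id.
{:ok, _} = Registry.start_link(keys: :unique, name: ConnRegistry)

parent = self()

# Stand-in for a per-connection WebSocket process: it registers itself
# under the user id and then waits for a message to forward.
spawn(fn ->
  {:ok, _} = Registry.register(ConnRegistry, "user-1", nil)
  send(parent, :registered)

  receive do
    {:notify, text} -> send(parent, {:delivered, text})
  end
end)

receive do
  :registered -> :ok
end

# Elsewhere in the app: deliver an update only if the user is connected.
notify = fn user_id, text ->
  case Registry.lookup(ConnRegistry, user_id) do
    [{pid, _value}] ->
      send(pid, {:notify, text})
      :sent

    [] ->
      :offline
  end
end

result_online = notify.("user-1", "hello")
result_offline = notify.("user-2", "hello")
```

Here `result_online` is `:sent` and `result_offline` is `:offline`, because only `"user-1"` has a registered process.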

But a Registry is local to each node, so I presume like described here for Phoenix a Pg2 system would actually be even better for this purpose if you want expandability. Or Syn, as per also this post.

(4) Garbage Collection

What about garbage collection in such a system? It looks from what I can tell that the “WebSock module” terminates itself after timeout or loss of connection. Does this essentially “null” or “Dispose()” it in some way to mark it for garbage collection? If it is in a Registry do we manually remove it in one of those “WebSock module” functions on termination?

That’s all I can think of for now. If I can clarify that I think I can get the bulk of my server figured out. At least the basic stuff.

Any random thoughts or advice are welcome and appreciated.

ruslandoga · October 12, 2024, 9:34am

@mikejm

Re 1: WebSockets are just TCP connections so it’s up to the OS to differentiate between them. To Erlang they arrive as either :socket sockets or :gen_tcp ports.

How is that adequate as well?

Not sure what you mean, but the post you quote probably explains interface aliases. Ask ChatGPT how you can make more than 65k connections from a single host using interface aliases and be enlightened

sec-websocket-key is completely unrelated to identifying WebSocket connections. Instead, it’s just part of the handshake that sets up a WebSocket connection, see RFC 6455 - The WebSocket Protocol.

Re: 2. You can use Phoenix.Tracker to track users in your system. If you don’t need extra functionality Phoenix.Tracker provides, and all you need is to register WebSockets to some shared topic which then can be used to send all those WebSockets a message, then Phoenix.PubSub would probably work fine as well.

WebSock is a behaviour to be implemented, it doesn’t have any state of its own. Just look at its code: websock/lib/websock.ex at main · phoenixframework/websock · GitHub.
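
To make the "no state of its own" point concrete, here is a small sketch of a handler written against the WebSock callback shapes (`init/1`, `handle_in/2`, `handle_info/2`, `terminate/2`). `CounterSocket` and its state fields are hypothetical. The module is only a set of functions; the state is whatever the previous callback returned, and it is held by the server's connection process (one per WebSocket), which passes it back into the next callback. The `@behaviour WebSock` line is left out so the sketch compiles without the `:websock` dependency.

```elixir
defmodule CounterSocket do
  # Hypothetical handler. With the :websock dependency installed you would
  # also declare `@behaviour WebSock` here.

  # Called once per connection; the returned term is that connection's state.
  def init(opts), do: {:ok, %{user: Keyword.get(opts, :user), count: 0}}

  # Called for each incoming text frame with the state from the last callback.
  def handle_in({text, [opcode: :text]}, state) do
    state = %{state | count: state.count + 1}
    {:push, {:text, "#{state.count}: #{text}"}, state}
  end

  # Messages sent to the connection process by other processes arrive here.
  def handle_info(_msg, state), do: {:ok, state}

  def terminate(_reason, _state), do: :ok
end

# Driving the callbacks by hand, the way a connection process would:
{:ok, s0} = CounterSocket.init(user: "alice")
{:push, {:text, reply1}, s1} = CounterSocket.handle_in({"hi", [opcode: :text]}, s0)
{:push, {:text, reply2}, s2} = CounterSocket.handle_in({"again", [opcode: :text]}, s1)
```

After the two calls `reply1` is `"1: hi"`, `reply2` is `"2: again"`, and `s2.count` is `2`. A second connection would start again from its own `init/1` result, so nothing is shared between connections unless you share it explicitly.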

Given this, we don’t need to keep validating every request through the websocket or continuously checking the token, because we presume it is always them (unless perhaps we expect to be timing them or their token out). Is this roughly correct?

Yes, once you have authenticated a user’s WebSocket connection, it can be safely assumed that it’s the same user for the whole lifetime of the connection. Reverse-proxies usually map WebSockets 1:1. Note, however, that you would still need to authorize each of the user’s actions. You can read more about this in LiveView docs: Security considerations — Phoenix LiveView v0.20.17
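
A rough sketch of that split between authenticating once and authorizing every action. Everything here is hypothetical: the `RoomSocket` module, the `"post:<room>:<body>"` message format, and a state map that already contains the authenticated `user_id` and the set of rooms that user may post to.

```elixir
defmodule RoomSocket do
  # The identity in `state` was established once, when the connection was
  # set up. Each incoming action is still checked against what that user
  # is allowed to do.
  def handle_in({"post:" <> rest, [opcode: :text]}, state) do
    with [room, body] <- String.split(rest, ":", parts: 2),
         true <- MapSet.member?(state.rooms, room) do
      {:push, {:text, "posted to #{room}: #{body}"}, state}
    else
      _ -> {:push, {:text, "error: not allowed"}, state}
    end
  end
end

state = %{user_id: 42, rooms: MapSet.new(["lobby"])}

{:push, {:text, ok_reply}, _} =
  RoomSocket.handle_in({"post:lobby:hello", [opcode: :text]}, state)

{:push, {:text, denied_reply}, _} =
  RoomSocket.handle_in({"post:admin:hello", [opcode: :text]}, state)
```

The first call is allowed and the second is rejected, even though both arrive over the same authenticated connection.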

Re: 3.

Pg2 system would actually be even better for this purpose if you want expandability

Depends on what you mean by expandability, but performance-wise, it might end up being worse. Phoenix.PubSub targets a very narrow use case, message broadcasts, while pg is a general purpose process registry, and it shows. In my (old) benchmarks, out of the box, using pg as a pubsub and broadcasting a message with it was far worse than using Phoenix.PubSub: GitHub - ruslandoga/pg_pubsub_bench.
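
For reference, a minimal single-node sketch of using `:pg` (the successor to `pg2`, which was removed in OTP 24) to find every connection process for a user and message it. The group name `{:user, 42}` and the message shapes are made up, and plain spawned processes stand in for WebSocket processes. On a cluster of connected nodes that all run the same scope, `:pg.get_members/1` would also return the pids joined on the other nodes.

```elixir
# Starts the default :pg scope; in an application this belongs in a supervision tree.
{:ok, _} = :pg.start_link()

parent = self()

# Two stand-in connection processes for the same user (e.g. two browser tabs).
pids =
  for _ <- 1..2 do
    spawn(fn ->
      :ok = :pg.join({:user, 42}, self())
      send(parent, :joined)

      receive do
        {:update, payload} -> send(parent, {:got, self(), payload})
      end
    end)
  end

# Wait until both processes have joined the group.
for _ <- pids do
  receive do
    :joined -> :ok
  end
end

# Look up all processes in the group and send each one the update.
members = :pg.get_members({:user, 42})
for pid <- members, do: send(pid, {:update, "new data"})
```

This is the general-purpose route; as noted above, for plain broadcasts Phoenix.PubSub may well be the faster choice.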

Re: 4. Process registries monitor the registered processes, the processes get removed automatically when they exit. WebSocket processes, at least in Cowboy, are started with sockets in “active” mode, so they exit when they receive a message indicating that the underlying TCP socket was closed. Other implementations might do it differently.
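
The automatic cleanup can be seen with the standard library `Registry`. In this sketch `CleanupDemo`, `"user-1"` and the `:tcp_closed` message are hypothetical stand-ins: the spawned process plays the role of a connection process that exits when its socket closes. No manual unregister call is made.

```elixir
{:ok, _} = Registry.start_link(keys: :unique, name: CleanupDemo)
parent = self()

pid =
  spawn(fn ->
    {:ok, _} = Registry.register(CleanupDemo, "user-1", nil)
    send(parent, :registered)

    # Stand-in for a connection process waiting on its socket.
    receive do
      :tcp_closed -> :ok
    end
  end)

receive do
  :registered -> :ok
end

before_exit = Registry.lookup(CleanupDemo, "user-1")

# Simulate the socket closing: the process returns from its receive and exits.
ref = Process.monitor(pid)
send(pid, :tcp_closed)

receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
end

# Removal is automatic but may lag the exit slightly, so poll briefly.
wait_until_gone = fn wait_until_gone, tries ->
  case Registry.lookup(CleanupDemo, "user-1") do
    [] ->
      :gone

    _ when tries > 0 ->
      Process.sleep(10)
      wait_until_gone.(wait_until_gone, tries - 1)

    _ ->
      :still_registered
  end
end

after_exit = wait_until_gone.(wait_until_gone, 50)
```

`before_exit` holds the one `{pid, nil}` entry and `after_exit` is `:gone`. The process's memory is reclaimed by the VM when it exits, so there is nothing to null out or dispose by hand.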

That’s all I can think of for now. If I can clarify that I think I can get the bulk of my server figured out. At least the basic stuff.

ChatGPT is really good at basic stuff like this. Just ask it how Cowboy / WebSockets / TCP / Phoenix PubSub / Phoenix Tracker / Phoenix Channels work in detail, and it would clarify it all.

benwilson512 · October 12, 2024, 12:02pm

If you haven’t seen https://www.youtube.com/watch?v=JvBT4XBdoUE I highly recommend it. While this talk uses background jobs instead of sockets, all of the process mechanics are the same.

hauleth · October 12, 2024, 1:41pm

It is stored and managed by the OS. So it is not something that the WebSocket transport cares about, nor is it any concern of Erlang. Nothing else really matters there.

Yes, because the client will open a separate port for each connection to the remote node, so in the tuple {client_addr, client_port, server_addr, server_port} there is still a difference in the client_port field.
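
This is easy to observe with plain `:gen_tcp`, no WebSocket layer involved. The sketch below listens on an OS-assigned port, opens two client connections from the same machine, and reads the peer address of each accepted socket. The variable names are illustrative only.

```elixir
# Listen on port 0 so the OS picks a free port for the server.
{:ok, listener} = :gen_tcp.listen(0, [:binary, active: false, reuseaddr: true])
{:ok, server_port} = :inet.port(listener)

# Two clients from the same host to the same server address and port.
{:ok, client_a} = :gen_tcp.connect({127, 0, 0, 1}, server_port, [:binary, active: false])
{:ok, client_b} = :gen_tcp.connect({127, 0, 0, 1}, server_port, [:binary, active: false])

{:ok, conn_1} = :gen_tcp.accept(listener, 1000)
{:ok, conn_2} = :gen_tcp.accept(listener, 1000)

# Seen from the server, each accepted socket has its own source (client) port.
{:ok, {_addr_1, source_port_1}} = :inet.peername(conn_1)
{:ok, {_addr_2, source_port_2}} = :inet.peername(conn_2)

# The same ports, seen from the client side.
{:ok, {_, local_port_a}} = :inet.sockname(client_a)
{:ok, {_, local_port_b}} = :inet.sockname(client_b)

IO.inspect({server_port, source_port_1, source_port_2}, label: "server port, client ports")
```

Both connections share the server address and server port, while the two client ports differ. That difference is all the OS needs to route each packet to the right socket, and therefore to the right Erlang process.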

mikejm · October 12, 2024, 10:52pm

Ah, thank you, that makes sense. That was the main point I was not getting, and why I was getting so confused and worked up over it.

mikejm · October 13, 2024, 1:28am

Thanks for your detailed response. This is something I had been unsure of. So let’s say you have the simple websocket configuration posted in the Plug documentation here.

I was assuming maybe EchoServer there was a GenServer (or a variant of GenServer) with its own state because state is being passed into the handle_in. What then is the meaning of state here? Just for extra arguments or what?

Either way, if WebSock or this EchoServer does not have true state (like GenServer), then I must create a GenServer per user on login. And I presume I must enter a reference to the WebSocket into their GenServer state to access it from there somehow.

I made a new post to continue on that subject then here:

Any further help? I am learning. I appreciate it for what it’s worth. Thanks again.