I would like to discuss how Phoenix can be improved regarding websocket connection validation with Phoenix Channels.
Connecting to a websocket generally involves some kind of authentication/authorization and it needs to be possible to close the websocket connection if this fails.
Additionally, in order to have different client behaviour based on the reason the connection was closed, we need to be able to pass at least a reason code, if not a full message, on or before closing the connection.
E.g. if the reason is that the auth token is invalid, then the client should not attempt to reconnect until it has acquired a new token.
This should be flexible so that application developers can implement their custom requirements.
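However, it is currently not possible to both close the connection and provide a reason that is accessible to the client (at least with the browser WebSocket API). When the connect is rejected, the HTTP upgrade fails, and the browser reports this only as a generic abnormal closure (close code 1006), without exposing the HTTP status or any message.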
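To make the desired client behaviour concrete, here is a minimal sketch of how a client could branch on a close code if the server were able to send one. The two code values and the `decideReconnect` helper are hypothetical, chosen only for illustration; the only fixed points are that the 4000-4999 range is left for application use and that 1006 is what the client sees when no close frame arrives.

```typescript
// Hypothetical application-defined close codes (assumption, not a Phoenix API).
// The 4000-4999 range is reserved for private use by RFC 6455.
const CLOSE_INVALID_TOKEN = 4001;
const CLOSE_FORBIDDEN = 4003;

type ReconnectAction =
  | { kind: "retry"; delayMs: number }
  | { kind: "refresh-token-then-retry" }
  | { kind: "give-up" };

// Decide what to do after the socket closes, based only on the close code
// and the number of reconnect attempts made so far.
function decideReconnect(code: number, attempt: number): ReconnectAction {
  switch (code) {
    case CLOSE_INVALID_TOKEN:
      // Reconnecting with the same token is pointless; get a new one first.
      return { kind: "refresh-token-then-retry" };
    case CLOSE_FORBIDDEN:
      // The server will never accept this client; stop reconnecting.
      return { kind: "give-up" };
    default: {
      // Everything else, including 1006 (abnormal closure), is treated as a
      // transient failure: retry with exponential backoff capped at 30 s.
      const delayMs = Math.min(30000, 1000 * 2 ** attempt);
      return { kind: "retry", delayMs };
    }
  }
}
```

In a browser this would be called from the socket's `onclose` handler with `event.code`. Today every rejected connect ends up in the `default` branch, which is exactly the endless-reconnect behaviour described in this post.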
There are existing discussions regarding this that have been pretty dismissive of this issue:
- Allow different error codes during websocket connection (phoenixframework/phoenix, issue #2847)
- Returning custom WS close code with Phoenix Socket/Channels (phoenixframework/phoenix, issue #4822)
The proposed workaround (and the one we currently use) is to require the client to join a special auth channel after connecting, before joining any other channels. As you can imagine, this results in multiple issues:
- All channels need to perform some kind of validation on join. This isn’t that bad since it can just delegate to a module which checks the
- You can only do auth after the client has decided to join a channel. There is no escape hatch to send a message to the client before they have joined any channels. This makes the server vulnerable to DoS attacks: a client can just open websocket connections without joining any channels, and the server will keep these idle connections open because auth isn’t done on websocket connect.
This leaves us stuck between two bad options:
- Either close the websocket connection without a reason, which means the client will always keep attempting to reconnect, even if its token etc. is invalid. This leaves us with misbehaving clients, and we end up kinda DoS’d by our own clients.
- Or make it very easy for a bad actor to perform a DoS attack on the server.
I understand that Phoenix channels are transport agnostic and that the correct behaviour cannot be implemented in the channels themselves, but we need to figure out a way to handle this correctly in specific transports.
I think the most flexible solution would be to provide some way to send a message from the server to the client without requiring the client to join any channel. If this is not realistically possible given the channel protocol, then we should at least provide a way to send a close frame on socket connect. Note, however, that close frames are quite limited: they carry only a 2-byte status code and an optional UTF-8 reason of at most 123 bytes. This could be done as follows: instead of rejecting the HTTP upgrade on a connect error, we successfully upgrade the connection and then immediately send a close frame. Having looked at the Phoenix source, this seems like a minor change, except for the fact that Cowboy doesn’t seem to support sending custom close frames currently, so we would have to add that support to Cowboy first.
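To show how little room a close frame gives you, here is a rough sketch of building its payload as RFC 6455 lays it out: a 2-byte big-endian status code followed by a UTF-8 reason. Control frames are capped at 125 bytes of payload, which leaves 123 bytes for the reason. The `encodeClosePayload` helper is hypothetical and not part of Phoenix or Cowboy, and the range check is simplified.

```typescript
// Control frames carry at most 125 bytes of payload, and the first 2 bytes
// of a close frame payload are the status code (RFC 6455, section 5.5).
const MAX_REASON_BYTES = 123;

// Build the payload of a close frame: 2-byte big-endian code + UTF-8 reason.
// Simplified assumption: any code in 1000-4999 is accepted here, although a
// real implementation must also reject codes that may not be sent on the
// wire (such as 1005 and 1006).
function encodeClosePayload(code: number, reason: string): Uint8Array {
  if (!Number.isInteger(code) || code < 1000 || code > 4999) {
    throw new RangeError(`close code out of range: ${code}`);
  }
  const reasonBytes = new TextEncoder().encode(reason);
  if (reasonBytes.length > MAX_REASON_BYTES) {
    throw new RangeError(
      `reason is ${reasonBytes.length} bytes, max is ${MAX_REASON_BYTES}`
    );
  }
  const payload = new Uint8Array(2 + reasonBytes.length);
  // Status code goes first, in network byte order (big-endian).
  new DataView(payload.buffer).setUint16(0, code, false);
  payload.set(reasonBytes, 2);
  return payload;
}
```

A short machine-readable reason such as `"invalid_token"` fits easily, but anything like a full JSON error body quickly runs into the limit. That is why a real server-to-client message before any channel join would be the more flexible option.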
I hope that we can have a fruitful discussion here and find a good solution to a problem that I believe a lot of developers have with websocket/phoenix channels currently.