Phoenix Channels - Incorrect order of messages sent to clients

Hello,

We have been using Elixir/Phoenix for our production app. We have noticed one issue with the way the messages are sent to clients connected through channels.

Our app is an online poker game. Multiple players can connect to one room, and the actions they take during gameplay need to be broadcast to all other players in the room. Some players leave partway through and some stay longer.

This works most of the time, but after a while some players stop receiving the messages on time.

For example, suppose there are 4 players in a room and the messages to be sent are “msg1”, “msg2”, “msg3” and “msg4”, in that order. Two players receive them at the correct time: “msg1” first, then “msg2”, then “msg3” and then “msg4”. The other two players only start receiving them after the first two have received all 4 messages.

Time 00:00:00: Player1, Player2 receive "msg1"
Time 00:00:01: Player1, Player2 receive "msg2"
Time 00:00:02: Player1, Player2 receive "msg3"
Time 00:00:03: Player1, Player2 receive "msg4"

Time 00:00:04: Player3, Player4 receive "msg1"
Time 00:00:05: Player3, Player4 receive "msg2"
Time 00:00:06: Player3, Player4 receive "msg3"
Time 00:00:07: Player3, Player4 receive "msg4"

The order in which the messages are received is fine, but the time at which they are received is not. This results in an incorrect game state.

Can you please help us? The pseudocode we have is:

Code for broadcasting game state:

Webapp.Endpoint.broadcast(room_id, "update:gameplay_state", gameplay_state)

Channel code:

intercept ["update:gameplay_state"]

def handle_out("update:gameplay_state", gameplay_state, socket) do
  # push/3 is imported into modules that `use Phoenix.Channel`
  push(socket, "update:state", %{game_play_state: gameplay_state})
  {:noreply, socket}
end

Thanks,
-Ravi

This is a feature of distributed systems. Note that P3 and P4 are still receiving the messages in the correct order; only the timing differs, due to network latency and other factors. I don’t think Phoenix waits for confirmation from all clients before finishing a broadcast. That would block the channel entirely if the message to one client was held up by high latency or a timeout.

If you need to ensure that all players receive message 1 before any player receives message 2, you need to implement some kind of confirmation system yourself. But do check whether you could refactor your client so that it does not matter: if they all get 1-2-3-4, they will synchronise eventually.
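
To make the "confirmation system" idea a bit more concrete, here is a minimal sketch of the bookkeeping it would need, in plain Elixir with no Phoenix calls. The module name AckTracker, its functions and the idea of clients sending back an ack carrying the message's sequence number are all assumptions for illustration, not something Phoenix provides out of the box.

```elixir
defmodule AckTracker do
  @moduledoc """
  Hypothetical helper: tracks which players still have to acknowledge
  the message with sequence number `seq`.
  """

  defstruct seq: 0, pending: MapSet.new()

  # Start waiting for acks on message `seq` from every player in the room.
  def start(seq, players), do: %__MODULE__{seq: seq, pending: MapSet.new(players)}

  # Record an ack for the message we are currently waiting on.
  def ack(%__MODULE__{seq: seq} = tracker, player, seq) do
    %{tracker | pending: MapSet.delete(tracker.pending, player)}
  end

  # Acks for any other sequence number are ignored.
  def ack(%__MODULE__{} = tracker, _player, _other_seq), do: tracker

  # A player who leaves the room should not block everyone else.
  def drop(%__MODULE__{} = tracker, player) do
    %{tracker | pending: MapSet.delete(tracker.pending, player)}
  end

  # True once every remaining player has confirmed the message.
  def complete?(%__MODULE__{pending: pending}), do: MapSet.size(pending) == 0
end
```

The game process would only broadcast message N+1 once complete?/1 returns true for message N. Note that this makes the whole room as slow as its slowest client, so some timeout after which a silent player is dropped would probably be needed as well.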

Hi @Nicd,

Thanks for the reply. I agree on the latency part.

We added logging to handle_out, and we don’t see it being called for the P3 and P4 player sockets for the earlier messages. If I understand correctly, the socket push should at least be called for every connected client, for each message in turn. Is that right?

Thanks,
-Ravi

I’m unsure about the semantics of handle_out. I would expect it to be called for every message whose pattern match succeeds.

To use handle_out you have to intercept the specific messages first (it introduces overhead, which is why you have to opt in).

As a note, messages passed from one process to another on the BEAM always arrive in order, but messages from different senders can interleave, and once you go over a websocket all bets are off. If you want ordering in your messages, include an ordering field (like a time) with them (this is standard websocket practice, nothing Phoenix-specific here). :)
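
As a rough illustration of the ordering-field idea, here is a small sketch of a receiver-side buffer that releases messages strictly in sequence order. It assumes the sender stamps every message with an integer `seq` starting at 1; the SeqBuffer module and the `seq` and `body` fields are made up for this example. It is written in Elixir to match the thread, but the same logic would normally live in the client.

```elixir
defmodule SeqBuffer do
  @moduledoc """
  Hypothetical receiver-side buffer that hands messages to the game
  logic in `seq` order, holding back any that arrive early.
  """

  # `next` is the seq we expect to deliver next; `held` stores early arrivals.
  def new(first_seq \\ 1), do: %{next: first_seq, held: %{}}

  # Accept one message; returns {messages_ready_in_order, new_buffer}.
  def receive_msg(%{next: next} = buf, %{seq: seq}) when seq < next do
    # Duplicate or stale message: nothing new to deliver.
    {[], buf}
  end

  def receive_msg(buf, %{seq: seq} = msg) do
    flush(%{buf | held: Map.put(buf.held, seq, msg)}, [])
  end

  # Release consecutive messages starting at `next` until we hit a gap.
  defp flush(%{next: next, held: held} = buf, acc) do
    case Map.pop(held, next) do
      {nil, _held} -> {Enum.reverse(acc), buf}
      {msg, rest} -> flush(%{buf | next: next + 1, held: rest}, [msg | acc])
    end
  end
end
```

With this in place, a message with seq 2 that arrives before seq 1 is simply held until seq 1 shows up, and then both are applied in order. It fixes ordering only; it does nothing about one client being several seconds behind another.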


Thank you @OvermindDL1.

We will try to avoid intercept and see if that helps. We didn’t anticipate this much delay due to intercept.
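
For reference, one way to drop the intercept is to build the final payload before broadcasting, so the channel no longer needs a handle_out clause to reshape it. The sketch below only covers the payload shaping; the GameplayBroadcast module is a made-up name, and whether this actually removes the delay seen above is an assumption that would need measuring.

```elixir
defmodule GameplayBroadcast do
  @moduledoc """
  Hypothetical helper that produces the event name and payload the
  clients already expect, so no intercept/handle_out step is needed.
  """

  @event "update:state"

  # Event name the clients listen for (taken from the original handle_out).
  def event, do: @event

  # Same payload shape the original handle_out pushed to the socket.
  def payload(gameplay_state), do: %{game_play_state: gameplay_state}

  # In the app, the broadcast call from the first post would then become:
  #
  #   Webapp.Endpoint.broadcast(
  #     room_id,
  #     GameplayBroadcast.event(),
  #     GameplayBroadcast.payload(gameplay_state)
  #   )
  #
  # and the `intercept` line and handle_out clause can be deleted.
end
```

Without an intercept, Phoenix pushes the broadcast to each subscriber without going through each channel's handle_out, which is the overhead mentioned earlier in the thread.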