Handling Channels/LiveView LongPoll fallback "unmatched topic"

I’m investigating LiveView JS client errors in which the Socket/LiveSocket fails with reason: {"reason":"unmatched topic"}.

I modified Phoenix 1.8.1 to emit telemetry events when the “unmatched topic” condition happens server-side, and collected some of those events (including the socket and reply structs) from production traffic for debugging.
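For context, the collection side is nothing fancy: a standard :telemetry handler attached to the event name my patch executes (the event itself comes from my patch, not stock Phoenix; the module and handler names below are illustrative):

```elixir
defmodule MyAppWeb.UnmatchedTopicLogger do
  require Logger

  # Event name my patched Phoenix executes; not a stock Phoenix event.
  @event [:phoenix, :socket_unmatched_topic]

  def attach do
    :telemetry.attach("unmatched-topic-logger", @event, &__MODULE__.handle_event/4, nil)
  end

  def handle_event(@event, _measurements, %{reply: reply, socket: socket}, _config) do
    Logger.warning(
      "unmatched topic #{inspect(reply.topic)} on transport #{inspect(socket.transport)}"
    )
  end
end
```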

It turns out a batch of these errors were all related to the :longpoll transport. I’m trying to understand the underlying cause, and whether I can safely ignore them on the assumption that the channel connection will be re-established later anyway.

I have a proxy (Cloudflare) between users and the server.

This is one of the events, PII scrubbed:

```elixir
%{
  name: [:phoenix, :socket_unmatched_topic],
  metadata: %{
    reply: %Phoenix.Socket.Reply{
      topic: "lv:phx-GGJYSitVRluvhDWD",
      status: :error,
      payload: %{reason: "unmatched topic"},
      ref: "92",
      join_ref: nil
    },
    socket: %Phoenix.Socket{
      assigns: %{},
      channel: nil,
      channel_pid: nil,
      endpoint: MyAppWeb.Endpoint,
      handler: Phoenix.LiveView.Socket,
      id: nil,
      joined: false,
      join_ref: nil,
      private: %{
        connect_info: %{
          x_headers: [
            # ...
          ]
        }
      },
      pubsub_server: MyApp.PubSub,
      ref: nil,
      serializer: Phoenix.Socket.V2.JSONSerializer,
      topic: nil,
      transport: :longpoll,
      transport_pid: #PID<0.33718.0>
    }
  },
  measurements: %{system_time: 1757064975931223300}
}
```

Has anybody explored this area before and can help shed some light?
What additional data could I gather?

My current thought is that the client either misses a ping/heartbeat or the proxy terminates a connection early, causing the server process backing the channel connection to go away. Presumably the client then reconnects, spawning a new process and a new "lv:..." topic, and the user doesn’t even notice :crossed_fingers:
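If that theory holds, the relevant knob on the Phoenix side would be the long-poll transport configuration in the endpoint. Something like this (options as documented for Phoenix.Endpoint.socket/3; @session_options as in the generated endpoint, values illustrative):

```elixir
# In MyAppWeb.Endpoint. window_ms is how long the server holds each
# poll request open waiting for new messages (default 10_000 ms), so
# a shorter window keeps individual requests well under any proxy
# idle timeout. Values here are illustrative.
socket "/live", Phoenix.LiveView.Socket,
  websocket: [connect_info: [session: @session_options]],
  longpoll: [
    connect_info: [session: @session_options],
    window_ms: 10_000
  ]
```

The open question is how Cloudflare’s idle/request timeouts line up with window_ms and with the interval between successive polls.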

It would help me to reproduce this locally, so any tips / shortcuts are welcome :slight_smile:
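One shortcut I’m considering: force the client onto long-poll (e.g. by passing transport: LongPoll when constructing the LiveSocket in app.js) and then kill the stateful server process from IEx, to see whether the client produces the same error. A rough sketch, assuming the long-poll server module is still Phoenix.Transports.LongPoll.Server in 1.8 and relying on the :"$initial_call" process-dictionary entry that proc_lib sets:

```elixir
# In IEx on the running node: find and kill every long-poll server
# process, simulating the stateful socket process going away.
# The module name is an assumption about Phoenix internals.
for pid <- Process.list(),
    {:dictionary, dict} <- [Process.info(pid, :dictionary)],
    dict[:"$initial_call"] == {Phoenix.Transports.LongPoll.Server, :init, 1} do
  Process.exit(pid, :kill)
end
```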

References collected:

In order to preserve the state of the user’s connected socket and to preserve the behaviour of a socket being long-lived, the user’s process is kept alive, and each long-poll request attempts to find the user’s stateful process. If the stateful process is not reachable, every request will create a new process and a new state, thereby breaking the fact that the socket is long-lived and stateful.

Clients subscribe to topics, and Phoenix stores those subscriptions in an in-memory ETS table. If a channel crashes, the clients will need to reconnect to the topics they had previously subscribed to. Fortunately, the Phoenix JavaScript client knows how to do this. The server will notify all the clients of the crash. This will trigger each client’s Channel.onError callback. The clients will attempt to reconnect to the server using an exponential backoff strategy. Once they reconnect, they’ll attempt to rejoin the topics they had previously subscribed to. If they are successful, they’ll start receiving messages from those topics as before.

For some reason, I’m observing cases in which those topics are gone when the client sends a long-poll request. That would fit the first quote above: if the stateful process is unreachable, the poll request creates a fresh process, and the socket in the event I captured does look brand new (joined: false, channel: nil, join_ref: nil).
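Which suggests one more data point I could gather myself: Phoenix ships a stock [:phoenix, :socket_connected] telemetry event whose metadata includes the transport, so I could log those and check whether a :longpoll connect lands right before each unmatched-topic error. A quick sketch:

```elixir
# Log stock socket-connect events; a :longpoll connect appearing just
# before an unmatched-topic error would support the theory that the
# stateful process was replaced.
:telemetry.attach(
  "socket-connected-logger",
  [:phoenix, :socket_connected],
  fn _event, _measurements, metadata, _config ->
    IO.inspect(metadata.transport, label: "socket connected via")
  end,
  nil
)
```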