Phoenix Blog Post: Failing Big with Elixir and LiveView - A Post-Mortem

Here’s the story of how one of the world’s first production deployments of LiveView came to be - and how trying to improve it almost caused a political party in Germany to cancel their convention.

I wrote this post just a few days after the event took place. As annoying as it was, it was a good teachable moment. And soon I’ll write an update with a tutorial on how to scale to 5,000 concurrent LiveView users on a single VPS :)


Posted via Devtalk (see this thread for details).


Fail your way to success

TL;DR for those who haven’t read the well-written article yet:

  1. Avoid large payloads in Phoenix.PubSub if possible
  2. Throttle PubSub events at the sender level to avoid clogged process inboxes
  3. Using assign/3 in LiveView always causes an update over the WebSocket, even if no changes were made
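
For point 2, here is a minimal sketch of what sender-side throttling could look like. It is not the article's implementation: the module name `ThrottledBroadcaster`, the `:publish` and `:interval_ms` options, and the one-second default are all assumptions made for illustration. The idea is that callers report changes cheaply, and the process publishes at most once per interval no matter how many changes arrived.

```elixir
defmodule ThrottledBroadcaster do
  @moduledoc """
  Hypothetical sketch: coalesces change notifications and publishes
  at most once per interval.
  """
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Callers report a change; this never publishes directly.
  def notify(server), do: GenServer.cast(server, :changed)

  @impl true
  def init(opts) do
    state = %{
      # Zero-arity function that does the actual publishing, e.g.
      # fn -> Phoenix.PubSub.broadcast(MyApp.PubSub, "users", :users_changed) end
      # (MyApp.PubSub and the "users" topic are assumed names).
      publish: Keyword.fetch!(opts, :publish),
      interval: Keyword.get(opts, :interval_ms, 1_000),
      timer: nil
    }

    {:ok, state}
  end

  @impl true
  # First change in a window: schedule a single flush.
  def handle_cast(:changed, %{timer: nil} = state) do
    timer = Process.send_after(self(), :flush, state.interval)
    {:noreply, %{state | timer: timer}}
  end

  # A flush is already scheduled, so further changes are coalesced into it.
  def handle_cast(:changed, state), do: {:noreply, state}

  @impl true
  def handle_info(:flush, state) do
    state.publish.()
    {:noreply, %{state | timer: nil}}
  end
end
```

With this shape, a burst of 100 changes inside one interval produces a single publish, so subscriber inboxes see one message instead of 100.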

Very unfortunate title choice for an otherwise interesting post that praises LiveView.


Regarding point #1, it seems like the argument here is that instead of one process querying the DB and sending the result to the other processes as a PubSub payload, it’s better to have the process send a message without a payload and let each process make its own DB query. Is this always true? I sometimes use payloads in my PubSub broadcasts to minimize DB queries; now I’m wondering if this was a mistake.


Not quite the proper takeaway for all points:

  1. Large payloads over PubSub are fine by themselves. We actually broadcast entire ETS tables of presence information when we replicate data for new nodes, so payload size alone isn’t the problem.

  2. This is the key takeaway here. Publishing events at scale must consider downstream subscribers. In this case, the mistake was publishing the entire user list for every individual user change rather than deltas of joins/leaves (or presence).

  3. We won’t send anything on the wire if there is an empty diff, so I’d need to know more about this particular case.
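
To make point 2 concrete, here is a rough sketch of broadcasting deltas instead of the full list. It is only an illustration: `UserListDelta`, `diff/2`, and `apply_delta/2` are hypothetical names, users are assumed to be represented by plain IDs, and this is not how Phoenix.Presence works internally. The publisher computes which IDs joined and left, and each subscriber applies that small delta to its own copy of the list.

```elixir
defmodule UserListDelta do
  @moduledoc """
  Hypothetical helper: turns two snapshots of user IDs into a small
  joins/leaves delta, and applies such a delta on the subscriber side.
  """

  # Publisher side: compute what changed between two snapshots.
  def diff(old_ids, new_ids) do
    old = MapSet.new(old_ids)
    new = MapSet.new(new_ids)

    %{
      joined: new |> MapSet.difference(old) |> Enum.sort(),
      left: old |> MapSet.difference(new) |> Enum.sort()
    }
  end

  # Subscriber side: update the local copy from the delta alone.
  def apply_delta(ids, %{joined: joined, left: left}) do
    ids
    |> MapSet.new()
    |> MapSet.union(MapSet.new(joined))
    |> MapSet.difference(MapSet.new(left))
    |> Enum.sort()
  end
end
```

For a list of several thousand users where one person joins, the message carries a single ID rather than the whole list, which is the difference downstream subscribers feel.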


I broadcast 5,000 updates and my laptop didn’t crash.

Takeaway #1 for me: test under conditions that are close to reality.


In my case: No. The state is being requested from a GenServer and not a DB.

I’m sorry it’s being perceived that way - and I’ve heard the same from a couple of commenters online. I understand how the title could be misread as disparaging LiveView, which was absolutely not my intention (as is hopefully obvious when you read the article).

The problem is that if there is a change in the assigns which doesn’t result in a change in the template (because the assign isn’t used in the template), it still sends an update to the browser.

That’s a very interesting point, thank you :) Maybe the takeaway shouldn’t be a general discouragement of large payloads but rather to be extra mindful of whether they are necessary.


This was definitely my thought when I read the diagnosis of the crash. Was there a design story here that really called for every single update to be printed in real time to begin with? What would an admin user even do with such information? If you are sending so much data that even the browser can’t keep up, what is the human user going to do with it?


The problem is that if there is a change in the assigns which doesn’t result in a change in the template (because the assign isn’t used in the template), it still sends an update to the browser.

I think this is related to this issue: Add non_rendering_assigns (Issue #1386, phoenixframework/phoenix_live_view on GitHub).

After submitting that, I think it’s still pretty odd that anything is sent, because the diff should be empty. Are you perhaps showing the current time in the UI, or something else that changes on every render, so that the diff is never empty? If not, maybe you should submit a bug report on the phoenixframework/phoenix_live_view issue tracker on GitHub.

Oh wow, thanks for finding that issue. What I was describing is exactly what Jose describes in that issue.