How to scale LiveView for a social site like Hacker News

Hi there,
I am an Elixir/Phoenix noob and I am trying to wrap my head around how to build a live version of the orange site with LiveView.

Goal: let’s say for each post comments appear in real-time as they are posted. So for a hot post we can get to ca. 100-1000 comments in a short amount of time. And maybe 10K-100K users would watch the comments pop up in realtime.

Problem: I played around with mix phx.gen.live and figured that naively showing 100-1000 basic comments adds up to some 100KB-1MB of memory for the LiveView.Channel process per active user, as seen in LiveDashboard. So at the low end we need 100KB * 10,000 users = 1GB RAM, and at the high end 1MB * 100,000 users = 100GB RAM.

I have seen multiple posts on this forum that say you should use temporary assigns. From what I understand, the initial comments would then be flushed from memory after mount / first render, but if a user joins the party early, then

def handle_info({:update_message, message}, socket) do
  {:noreply, update(socket, :messages, fn messages -> [message | messages] end)}
end

would build up the same in-memory list of 100-1000 comments, right?

Question: is there a way to structure LiveView in this case without needing a 100GB RAM server? Or is this in the end the price of realtime?

Thanks a lot in advance!!

I just use Elixir as a hobby. But because data in Elixir is immutable and each LiveView is its own process, {:noreply, update(socket, :messages, fn messages -> [message | messages] end)} only updates the memory of a single LiveView connection. You need to store these messages in another process, ETS, or a database to share them between users connected through LiveView. That single data source, be it another process or a database, would be the source you read the comments from when using temporary assigns.

you are on the right track of not “keeping” the data, but passing it through…

for the messages you would keep them in (shared/separate) memory in an ets table (or similar) - the liveview would just stream from what is already in memory… (as @wanton7 also said)
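
A minimal sketch of that shared-memory idea, assuming a hypothetical CommentCache module (the module name, table name, and function names are made up for illustration, not taken from any library). One GenServer owns a named ETS table and serializes writes, while every LiveView process reads from the table directly, so the comments live in memory once instead of once per connected user:

```elixir
defmodule CommentCache do
  @moduledoc """
  Hypothetical shared comment store: one ETS table, owned by this
  GenServer, read directly by any number of other processes.
  """
  use GenServer

  # Assumed table name, chosen for this sketch only.
  @table :comment_cache

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Writes go through the owner process, so they are serialized.
  def put(post_id, comment) do
    GenServer.call(__MODULE__, {:put, post_id, comment})
  end

  # Reads bypass the GenServer and run in the caller's process,
  # so many readers can look up comments at the same time.
  def list(post_id) do
    @table
    |> :ets.lookup(post_id)
    |> Enum.map(fn {_post_id, comment} -> comment end)
  end

  @impl true
  def init(_opts) do
    # :bag allows several comments under the same post_id key.
    # :protected means only the owner writes, but any process may read.
    table = :ets.new(@table, [:named_table, :bag, :protected, read_concurrency: true])
    {:ok, table}
  end

  @impl true
  def handle_call({:put, post_id, comment}, _from, table) do
    true = :ets.insert(table, {post_id, comment})
    {:reply, :ok, table}
  end
end
```

After CommentCache.start_link(), a call like CommentCache.put("post-1", %{id: 1, body: "first"}) stores a comment and CommentCache.list("post-1") reads it back from any process. In a real app the cache would sit under the supervision tree, and the database would remain the persistent source.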

for client sanity/performance you would most likely chunk it like in this pr: "⚡ Send data received from metrics history provider in chunks to make …" (phoenixframework/phoenix_live_dashboard@2435a31)

(and thus you could/would cache the chunks as well)
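
A rough sketch of what chunking could look like in plain Elixir, assuming a hypothetical CommentChunks helper and a made-up {:comments_chunk, chunk} message shape. This is not the code from the PR, only the general idea of splitting a large list and delivering it piece by piece:

```elixir
defmodule CommentChunks do
  @moduledoc """
  Hypothetical helper: splits a comment list into fixed-size chunks
  and sends each chunk to a process as its own message.
  """

  # chunk_size defaults to 50; the number is an arbitrary assumption.
  def send_in_chunks(pid, comments, chunk_size \\ 50) do
    comments
    |> Enum.chunk_every(chunk_size)
    |> Enum.each(fn chunk -> send(pid, {:comments_chunk, chunk}) end)
  end
end
```

A LiveView that receives these messages would handle each one in handle_info/2 and render only that chunk, so neither the client nor the process has to deal with the full list in one go.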

Hey @wanton7, @outlog - thanks for your input! Really cool stuff!
Yeah, makes a lot of sense as it is all exactly the same redundant data in each LiveView.Channel after all.

I am really new to Elixir/Phoenix (zero OTP knowledge so far, sorry :see_no_evil:), but maybe you can point me in the right direction on the following:

  1. Most importantly - are there any downsides and gotchas to your proposed approach? If it works, then it reduces the memory footprint significantly for many apps and probably could be rolled upstream into the framework, so why isn’t it the default in Phoenix LiveView?
  2. What is the nuance between using another process, ETS, or a database to store the data? Should I just use a GenServer as the simplest thing for starters?
  3. “streaming messages” → that means Phoenix.PubSub, right? Sorry if it is an even more stupid question, but what is the cost of this operation? I suppose the premise is that subscribing to PubSub in a LiveView is much cheaper than holding redundant state in the LiveView process.

Thanks a lot again!

  1. It’s not always/usually required, doesn’t make sense for lots of applications and as you can see from the PR, it’s fairly trivial to implement.

  2. I’m not sure there is a choice in this case other than persisting the data, so it’s going in the database. Caching usually happens in an ETS table, which is usually owned by a GenServer. You don’t cache anything that is read heavy in a GenServer because it can’t have multiple simultaneous readers (it processes messages one by one). An ETS table can.

  3. All they’re doing in that PR is sending data to a chart in chunks using a normal method of updating a component. Each new visitor is going to hit the cache/db initially. PubSub can be used to update the view once it’s loaded, not the initial load.
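
Putting the answers above together, here is a hedged sketch of a LiveView that loads the initial comments from a shared source, subscribes to PubSub for later updates, and uses temporary assigns so the list is not kept in the process. It assumes a LiveView version that supports ~H and phx-update="prepend" (newer versions offer streams for the same purpose). MyAppWeb.CommentsLive, MyApp.PubSub, MyApp.Comments.list_for_post/1, the topic format, and the :id / :body fields are all hypothetical names for illustration:

```elixir
defmodule MyAppWeb.CommentsLive do
  use Phoenix.LiveView

  def mount(%{"post_id" => post_id}, _session, socket) do
    # Subscribe only on the connected (websocket) mount.
    # The writer side would call something like:
    #   Phoenix.PubSub.broadcast(MyApp.PubSub, "comments:" <> post_id, {:update_message, comment})
    if connected?(socket) do
      Phoenix.PubSub.subscribe(MyApp.PubSub, "comments:" <> post_id)
    end

    # Initial load comes from the shared cache/db (hypothetical function).
    comments = MyApp.Comments.list_for_post(post_id)

    # :messages is reset to [] after every render, so the process
    # does not hold on to the rendered comments.
    {:ok, assign(socket, :messages, comments), temporary_assigns: [messages: []]}
  end

  def handle_info({:update_message, message}, socket) do
    # Because :messages was reset to [] after the last render,
    # this list holds only the new message, not the whole history.
    {:noreply, update(socket, :messages, fn messages -> [message | messages] end)}
  end

  def render(assigns) do
    ~H"""
    <div id="messages" phx-update="prepend">
      <%= for m <- @messages do %>
        <p id={"message-#{m.id}"}><%= m.body %></p>
      <% end %>
    </div>
    """
  end
end
```

The client keeps the already rendered comments in the DOM and new ones are prepended, which is why each child element needs a stable id.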

Maybe you’ll find some useful tips in The Road to 2 Million Websocket Connections in Phoenix - Phoenix Blog.

In addition I also recommend, without any specific order:

Hey everybody, thanks for your ideas!
I read through the PR in some detail and figured out the following:

  • This chunking technique is cool, but I guess it’s not the point, because they call send_update function from the LiveView.
  • But(!) the send_update function seems to be meant to share state down to components and not between LiveViews as discussed in this thread Intended use of LiveView.send_update/2 - #6 by tfwright.
  • So in the PR they send updates from Telemetry ETS to the Chart component with send_update asynchronously, but(!) the data is still assigned to the component’s state in the end.

So I guess that doesn’t solve my initial goal of getting 100-1000 comments out of the assign state. I need to learn more … :nerd_face: