Scaling LiveView for low latency in multiple regions

So we’re using LiveView in production and it’s mostly been fantastic. BUT, we’re UK-based and serving the UK market from GCP europe-west2, i.e. London. We’ve currently got devs in NZ, and they’ve said that the site is, as you’d imagine, really slow from there.

What are people’s thoughts on scaling LiveView across multiple regions? It’s not an immediate problem for us, but it was a known trade-off when we decided to adopt it; so far I think the productivity and the quality of the UK experience have paid off. We’re currently using GKE, running stateless Elixir pods with nothing shared or distributed between them. This is working well for us and hasn’t given us any problems with the sockets. Given this, my first thought is just to expand the Kube cluster into other regions and use node affinity and Google’s global load balancer routing to sort it out…but I haven’t tried this or anything similar yet.

Has anyone had experience with this? It feels like the final hurdle for LiveView in some ways, so it’d be good to know how people have handled it and whether it’s been smooth.

4 Likes

What backing store do you have? Once you start to geolocate your nodes, you’re changing the latency path from User<->Server to User<->"Edge Server"<->Database Server (for example).

Is this actually a LiveView-specific problem? I’d imagine you’d have the same latency with every other tool as well. The only truly LiveView-related thing I can see being problematic in such a setting is the initial double request of HTTP + WebSocket connection.

1 Like

Good point, I had considered that but forgot to write it…so yeah, assume at least read replicas in each region too, I guess? Postgres is the DB. We’re e-commerce, so not write-heavy, and we can eat the latency on purchases etc. to a main master.
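A rough sketch of what that could look like using Ecto’s support for multiple repos: one primary for writes and a read-only repo per region pointing at the local replica. All module names here are assumptions for illustration, not our actual setup:

```elixir
defmodule MyApp.Repo do
  use Ecto.Repo,
    otp_app: :my_app,
    adapter: Ecto.Adapters.Postgres
end

# A read-only repo pointing at the in-region Postgres replica; writes still
# go to MyApp.Repo (the primary in the main region) and eat the latency.
defmodule MyApp.Repo.Replica do
  use Ecto.Repo,
    otp_app: :my_app,
    adapter: Ecto.Adapters.Postgres,
    read_only: true
end

# Usage:
#   MyApp.Repo.insert(changeset)        # cross-region write to the primary
#   MyApp.Repo.Replica.all(Product)     # low-latency read from the local replica
```

Each repo gets its own database URL in config, so the same release can point its replica repo at whichever region it is deployed in.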

1 Like

Maybe not LiveView-specific, as yeah, any client-server application will suffer from latency, but since LiveView mixes UI and data in what it returns from the server, the experience could be worse than an SPA calling the server just for data.

Another reason I feel it’s a bit special for Elixir is that the BEAM and OTP can be distributed, but I don’t have much experience with that, so I’m unsure whether it would help or hinder here.

1 Like

Generally payload sizes should at least be comparable, but I can see a difference between e.g. loading a big chunk of items and paginating purely client-side, vs. keeping the initial request light but going back to the server for sorting/filtering and such.

1 Like

I’m surprised this question didn’t come up earlier; I even wondered whether Elixir devs actually serve apps worldwide. Demoing LiveView on localhost:4000 is not impressive, and I only hear stories about single-region clusters.

Ok, real answer: I just use fly.io (alternative: appfleet.com), though they are not cheap, so I have my prod there, and staging on DO’s PaaS (the App Platform) where the DB and app server sit together. The experience so far is that those edge servers are genuinely valuable, fast, and suitable for LiveView. But the edge server <-> DB distance is still tricky. So I read from the edge server and respond first, then (via send to self and handle_info) read, merge, and write that result back to the DB. Mostly the initial response and the one written to the DB are consistent. If there’s a conflict later, I just respond with an error, following the earlier optimistic UI.
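Roughly, the respond-first/persist-later part looks like this in a LiveView. This is only a sketch of the pattern described above; the module, event, and context function names are invented:

```elixir
defmodule MyAppWeb.CartLive do
  use Phoenix.LiveView

  def mount(_params, _session, socket) do
    {:ok, assign(socket, items: [], cart_id: nil)}
  end

  def handle_event("add_item", %{"id" => id}, socket) do
    # 1. Update the UI immediately with locally-known data (optimistic).
    socket = update(socket, :items, &[id | &1])
    # 2. Defer the slow cross-region DB write; the diff goes to the
    #    client before the write happens.
    send(self(), {:persist, id})
    {:noreply, socket}
  end

  def handle_info({:persist, id}, socket) do
    # 3. Read, merge, and write back to the remote primary. On conflict,
    #    surface an error to the user (the optimistic UI was already shown).
    case MyApp.Carts.persist_item(socket.assigns.cart_id, id) do
      {:ok, _} -> {:noreply, socket}
      {:error, _} -> {:noreply, put_flash(socket, :error, "Could not save item")}
    end
  end

  def render(assigns) do
    ~H"""
    <ul><%= for id <- @items do %><li><%= id %></li><% end %></ul>
    """
  end
end
```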

I also plan to add something like Cachex in prod for reads (for the optimistic UI), while still writing back to the DB. My logic doesn’t sit near the DB; it’s supposed to produce a legit result in order to write to the DB, and otherwise respond with an error or blow up, to maintain data consistency.


For worldwide LiveView users with a smooth UX/UI, I’d love to hear about other choices as well. I think most folks just target local users, run on-prem, or set latency expectations only slightly better than a DB read/write on every request.

2 Likes

Distribution does not magically solve the problem of a high-latency connection from someone far away; you’ll still need e.g. a load balancer in front (or maybe region-specific DNS settings, or both?) to make sure people connect to the server closest to them.

Distribution does, however, allow people to e.g. be notified of each other’s changes without a database round-trip, which might be very valuable depending on what you are building.
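As a sketch, that kind of cross-node notification is what Phoenix.PubSub gives you once the BEAM nodes are clustered: a broadcast from any node reaches subscribers on every node without touching the database. The topic and module names here are assumptions:

```elixir
defmodule MyAppWeb.StockLive do
  use Phoenix.LiveView

  def mount(_params, _session, socket) do
    # Subscribe once the socket is connected; with clustered nodes,
    # broadcasts from any node in any region arrive here.
    if connected?(socket), do: Phoenix.PubSub.subscribe(MyApp.PubSub, "stock:updates")
    {:ok, assign(socket, qty: nil)}
  end

  def handle_info({:stock_changed, qty}, socket) do
    {:noreply, assign(socket, qty: qty)}
  end

  def render(assigns) do
    ~H"""
    <span>In stock: <%= @qty || "?" %></span>
    """
  end
end

# Elsewhere, on any node in the cluster:
#   Phoenix.PubSub.broadcast(MyApp.PubSub, "stock:updates", {:stock_changed, 5})
```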

For the initial page load, LiveView does support serving it e.g. through a CDN.

And besides this: in general (it of course depends on your particular application) I think it is a good idea for a growing interactive application to decouple the database from your application logic as much as possible. In essence: do not let requests to the DB block usage of the UI; perform them asynchronously and display the results whenever they become available (even if this takes a second or more). This ensures the UI experience stays smooth even when results take a while to be fetched.
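A minimal sketch of that async pattern in a LiveView: render a placeholder immediately, kick off the fetch after connecting, and fill the results in via handle_info/2. The `MyApp.Catalog.list_products/0` context function is an assumption standing in for any slow query:

```elixir
defmodule MyAppWeb.ProductsLive do
  use Phoenix.LiveView

  def mount(_params, _session, socket) do
    # Only trigger the load once the WebSocket is connected; the first
    # (disconnected) render goes out immediately with the placeholder.
    if connected?(socket), do: send(self(), :load_products)
    {:ok, assign(socket, products: nil)}
  end

  def handle_info(:load_products, socket) do
    # The slow DB round-trip happens here, after the first render,
    # so it never blocks the UI from appearing.
    {:noreply, assign(socket, products: MyApp.Catalog.list_products())}
  end

  def render(assigns) do
    ~H"""
    <%= if @products do %>
      <ul><%= for p <- @products do %><li><%= p.name %></li><% end %></ul>
    <% else %>
      <div>Loading…</div>
    <% end %>
    """
  end
end
```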

2 Likes

Correct. This is not a LiveView-specific problem, and it is going to happen with whatever app runs between EU<->NZ. The LiveView specifics in this discussion are:

  1. The initial request is rendered twice: once with a regular HTTP request and once over WebSockets

  2. All upcoming live_patch/live_redirect calls happen on the established connection. This improves UX because we don’t send the layout again, nor does the browser have to load it again (this is similar to what you get with SPAs, Turbolinks, unpoly, etc.)

  3. LiveView automatically caches and reuses templates on the client, so it sends less data than other server-rendered HTML solutions (similar to what you get with an SPA)

  4. LiveView runs on WebSockets, which means we don’t need to parse headers, authenticate the user against the DB, and so on for every request, which improves response times (you can get something similar with an SPA if it runs over WebSockets)

So besides the initial request, LiveView should be helping with the user experience, but the discussion is definitely more general. So it probably makes more sense to open the discussion beyond the context of LiveView.

EDIT: Oh, you can also call liveSocket.enableLatencySim(200) in your browser console to have LiveView simulate latency so you can see which part of your app is not providing proper UX under high latencies. :slight_smile:

12 Likes

I wonder whether we can do something like this in mount to distinguish a first connect from a reconnect
(more likely there’s a reason this can’t be done, but I don’t know):

socket = 
  if get_connect_params(socket)["_mounts"] == 0, do: mark_no_render(socket), else: socket

If the goal is to optimize at this level, I think it is easier to skip the “disconnected render” and render something like “Loading…” (or nothing) instead. You can do so if your pages are private (i.e. they require login) or based on user agent, etc. Although I haven’t really seen a need for this in practice yet.

5 Likes

So besides the initial request, LiveView should be helping with the user experience, but the discussion is definitely more general.

@josevalim that’s of course if you are comparing server-side rendered templates served via standard HTTP requests. I think that’s valid, but another point of comparison is SPA, where latency doesn’t matter until you actually need to make a server round trip to save or load some data.

In LiveView you need round trips on each UI state change (if you live up to the promise of not writing JS code); that’s why latency is a bigger issue in LiveView compared to standard React apps.

I’d love to hear your angle on this, since it’s one of the biggest showstoppers for LiveView IMO.

IMO this is not really a promise LiveView tries to make, exactly for that reason. It’s always been marketed to solve (primarily) those problems where the server needs to be involved anyway. As mentioned before there is overlap (e.g. in pagination), but generally one should use client-side tooling for purely client-side interactions. Nobody wants latency in opening dropdowns or simple accordions.

7 Likes

Exactly what @LostKobrakai said. In fact, the docs even say you should not be using LiveView for UI-only state changes. To quote directly:

animations, menus, and general events that do not need the server in the first place are a bad fit for LiveView

CSS, support for hooks, and even the recent callback for integration with Alpine.js are alternative mechanisms so you don’t funnel UI-only behaviour through the server.
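For completeness, newer LiveView versions also ship Phoenix.LiveView.JS commands, which let you express these UI-only interactions in the template while still executing them purely client-side. A sketch, assuming a LiveView version that includes the `JS` module:

```elixir
defmodule MyAppWeb.MenuComponent do
  use Phoenix.Component
  alias Phoenix.LiveView.JS

  # JS.toggle/1 is executed entirely in the browser: clicking the button
  # never hits the server, so there is no round-trip latency.
  def menu(assigns) do
    ~H"""
    <button phx-click={JS.toggle(to: "#menu")}>Menu</button>
    <ul id="menu" hidden>
      <li>Item one</li>
      <li>Item two</li>
    </ul>
    """
  end
end
```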

2 Likes

That makes perfect sense to me, thanks for the clarification.

I guess this misunderstanding comes from how LiveView is described in some sources, as a replacement for JavaScript for building web apps:

As an application developer, you don’t need to write a single line of JavaScript to create these kinds of experiences

Phoenix LiveView leverages server-rendered HTML and Phoenix’s native WebSocket tooling so you can build fancy real-time features without all that complicated JavaScript. If you’re sick to death of writing JS (I had a bad day with Redux, don’t ask), then this is the library for you!

1 Like