Possible payload size improvement to HEEx list comprehensions

Hi everyone! Recently I’ve been thinking a lot about the way HEEx renders lists. People are generally surprised by the huge payloads being sent on any list change, and I had to work around this in one of my recent projects to keep the site responsive.

I think there’s a possible payload optimisation inspired by client-side VDOM implementations. Let me explain with a very simple Phoenix Playground app rendering a list.

Mix.install([{:phoenix_playground, "~> 0.1.3"}])

defmodule DemoLive do
  use Phoenix.LiveView

  def render(assigns) do
    ~H"""
    <button phx-click="add">add</button>

    <ul>
      <li :for={i <- @items}>
        <%= i.name %>
      </li>
    </ul>
    """
  end

  def mount(_params, _session, socket) do
    {:ok, assign(socket, :items, [])}
  end

  def handle_event("add", _params, socket) do
    items = socket.assigns.items
    id = length(items)
    new_item = %{id: id + 1, name: "New#{id + 1}"}
    {:noreply, assign(socket, :items, [new_item | items])}
  end
end

PhoenixPlayground.start(live: DemoLive)

The problem

LiveView has many exciting optimizations, but they mostly don’t apply if you’re using a “:for” loop. In my basic example, on each list update LiveView sends all the elements again, even if most of them are exactly the same as before.

In this case, the problem could be solved trivially by using streams. But in many other cases that’s not feasible because:

  • Streams work only as top-level assigns. If you have a list as an attribute of another object (e.g. Ecto associations like current_user.posts), then using streams is not easy.
  • The streams API is different. You need to know exactly how the list changes, which might complicate your business layer (contexts) just to obtain that information.
  • Streams give you both memory optimization and payload optimization. I believe the second should be available out of the box, even without using streams.

Idea

This problem is not new. Multiple frontend frameworks use a Virtual DOM, where they need to calculate a minimal patch between the VDOM and the DOM. They simply require a :key attribute when rendering a list (Vue.js, for example).

Then it’s rather trivial to figure out:

  • if an element is new (key not present in the old state, but present in the new one)
  • if an element was removed (key present in the old state, but no longer in the new one)
  • if an element was updated (key present in both the old and the new state; then we should calculate the diff recursively for those elements)

So, maybe LiveView could go the same route? If we introduced :key as an optional attribute, the HEEx engine could calculate:

  • added elements (and their positions)
  • removed elements
  • updated elements (and their positions)

and send an efficient, minimal payload to the client. This could actually even enable efficient diffs for components nested below :for loops.
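To make this concrete, here is a rough sketch (purely illustrative, not an existing API) of how the engine could compute those three sets once it has the old and new elements plus whatever key_fun the :key attribute evaluates to:

old_by_key = Map.new(old_items, &{key_fun.(&1), &1})
new_by_key = Map.new(new_items, &{key_fun.(&1), &1})

# key only present in the new state
added = for {k, item} <- new_by_key, not is_map_key(old_by_key, k), do: item
# key only present in the old state
removed = for {k, item} <- old_by_key, not is_map_key(new_by_key, k), do: item
# key present in both, but the element itself changed
updated = for {k, item} <- new_by_key, is_map_key(old_by_key, k), old_by_key[k] != item, do: item

Only the added and updated items would then need to be rendered and sent.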

An example (notice added :key):

  def render(assigns) do
    ~H"""
    <ul>
      <li :for={i <- @items} :key={i.id}>
        <%= i.name %>
      </li>
    </ul>
    """
  end

Implications

  • This would require keeping the previous value of a list assign around in assigns.__changed__, which is currently not the case.
  • This would impose a small overhead of calculating the old keys and the new keys and figuring out what was updated, but only when :key is present. I think it’s worth doing that extra work to send a smaller payload rather than sending everything.
  • Reordering might require some thought.
  • I’m not sure if it would be a breaking change, possibly not?

All in all, I believe that optimization should be possible to accomplish. I might give it a shot myself, I just wanted to ask the community for some feedback first :wink: What’s your opinion?

EDIT: I wasn’t aware this category is only for registered users, could someone move it to Phoenix Questions or some other public place? :see_no_evil: I’d like to refer to this in a GitHub PR / issue if I manage to sit down and tackle it.

5 Likes

There are two challenges in making this practical:

  1. How will your template know that the right side of <- in your comprehensions is state it needs to keep around between renders in order to generate diffs? It is easy to do at the top level (which is what streams do) but hard otherwise. Even if you keep lists in __changed__, how will you diff, for example, a list inside another list? Or a list that is generated by a function or Enum.map? (See the example below.)

  2. If the list is large, you will need to traverse the old and new list in order to find which keys were added and removed. And you have to keep the whole old list in memory for this to work. Both problems are solved by streams.
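For instance (just an illustration), a comprehension whose source is a derived expression rather than a plain assign:

<li :for={u <- Enum.filter(@users, & &1.active)}>
  <%= u.name %>
</li>

Here the traversed list is computed on the fly, so keeping the previous @users in __changed__ is not enough by itself; you would have to re-evaluate the expression against the old assigns to know what was actually rendered before.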

4 Likes

Good points, as always.

  1. We only need to compare with the previously-rendered list, correct? So I’d say it’s the job of __changed__ to keep that old state around until the render is completed. We don’t need to keep keys for each comprehension separately; we could calculate the old keys based on __changed__. There’s one possible problem though: a non-deterministic function used in the list comprehension, based on other state. I wonder how often that happens? :thinking: I don’t think I ever used such a construct.

  2. Correct me if I’m wrong, but isn’t that a small cost to pay? Currently we always traverse the whole list once and render all the elements underneath. With my suggestion we would need to traverse the lists twice (assuming a similar size post-update), but at the same time we could possibly skip rendering many child elements if they were not updated. Regarding memory, we only need to keep the old version around until the render is done, so possibly not for long. We already do that with maps, which might contain lists as well. Also, depending on how exactly GC works, that object might still be around even if we don’t keep it in __changed__. Lastly, the old list and the new one might share many elements, so keeping the old list in memory doesn’t double the memory consumption…

A good question to ask: if it made rendering 5-10% slower for certain use cases but reduced the payload by 90%, would it be worth it? I believe the answer is yes :sweat_smile:

  1. Yes, __changed__ only keeps the list (the assign) but not any transformation you apply to it. Your example of depending on other state that may change is a good one.

  2. You need to traverse the list twice and build maps based on the keys. Then you need to compare the values for each key, and that may not be a cheap operation: to say two values are equal (which may be the majority of cases if the list changes only a little), you need to fully traverse both values. For example, comparing two user structs will compare all fields and all associations recursively.

Plus the memory cost is not the cost of keeping two copies in memory. It is the cost of keeping the whole collection in memory after rendering, until the next render, while streams can purge everything immediately after render.

Anyway, don’t let me be a party pooper. If you feel you can tackle these limitations, go for it!

2 Likes

Is that so? I thought __changed__ is only kept around between the assigns in handle_event etc. and the render, and then discarded, so in most cases for a few milliseconds.

Anyway, I think I’m curious enough to try it :wink: I might ask for some guidance later, if you don’t mind.

You are correct. But in order to have something in __changed__, you need to have the list from the previous render, which is the old state in __changed__. Streams do not require you to do that. In other words, streams allow you to remove the collection from assigns after the render. Your approach requires you to keep the collection in assigns (so it goes into __changed__ on the next render).

Feel free to ask and reach out!

1 Like

What if you store a hash of the object that is represented by the key? It would not allow you to see how an element in the list has changed, but at least it is better than sending the entire list.
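A minimal sketch of that idea (hypothetical bookkeeping, not an existing LiveView API, where items stands for the new collection): keep only key-to-hash pairs between renders and compare hashes to decide which items to resend:

old_hashes = socket.assigns[:item_hashes] || %{}
new_hashes = Map.new(items, fn item -> {item.id, :erlang.phash2(item)} end)

# items whose hash is new or different would need to be re-rendered and sent
changed = for item <- items, old_hashes[item.id] != new_hashes[item.id], do: item
# keys that disappeared could be deleted on the client
removed_ids = Map.keys(old_hashes) -- Map.keys(new_hashes)

Between renders you would then hold one small integer per item rather than the item itself.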

This would reduce the memory footprint only and might add more computational strain.

I’ve had no problems so far storing the associations as an assign after mount/update/handle_* and handling them with a stream. Just have a function that handles setting the assigns.

Having said that, I can understand why OP would want this.

1 Like

I’m aware the assign still has to be in memory. For me that’s not a downside - it’s exactly the same case as simply rendering a list comprehension without using streams.

Just to be clear about my goal - I want to optimize the HTML diff size. My idea won’t help with memory usage, nor with performance (it might be slightly slower). But I believe it’s still worth a shot, since IMO this is the last substantial improvement to be made in that area :+1:

I’m getting familiar with the HEEx engine right now, it’s just that real life gets in the way :wink:

1 Like

If we are going to ask people to add a :key to their comprehensions, wouldn’t it be better to ask them to use streams? I understand this is simpler to use, but if the recommendation will be to prefer streams whenever possible, because they reduce HTML size, perform better, and use less memory, it is hard to argue for its inclusion (and that’s not even considering the complexity of the implementation).

1 Like

I’m not sure I subscribe to the notion that streams are superior on all accounts. Streams work great for what they intend to do, but not everything is modelled as an insert/update/delete event system that can just be forwarded into the stream APIs. Once that’s the case, the suggestion so far has been to “track the keys manually”, at which point I’d argue that no, streams are no longer the better option compared to a built-in key tracking option.

I just want to note that key tracking could be added to streams without holding the collection in memory, at the cost of holding only the bookkeeping keys, so it’s not one versus the other in this regard.

If that’s true, then starting with how the higher-level APIs of these approaches differ is probably more relevant than the low-level implementation details, because then there may be a better option than both mentioned here. At the low level, I think the streams trade-offs are hard to beat :slight_smile:

There is quite a good example in LV: Phoenix.Component.inputs_for. The form abstractions in place right now don’t tell you which items in a list of assocs/embeds changed. The result of those abstractions just gives you a new list. Currently it’s plainly unoptimized, re-rendering all the inputs of all the nested forms. If there were a way to provide a key for individual items, there could be optimizations where unchanged items in the list don’t need to send data or re-render.

I don’t really understand how this is such a controversial topic, given it’s exactly how React, Solid, Vue, … handle the issue of preventing unnecessary re-renders. At least with the current implementation I don’t see streams as a replacement for that. Streams only work if you can push the detection of which record changed upstream. They don’t help in detecting which records of a list changed, which is what this topic is about, by my understanding.

2 Likes

That’s indeed a very good example of a problem.

I think there is some confusion here. I don’t think the problem statement you gave is controversial; however, whether the proposed solution is possible is still controversial/debatable.

To understand precisely what I mean, let’s try to imagine how we could solve this problem. Here is the inputs_for example from the latest docs:

<.form for={@form} phx-change="change_name">
  <.inputs_for :let={f_nested} field={@form[:nested]}>
    <.input type="text" field={f_nested[:name]} />
  </.inputs_for>
</.form>

What you want is, if @form[:nested] represents a collection, to automatically compute the diff. The first immediate problem is that the loop today is inside inputs_for. Maybe we could make the key an argument to inputs_for (whether that is possible is a separate discussion), but let’s assume we introduce a new function that allows us to extract the comprehension out:

<.form for={@form} phx-change="change_name">
  <%= for f_nested <- inputs_for(@form[:nested]), key: "..." do %>
    <.input type="text" field={f_nested[:name]} />
  <% end %>
</.form>

In order for this to work, we need to figure out what is a good key. But let’s keep it simple for now and say the key is the index. This at least avoids sending all inputs when you only change one. It will likely cause everything to be sent when adding or deleting entries. Good enough for now.

Now that we have a key, we need to figure out how to diff the previous value of inputs_for(@form[:nested]) (from the previous render) against the value of inputs_for(@form[:nested]) from the new one. The trouble is: template rendering cannot keep state, so we cannot automatically store the collection we traversed as the old value. You could say: “we have the previous value of @form, so why don’t we evaluate inputs_for(@form[:nested]) again with the old value of @form?”

Well, you could do that for the example so far, but what if it was written as:

for f_nested <- my_inputs_for(@form[:nested], @another_assign), key: ...

Or this:

for f_nested <- my_inputs_for(@form[:nested], some_function()), key: ...

We are now saying that rendering a “keyed-comprehension” needs to reevaluate some expression (the right side of <-) in the “past” with 100% fidelity, so we can compare old and new. That looks very hard to do reliably in practice.

That’s the crux of my argument: I don’t think you can simply key a random comprehension without adding a bunch of restrictions.

A simple solution would be to say: “look, you can only key comprehensions if you read the value directly from an assign, this way we can easily track before and after”. So we have something like this:

for f_nested <- @inputs_for_nested do
  ...
end

And you could set the assign using a special function in LiveView, like this:

socket
|> assign(:form, form)
|> keyed_enum(:inputs_for_nested, inputs_for(form[:nested]), key: ...)

And when you look at it, that’s precisely what streams do!

In fact, assuming inputs_for exists, I could implement keyed_enum on top of streams today. I just need to compare the previous assign with the new one, based on the key, and generate the specific insert/update/delete commands into the stream. All we need is a function called inputs_for that returns a collection of child forms.
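For illustration only, such a keyed_enum could look roughly like this on top of today’s stream functions (keyed_enum itself and the private bookkeeping assign are made up, insert positions are ignored, and the stream is assumed to already be set up on mount; stream_insert and stream_delete are the existing APIs):

def keyed_enum(socket, name, new_items, key: key_fun) do
  old_items = socket.assigns[:keyed_enums][name] || []
  old_by_key = Map.new(old_items, &{key_fun.(&1), &1})
  new_keys = MapSet.new(new_items, key_fun)

  socket =
    Enum.reduce(new_items, socket, fn item, acc ->
      case Map.get(old_by_key, key_fun.(item)) do
        ^item -> acc                        # unchanged, nothing to send
        _ -> stream_insert(acc, name, item) # new or updated item
      end
    end)

  socket =
    Enum.reduce(old_items, socket, fn item, acc ->
      if MapSet.member?(new_keys, key_fun.(item)),
        do: acc,
        else: stream_delete(acc, name, item)
    end)

  # keep the collection around purely so the next call can diff against it
  assign(socket, :keyed_enums, Map.put(socket.assigns[:keyed_enums] || %{}, name, new_items))
end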

So yes, the solution is a controversial topic because there are several trade-offs that need to be considered. And, for the problem statement you gave, the solution lies (IMO) on adding a function completely unrelated to streams or comprehensions.

Looking at React, Solid, Vue, etc. is useful, but ultimately they run on the client and in a vastly different paradigm. I am afraid we cannot ultimately change where the detection of a record change happens, although we could make the generation of a stream from two assigns (old and new) easier.

2 Likes

Those are all fair points – especially in the case of inputs_for.

That’s a very real issue, but it’s also not one that’s unique to for comprehensions. Calling functions within a template is known to break change tracking elsewhere as well. I think we’re in the same boat that this is a problem. Tbh I think that one might just need further additions to the form abstractions, so they could return a plain (deeply nested) data structure to change-track, instead of requiring intermediate function calls in the templates.

But a nested (tree) structure means you cannot track list changes only on top-level assigns like streams do, unless you chunk up the tree into nested live_components and have a stream on each of their top-level assigns. I think that would work, but it doesn’t feel like a nice experience to have to move to a different module/state context wherever there’s a list comprehension to change-track better.

This one is the big part here, I think. It means we cannot determine which list in the assigns needs tracking from its actual usage within a template. I surely missed considering that. I’m wondering if a template could relay such information back at some point, at least where no function calls are involved. Probably not easily done though.

So I see that, for now, this constraint would require us to state upfront, for all assigns, which lists in them should be tracked by which strategy, which to me would already be an improvement. That’s essentially your resolution. I’d love to see if that could work for nested data instead of top-level assigns only, though.

I agree with LostKobrakai here, and that’s the reason why I started this discussion. Streams are superior in all aspects, but only if you can use them. And you can use them efficiently only if your context gives you diffs. Why?

Let’s consider (IMO) the most common case of using list comprehensions - looping through an Ecto collection. For example:

assign(socket, :users, Accounts.list_users())

Let’s say one is not using PubSub to publish the updates / creates / removals (it can get time-consuming if you need to do it for all resources) and would like to periodically refresh the visible users. The easiest way is to simply run the assign again, e.g. in a periodic handle_info:

def handle_info(:refresh, socket) do
  Process.send_after(self(), :refresh, 10000)
  {:noreply, assign(socket, :users, Accounts.list_users())}
end

By default, you’ll send the whole collection back to the client, even if nothing changed :pensive:

Ok, so let’s use streams for that case. Sadly we need to write the diff code ourselves, otherwise streams won’t help with the payload at all. And since we need to do the diff, we don’t save any memory either, because we need the old assigns to calculate it. Do you agree @josevalim?

def handle_info(:refresh, socket) do
  Process.send_after(self(), :refresh, 10000)
  users = Accounts.list_users()
  old_users = socket.assigns.users

  added = for u <- users, do: ...       # users not present in old_users
  removed = for u <- old_users, do: ... # users no longer present in users
  updated = for u <- users, do: ...     # users present in both, but changed

  socket =
    socket
    |> stream(:users, added ++ updated)
    # stream_delete/3 takes a single item, so removals need a reduce
    |> then(fn s -> Enum.reduce(removed, s, &stream_delete(&2, :users, &1)) end)

  {:noreply, socket}
end

So, for a case like this, nobody will consider using streams, because it’s simply too much work just to optimise the payload.

Yes, we could do that. So we could write something like

stream_update(socket, :users, old_users, new_users)

and hopefully it would work - we wouldn’t get any benefits regarding memory usage, but at least the payload would be optimized.

Now imagine that users is not a top-level assign, e.g. an Ecto preload:

assign(socket, :organization, Accounts.get_organization(preload: [:users]))

If we wanted to optimise that, we’d need to introduce a top-level stream just for the users:

org = Accounts.get_organization(preload: [:users])
socket
|> assign(:organization, org)
|> stream(:users, org.users)

Keeping it in sync with org.users is additional overhead for the developer. And still, we haven’t yet arrived at the “worst” case:

assign(socket, :users, Accounts.list_users(preload: [:comments]))

So basically we’d like to render a list comprehension within a list comprehension. Optimising that with streams is a lot of work, so the vast majority of developers won’t even try. On the other hand, adding two key attributes should be trivial for the developer:

<div :for={user <- @users} :key={user.id}>
  <div :for={comment <- user.comments} :key={comment.id}>
     <%= comment.content %>
  </div>
</div>

And this also leads me to another point: streams are not able to optimise the payloads of individual items, because they don’t have the previous value. So consider a stream rendering users like this:

<div :for={{dom_id, user} <- @streams.users} id={dom_id}>
  <img src={user.avatar}/>
  <span><%= user.name %></span>
  <span><%= user.description %></span>
  <span :for={comment <- user.comments} ><%= comment.content %></span>
</div>

If only the name was updated, the stream will still send all of the item’s dynamic values to the client, since it has no way of determining which values were updated. In my case that’s negligible, but I’ve worked with much more complex collections, consisting of many levels of components. Streams improve the payload by sending fewer items, but on the other hand each item itself is not optimized.

I think this is the most important problem you see right now, correct? If we could somehow reliably get the old keys, both when they’re taken from assigns and from some function calls, would you be more keen on supporting that improvement?

Still, doesn’t that issue apply to normal change tracking, and isn’t it already mostly solved there? E.g. if someone renders

<div :if={check_condition(@current_user)}/>

then the function is executed on each render, and I believe it’s optimised as well (e.g. if it evaluated to a truthy value both on this render and the previous one, based on __changed__, only the diff will be sent?). What happens if the function is non-deterministic and might return a different value for the same argument? :thinking:


To summarize, I think streams are not a one-size-fits-all solution, because:

  • you need to have access to diffs, otherwise they only optimize memory usage, not payload size
  • they’re not able to optimize the payloads of individual items
  • you need a separate assign for each stream (a problem if your collection is a field of another assign)
  • they require more verbose and careful code (where exactly do we want to insert that item? what should the limit be?)
  • optimizing lists within lists is very hard with streams

Implementing this correctly in the Phoenix HEEx engine will be very challenging. Still, don’t you think it might be worth at least trying?

3 Likes

I love this argument. :+1:

Those are different things. Change tracking is used exclusively to say whether a part needs to be recomputed or not (it is effectively a boolean), and we can be conservative about it. And even then, what tells you whether something changed or not is the assign you do in your handle_* callbacks, not the view. What you want is to compute a previous value (that needs to be reliable!) and detect what changed in the view, and I am really afraid that’s not possible.

Given everything we said, here is what I propose:

  1. Implement a function called stream_diff that compares two assigns and generates the relevant stream items, as I mentioned in my previous comment. This can be implemented today using the existing APIs, and it would be called in your handle_* callbacks.

  2. Introduce a mechanism in streams that allows you to say “I have the previous element too, so generate the proper change tracking code”; this will require changes to Elixir and the JavaScript client. We could add this to the existing API, such as stream_insert(socket, :foos, new_element, previous: ...). Then we can change stream_diff to use it.

So ultimately we are still using streams but we make “diffing lists” much simpler.
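Purely as an illustration (stream_diff does not exist yet, and the name and argument order here are made up), the earlier refresh example could then collapse to something like:

def handle_info(:refresh, socket) do
  Process.send_after(self(), :refresh, 10_000)
  new_users = Accounts.list_users()

  # compares the old and new collections and emits the stream inserts/deletes
  socket = stream_diff(socket, :users, new_users, socket.assigns.users)

  {:noreply, assign(socket, :users, new_users)}
end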

3 Likes

Nice!

I always felt the stream API was missing this, and I had to use JS manually to deal with it to improve the performance of a big, complex realtime database-management app.

So an improvement here is very welcome!

1 Like

I like your idea :slight_smile: This solves everything except the need to use streams as top-level assigns and keep them synchronized with e.g. preloads.

stream_diff seems almost already implemented in Ecto’s put_assoc (I assume it figures out what to update / delete / insert)?

Maybe after this is done we could think about further improvements :wink:

I’m thinking about the implementation and there are some non-obvious challenges.


  1. Let’s start with the parameters of stream_diff.
def stream_diff(socket, stream_name, current_items, previous_items, opts \\ [])

Would that be enough? (opts is for options that might be needed in the future; if we don’t find anything relevant we can remove it.)


  2. The bigger issue is with reordering. Let’s imagine we have a stream with two items, we swap their order, and call stream_diff. Currently that would need to be solved by removing one item and re-inserting it at the new position, correct? Reversing a big list would be even worse.

Should we consider such a case and try to optimise it as well? We could introduce something like

def stream_reorder(socket, stream_name, changes)
# ...

stream_reorder(socket, :test, %{1 => 3, 2 => 5})

Without this, it’s impossible to make reordering really optimised. The reordering case seems important, e.g. for sortable tables.


Thank you in advance! Btw, what do you think would be a better place to have this discussion about implementation details? Here as well, or maybe an empty PR opened just for the discussion?