How to avoid multiple database queries when using LiveView streams

teamon · June 12, 2024, 8:37pm

I’m reading through the example todo_trek app to try to understand how it all works.
One thing I’ve noticed is that in the validate event handler the edited todo is built from incoming params:

github.com/chrismccord/todo_trek/blob/main/lib/todo_trek_web/live/todo_list_component.ex#L123

def update(%{list: list} = assigns, socket) do
  todo_forms = Enum.map(list.todos, &to_change_form(&1, %{}))

  {:ok,
   socket
   |> assign(list_id: list.id, scope: assigns.scope)
   |> stream(:todos, todo_forms)}
end

def handle_event("validate", %{"todo" => todo_params} = params, socket) do
  todo = %Todo{id: params["id"], list_id: socket.assigns.list_id}

  {:noreply, stream_insert(socket, :todos, to_change_form(todo, todo_params, :validate))}
end

def handle_event("save", %{"id" => id, "todo" => params}, socket) do
  todo = Todos.get_todo!(socket.assigns.scope, id)

  case Todos.update_todo(socket.assigns.scope, todo, params) do
    {:ok, updated_todo} ->
      {:noreply, stream_insert(socket, :todos, to_change_form(updated_todo, %{}))}

While this works for the todo_trek app, it will fail when some other non-editable field is necessary to properly render the stream entry - let’s say for the sake of example that I’d like to render todo.inserted_at for every todo.

Since streams do not keep data in memory, in the validate handler I need to fetch the todo from the database to keep the inserted_at field rendered. This is not ideal, as the validate handler will be called multiple times, even when using phx-debounce.

One solution I can think of is to provide a short-lived cache using the process dictionary with a timer-based TTL.

What would be the preferred solution here?
Is there a way to avoid validate events at all?

cmo · June 12, 2024, 9:20pm

I don’t see why you would need to query the database on every change to get information that the client already has. You ensure the client sends it back by making it part of the form, no?

teamon · June 12, 2024, 9:47pm

The inserted_at field is not part of the form, it’s just rendered as form.data.inserted_at.

On the first render it’s all good because form.data is a full Todo from the database, but after a validate event it has only the id and list_id fields, and rendering inserted_at breaks.
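To make the failure mode concrete, here is a minimal sketch in plain Elixir. `Demo.Todo` and `Demo` are hypothetical stand-ins (a bare struct instead of the real Ecto schema, and a hard-coded list_id), so this only illustrates why the rebuilt struct has no inserted_at.

```elixir
defmodule Demo.Todo do
  # Hypothetical stand-in for the Todo schema; the real one is an Ecto schema.
  defstruct [:id, :list_id, :title, :inserted_at]
end

defmodule Demo do
  # What the first render sees: a fully loaded record.
  def loaded do
    %Demo.Todo{id: 1, list_id: 7, title: "Buy milk", inserted_at: ~U[2024-06-12 20:37:00Z]}
  end

  # What the validate handler builds: only the fields it can get from params
  # and assigns. Every other field falls back to its default, which is nil.
  def rebuilt(params, list_id) do
    %Demo.Todo{id: params["id"], list_id: list_id}
  end
end
```

Calling `Demo.rebuilt(%{"id" => "1"}, 7).inserted_at` gives nil, while `Demo.loaded().inserted_at` is a DateTime, so any template that formats inserted_at works on the first render and breaks after the first validate event.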

cmo · June 13, 2024, 1:05am

What is stopping you from making it part of the form? If you want that data, you can get it by making it part of the form, you can keep it in memory, or you can query for it.

With regard to your process dictionary cache idea, I’d probably use a proper cache so you’re not hand-building a cache in every LiveView process. That kind of defeats the point of using streams in the first place, doesn’t it?

KP123 · June 13, 2024, 1:10am

Save them in your markup with phx-value-* attributes. As outlined in the docs, you could add phx-value-inserted-at to each todo so that the event sends these values back.

Using an in-memory cache makes no sense here, as you could just save the struct directly in your LiveView state without having to manage a new dependency.

teamon · June 13, 2024, 6:23am

Sending a single value might be OK, but imagine you also need to render the Todo author’s avatar: you end up sending multiple values just to rebuild the Todo, User and Avatar schemas by hand. This quickly becomes unmanageable and extremely error-prone. Not to mention that even a single inserted_at field can only be sent as a binary, so you need to convert it to a proper DateTime, which is just another layer of complexity. And if you render another field but forget to send it back, you will only notice when you try to edit the Todo, as the first render will work just fine.

I don’t agree that the cache doesn’t make sense: with long lists you can still benefit from not having to keep everything in memory, only the recently used items.
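The conversion step mentioned above might look roughly like the sketch below. `TimestampParam` is a hypothetical helper name, and the sketch assumes the client sends the timestamp back as an ISO 8601 string with an offset; it uses only `DateTime.from_iso8601/1` from the standard library.

```elixir
defmodule TimestampParam do
  # Hypothetical helper: turns the string sent back by the client into a
  # DateTime. The value comes from the browser, so it has to be treated as
  # untrusted input and may be missing or malformed.
  def parse(value) when is_binary(value) do
    case DateTime.from_iso8601(value) do
      {:ok, datetime, _utc_offset} -> {:ok, datetime}
      {:error, reason} -> {:error, reason}
    end
  end

  # Anything that is not a string (for example a missing param) is rejected.
  def parse(_other), do: {:error, :invalid_format}
end
```

Every extra field that is round-tripped through the client would need a similar parse-and-validate step, which is the bookkeeping cost described in this post.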

A proof of concept of a process-local cache could be something like this:

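defmodule LiveCache do
  # Process-local cache: entries live in the calling process's dictionary
  # and are dropped by a timer message after `ttl` milliseconds.
  @ttl :timer.seconds(60)

  # Returns the cached value for `key`, or calls `fun` and caches its result.
  # Every hit restarts the TTL timer. Note that a nil result is never cached.
  def fetch(key, ttl \\ @ttl, fun) do
    case Process.get(key) do
      nil ->
        value = fun.()
        put(key, value, ttl)
        value

      {value, timer_ref} ->
        Process.cancel_timer(timer_ref)
        put(key, value, ttl)
        value
    end
  end

  # Stores the value together with the timer that will expire it.
  # The owning process must handle the {LiveCache, :clear, key} message.
  def put(key, value, ttl) do
    ref = Process.send_after(self(), {__MODULE__, :clear, key}, ttl)
    Process.put(key, {value, ref})
  end

  # Removes the entry; meant to be called from handle_info/2.
  def clear(key), do: Process.delete(key)
end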

## in LiveView process
def handle_info({LiveCache, :clear, key}, socket) do
  LiveCache.clear(key)
  {:noreply, socket}
end

## in (nested) LiveComponent
def handle_event("validate", %{"id" => id, "todo" => data}, socket) do
  todo = LiveCache.fetch({Todo, id}, fn -> Todos.get_todo!(id) end)
  {:noreply, stream_insert(socket, :todos, to_change_form(todo, data, :validate))}
end
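One subtlety in the proof of concept: `Process.cancel_timer/1` cannot take back a clear message that has already landed in the mailbox, so a stale message could delete an entry that was just refreshed. The following is a hedged variation, not part of the original proposal. `TokenCache` is a hypothetical name, and the extra token in the message is an assumption of this sketch.

```elixir
defmodule TokenCache do
  @ttl :timer.seconds(60)

  # Same contract as the proof of concept: return the cached value or
  # compute it with `fun`, restarting the TTL on every hit.
  def fetch(key, ttl \\ @ttl, fun) do
    case Process.get({__MODULE__, key}) do
      nil ->
        value = fun.()
        put(key, value, ttl)
        value

      {value, timer_ref, _token} ->
        Process.cancel_timer(timer_ref)
        put(key, value, ttl)
        value
    end
  end

  # Each write gets a fresh token that is also carried by the clear message.
  def put(key, value, ttl) do
    token = make_ref()
    timer_ref = Process.send_after(self(), {__MODULE__, :clear, key, token}, ttl)
    Process.put({__MODULE__, key}, {value, timer_ref, token})
    :ok
  end

  # Deletes the entry only if the message belongs to the current write;
  # a message left over from an earlier write is ignored.
  def clear(key, token) do
    case Process.get({__MODULE__, key}) do
      {_value, _timer_ref, ^token} ->
        Process.delete({__MODULE__, key})
        :ok

      _other ->
        :stale
    end
  end
end
```

The owning LiveView would then match on `{TokenCache, :clear, key, token}` in handle_info/2 and call `TokenCache.clear(key, token)`. Namespacing the dictionary key with the module name is a further precaution against clashing with other process dictionary entries.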

While it’s possible to use socket.assigns as the cache storage it seems more complex to work with when using nested live components that have their own state. If the cache is to be shared between components it must be passed as attributes, potentially causing unnecessary rerendering.

A different way of solving this issue would be to have two zipped streams like this:

todos = Todos.list_todos()
forms = Enum.map(todos, &to_change_form(&1, %{}))

# ...

<%= for {todo, form} <- zip(@streams.todos, @streams.forms) do %>
  <.avatar user={todo.user}/>
  <.form for={form}> ... </.form>
<% end %>

and then allow updates to one stream only while keeping the other intact.

cmo · June 13, 2024, 7:53am

You have the same problem with forgetting to cache a value, no?

Is a cache per liveview better than a cache between all the liveviews and the database? You don’t think you’ll be caching the same thing in each liveview for each person on the page? And if you’re caching most of the data structure why bother with streams at all?

Are you sure that having a cache in the process dictionary that is designed to be accessed/edited in multiple places is easier to manage and reason about than passing it down explicitly?

KP123 · June 13, 2024, 12:30pm

If you need the data in memory on the server, don’t use streams. Then you only need to send the ID of the entity you want to work on back to the LiveView.
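As a rough illustration of that approach, the lookup side can be sketched in plain Elixir. `InMemoryTodos` is a hypothetical module, the todos are assumed to have integer ids, and in a real LiveView the indexed map would be kept in socket.assigns rather than passed around by hand.

```elixir
defmodule InMemoryTodos do
  # Index the loaded todos by id once, e.g. when the list is first assigned.
  def index(todos), do: Map.new(todos, fn todo -> {todo.id, todo} end)

  # Event params arrive as strings, so the id is parsed before the lookup.
  # Returns {:ok, todo} or :error, mirroring Map.fetch/2.
  def fetch(todos_by_id, id) when is_binary(id) do
    case Integer.parse(id) do
      {int_id, ""} -> Map.fetch(todos_by_id, int_id)
      _other -> :error
    end
  end
end
```

A validate handler could then call `InMemoryTodos.fetch(todos_by_id, params["id"])` and get the full record, including fields such as inserted_at, without a database query. The trade-off is the one discussed throughout the thread: the whole list stays in process memory.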