Using streams with recursive and/or deeply nested schemas

MarthinL · January 30, 2025, 11:49am

Update 1:

@steffend posted a complete example of how one could achieve the effect I described leveraging LiveComponent and Streams to keep each node’s data in server memory. Thank you kindly.

Update 2:
I had a few key insights since starting this discussion (and duly apologise for not thinking of it earlier).

1. Lists are not arrays, they merely exhibit some array-like semantics.
Being aware of ways to keep tree data in arrays, I’ve been trying to come up with a dual-natured version of recursive data that is both a (streamable) list and a (recursively loaded) struct at the same time, i.e. without having to maintain two copies of it. Then it dawned on me that lists are not arrays but exactly the recursive structure I’m looking for, because the way they are implemented in Elixir to allow for nested lists they already maintain a reference to a parent which is nil for non-nested list heads.
2. A key part of the problem is the gigantic and indeterminate size of the recursive underlying data, resulting in bloated memory and/or DOM deltas. I know I mentioned it, to myself and in this thread, but didn’t take note of one implied observation until now. My (and likely most) recursive data has two parts: the recursive structure and the data it structures. Compared to the combined node contents, the tree/graph structure is quite small. Seeing that I already employ techniques to limit how much of the large tree is loaded and presented to a user at a time, the structural part of that is almost certainly small enough to keep in memory as a whole to serve the purpose of what Steffen calls “surgical updates”.
3. After combining insights 1) and 2) above, it dawned upon me that if I extract only the (integer, in my case) ids of the recursive element from the database or loaded structure, they can accurately and succinctly represent the tree structure in a list, so that

[1,2,[3,4,[5,6,7],8],9]

represents the tree

-1
-2
+3
| -4
| +5
| | -6
| | -7
| -8
-9

[Edit] The non-obvious mapping in the above is that a nested list maintains a “parent” element reference. Semantically a list is [head|tail] where the tail is also a list, so to store a tree in there simply means adding the convention that
By how Elixir’s (linked) lists map onto Erlang’s [head|tail], the internal structure in both the above cases is the same. It’s actually the list operations themselves and the Enum behaviour that create the illusion that the recursive structure is linear.

I guess it’s in honour of simplicity that the go-to approach for traversing a nested list is to flatten it first, but it’s trivial to write a function which mimics each, map, reduce or fold for raw nested lists. Not that it’s important, just an observation.
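A minimal sketch of such a function, assuming the convention from the example above (an integer is a leaf, a nested list is [parent | children]). `NestedIds.reduce/4` is a hypothetical helper, not part of Elixir’s standard library:

```elixir
defmodule NestedIds do
  # Hypothetical helper. Assumed convention: an integer is a leaf and a
  # nested list is [parent_id | children].

  # Depth-first reduce over a list of sibling nodes, without flattening.
  # The callback receives the id, its depth and the accumulator.
  def reduce(nodes, acc, fun, depth \\ 0) when is_list(nodes) do
    Enum.reduce(nodes, acc, fn
      [parent | children], acc ->
        acc = fun.(parent, depth, acc)
        reduce(children, acc, fun, depth + 1)

      id, acc ->
        fun.(id, depth, acc)
    end)
  end
end

tree = [1, 2, [3, 4, [5, 6, 7], 8], 9]

# Collect {id, depth} pairs in display order.
pairs =
  tree
  |> NestedIds.reduce([], fn id, depth, acc -> [{id, depth} | acc] end)
  |> Enum.reverse()
```

Here `pairs` carries each id together with its nesting depth, which is the information a flatten would throw away.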

The implication of this insight is that I do not need to retain the whole cluster of Ecto structures to remember where everything fits into the tree. A nested list divorced from any Ecto struct metadata can store the recursive structure without any loss of information. It’s trivial to use the identifiers from the list representation of the tree to reapply metadata, and to reload or reconstruct nodes into recursive Ecto structs when that’s needed.

I probably would, to be explicit and to prevent the structural information being discarded by an innocent flatten operation along the line, extract the structure into a list where single integers represent leaf nodes and nodes with children are represented by a tuple containing the id of the parent element and a list of its children, i.e. the same tree as above would be:
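A minimal sketch of that conversion, assuming the [parent | children] convention of the earlier nested list. `TreeShape.to_tuples/1` is a hypothetical name, not an existing function:

```elixir
defmodule TreeShape do
  # Hypothetical helper: turns [parent | children] nesting into explicit
  # {parent_id, children} tuples; bare integers stay leaves.
  def to_tuples(nodes) when is_list(nodes) do
    Enum.map(nodes, fn
      [parent | children] -> {parent, to_tuples(children)}
      id when is_integer(id) -> id
    end)
  end
end

shape = TreeShape.to_tuples([1, 2, [3, 4, [5, 6, 7], 8], 9])
# shape == [1, 2, {3, [4, {5, [6, 7]}, 8]}, 9]

# List.flatten/1 only flattens lists, so the tuples (and with them the
# structure) survive an accidental flatten of the top level.
flattened = List.flatten(shape)
```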

That would leave me with a structure small enough for a TreeView LiveComponent to retain as state for the whole (currently visible) part of the impossibly large tree, to manipulate directly and cheaply, and to consult to determine what changes to propagate and what to do with changes propagated from elsewhere. The meat of the node, i.e. the cluster of associated schemas, can then be sourced and supplied at will as a traditional flat List of such structs, a Map for quick access, or a MapSet for quick access and automatic de-duplication. It can be sourced from the database at the most opportune time: either ahead of time with a background recursive preload which is then flattened for the initial mount, as individual fetches by id if that’s required, or by reloading and further preloading a node whose children were not loaded because they were too deeply nested for the window at that time.

In summary, I’m currently of the opinion that I (and possibly others with the same use case) can prevent the recursive nature of the data from impeding the regular and sober use of LiveView (with or without Streams) by isolating and removing the structural aspect of the data and explicitly dealing with that data as a separate concern. I’d still need to give both concerns due attention to avoid unwanted behaviour, but they would no longer cripple each other.

If it turns out (I’ve started asking what I hope are the right questions) that the only or most effective way to get LiveView to not replace the existing list of children rendered into a parent when the parent is patched is to make the children LiveComponents themselves, then that’s easy and cheap to arrange because the LiveComponent would have minimal state. It would be ideal, though, if the nodes could be rendered as function components (i.e. no state) and still allow control over whether the children should be retained or dumped when the parent node is patched. I’ll continue to investigate whether that is feasible.

The same approach would also work for other key types including composite keys and tenanted or partitioned data; the structure would just grow bigger with the additional data needed per node to uniquely identify it.

garrison · January 30, 2025, 7:08pm

The update/2 clause that comment is placed before pattern matches on the current socket to determine whether the entry is already set, and if so it avoids streaming the children. In effect, it ensures that the children of a parent are only ever rendered the first time the tree is rendered (via the second update/2 clause, which is only reached when entry is not yet set), and after that they have to be updated individually by a send_update.

In fact, as I read more closely, I don’t believe it’s possible to update the structure of the tree at all in this example - but of course it is only an example.

MarthinL · January 30, 2025, 7:27pm

Ah, now I notice it. Cool, thanks.

In my rant about handling structure separately I conclude that not having access to the structure via the LiveComponent for each node is probably a good thing. TL;DR: I’m given to the idea of splitting the problem into separate concerns, one dealing exclusively with structure and the other with the contents of each node.

PS. I did better catching the gist of the gist once I realised I should regard TreeComponent as the NodeComponent because it doesn’t operate on the tree, only on a Node.

garrison · January 30, 2025, 7:48pm

Note that this is what I described in one of my more recent replies (about the maps), and I found that approach to be helpful when dealing with updates that come in (via PubSub in my case) which touch a given record. The reason I took that path is that, otherwise, I would have to write recursive functions to “patch” the tree and I prefer to just use indirection (especially since I have very complex functionality which lends itself to recomputing the final tree as I explained).

However, what I meant with regards to the example was simply that, because the children list is never re-rendered, the structure of the tree can never change: i.e. you could never re-order a set of nodes, or move a node elsewhere. But it is just an example.

Ah, but if you were a philosopher you would recognize that a Node is nothing more than a smaller Tree

MarthinL · January 31, 2025, 10:02am

I didn’t notice it to be the same, and still don’t to be honest. It’s similar, related or even converging, but you described (the map) as a way to keep the nodes in a flat structure, and if I understood correctly each node would still contain its own child list. Whether that’s in effect a copy or a list of references by map key, the structure is still within the data. I’m proposing taking that just one step further and moving the knowledge about each node’s children out of the flattened data. Practically this means the association representing the list of children of a node is reverted to :not_loaded.

The structure lives exclusively in a different calculated value and is passed to the appropriate LiveView component as an assign, and a very lightweight one at that, since the structure data fits snugly into a list of either integers or 2-tuples with an integer and a recursive list. Once the structure is in that form and completely independent from the data/HTML associated with each id in the tree, any updates to it can be succinctly reduced to a series of primitives which become the basis for updating the rendered content correctly in any situation: from loading a different root which replaces all the content, to adding or moving child nodes, to updating a parent without touching any children, to moving any portion of the tree from one parent to another. Those are all things we know how to do with trees and therefore with simple nested lists, with a small enough memory footprint that we can make and compare several interim copies at very low cost if we need to. Plus you only need to do it once, ever. None of what you implement in structure manipulation gets affected in any way when there are changes to the content schemas.
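To make the idea of primitives concrete, here is a rough sketch over the integer / {id, children} form. The module and function names (`Structure.add_child/3`, `take/2`, `move/3`) are hypothetical, and nothing here is LiveView API:

```elixir
defmodule Structure do
  # Hypothetical primitives. A node is either a leaf id or {id, children}.

  # Append child as the last child of parent_id, promoting a leaf if needed.
  def add_child(nodes, parent_id, child) do
    Enum.map(nodes, fn
      ^parent_id -> {parent_id, [child]}
      {^parent_id, kids} -> {parent_id, kids ++ [child]}
      {id, kids} -> {id, add_child(kids, parent_id, child)}
      leaf -> leaf
    end)
  end

  # Remove the node with the given id (and its subtree).
  # Returns {removed_node_or_nil, remaining_nodes}.
  def take(nodes, id) do
    {found, acc} =
      Enum.reduce(nodes, {nil, []}, fn node, {found, acc} ->
        case node do
          ^id ->
            {id, acc}

          {^id, _kids} ->
            {node, acc}

          {pid, kids} ->
            {sub, rest} = take(kids, id)
            # A parent that lost its last child becomes a leaf again.
            kept = if rest == [], do: pid, else: {pid, rest}
            {found || sub, [kept | acc]}

          leaf ->
            {found, [leaf | acc]}
        end
      end)

    {found, Enum.reverse(acc)}
  end

  # Move a subtree under a new parent by composing the two primitives.
  def move(nodes, id, new_parent_id) do
    case take(nodes, id) do
      {nil, _rest} -> nodes
      {node, rest} -> add_child(rest, new_parent_id, node)
    end
  end
end

shape = [1, 2, {3, [4, {5, [6, 7]}, 8]}, 9]
moved = Structure.move(shape, 5, 9)
# moved == [1, 2, {3, [4, 8]}, {9, [{5, [6, 7]}]}]
```

The sketch does no validation: moving a node under one of its own descendants, or under an id that is not present, would silently drop the subtree, so a real version would need to guard against that.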

Really, O(1)? I’d buy O(log n), but for the map code to find the key it has to, one way or another, search through the keys. My understanding is that the keys are kept in a balanced tree of sorts to minimise the number of comparisons it needs to make on average to find a key, but unavoidably, as the number of keys grows, access time must eventually get slower. I was also led to believe that maps, sets and mapsets all use the exact same mechanism anyway, so I think your assumption that, being sets, mapsets would unavoidably be slower might be a little off or even false.

The way lists live in memory means all access has to be from the head, walking recursively into each tail. That fundamental recursion is reduced to mere iteration through the magic of (originally Erlang’s) tail recursion detection and optimisation. The array-like semantics of lists are purely an illusion. That does come with substantial benefits, not the least of which is that a list may contain not only variable size elements but also elements of different types. Most of the list versions of the Enum functions have been operating at maximum optimisation levels for a long time, resulting in really good performance, but it will never actually be possible to access a list like an array by calculating an element offset from an index.
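A small illustration of that head/tail nature. `Walk.at/2` is a hypothetical stand-in written only to show the walk; in practice one would call Enum.at/2:

```elixir
# A list literal is shorthand for nested [head | tail] cells ending in [].
same = [1, 2, 3] == [1 | [2 | [3 | []]]]

defmodule Walk do
  # Index lookup by walking tails: the cost grows with the index.
  # Expects a non-negative index.
  def at([head | _tail], 0), do: head
  def at([_head | tail], n) when n > 0, do: at(tail, n - 1)
  def at([], _n), do: nil
end

third = Walk.at([:a, :b, :c], 2)
```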

Several of the underlying Erlang data structures, including sets and gb_trees if I remember correctly, use internally defined and managed memory constructs which are opaque to the user. That allows them to forfeit variable length elements and store data in fixed element size arrays with actual array access performance, i.e. calculating the address of an element as a starting point plus an index times the element size. This resulted in some highly efficient techniques from the world of advanced data structures and algorithms becoming available in Erlang and through that also to Elixir, where they ended up being put to use to implement maps.

Erlang didn’t have maps (it had records, with metadata only at compile-time, not runtime, which wasn’t great) until Elixir formulated and implemented the concept which ultimately made its way back to Erlang. The core of Elixir is still written in Erlang and I believe the data structures used to create Map weren’t custom written exclusively for Elixir’s Map type. Map either used pre-existing Erlang library functions or what was done to make Map run faster was not exclusively for Map but helped improve efficiency for many other structures as well.

Disclaimer: It’s all open source but I’ve never been directly involved in creating or maintaining any of the Erlang, OTP, Elixir, Phoenix, Ecto or LiveView code, nor have I made any sort of habit of trying to understand the underlying code. I’ve merely been professionally aware of Erlang from before it was open-sourced, evangelised many of my peers into taking a look at it who built their entire careers and businesses around it since, written a few small systems in Erlang and planned to write my life’s work’s proper server in Erlang until I discovered how far I can get how fast using Elixir and Phoenix instead. Which is a long way of saying I don’t know these things for fact because I lived it, but through keen observation over a long time, albeit usually at a fairly abstract level only. I did recently go read a bit of the Elixir code for clues as to how I might implement this unicorn dual-natured structure I thought I needed. Only after that (and the confirmation that Elixir lists are directly based on Erlang’s lists) did I truly realise that what is called a linked list is really a completely unbalanced tree turned on its side. That’s when all the pieces of the picture fell into place for me and I was able to see that all the “pointers” I thought I’d have to implement and manage in keeping with both Elixir and Erlang’s immutable data principles are already present in the list construct. It is absolutely perfect for the job and I couldn’t have hoped for a more optimal set of tools to manipulate them with, already tested to the fullest possible extent.

It is clear as day to me that nested lists of primary key values represent the structural part of a tree so accurately and efficiently that there is not a shred of doubt in my mind that it’s the ideal choice of how to keep the structure of any portion of the data in memory.

Yeah, or because I am one I differentiate between a Node and a Tree.

garrison · January 31, 2025, 6:51pm

Well no, not exactly. Associative arrays in most programming languages are built on hash tables, which are O(1) if there are no hash collisions. Which is a big if, but people usually call them O(1) even though I think it’s technically Omega(1) for that reason. Note that the trade-off versus a tree here is that hash tables are not order-preserving.

Erlang maps are actually technically hash array-mapped tries, which are a more exotic hybrid data structure. I’m not sure of the exact performance characteristics, but you can investigate further if you are interested.

But sets have no method to retrieve an object by key at all - you would have to brute-force iterate through the whole thing.

The use-case I was describing was to store the tree in one data structure (like you said) and then join it with the associated records by looking them up in a map. You would not use a set for this.

The fact that sets internally use maps is an implementation detail.
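A short sketch of the difference, using plain maps as stand-ins for the records (the field names are only illustrative):

```elixir
records = [%{id: 1, name: "elixir.jpg"}, %{id: 2, name: "jose.png"}]

# Map keyed by id: direct lookup by key.
by_id = Map.new(records, fn r -> {r.id, r} end)
found = Map.fetch!(by_id, 2)

# MapSet of the same records: membership needs the whole element...
set = MapSet.new(records)
is_member = MapSet.member?(set, %{id: 2, name: "jose.png"})

# ...and finding "the record with id 2" means scanning the elements.
scanned = Enum.find(set, fn r -> r.id == 2 end)
```

Both `found` and `scanned` end up as the same record, but only the map gets there by key.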

garrison · January 31, 2025, 7:15pm

Not quite. Let me try to demonstrate what I did but in the context of the (simpler) files/folders example.

There are three database tables with corresponding Ecto schemas: nodes, files, and folders. The nodes store the structure of the tree and contain foreign keys pointing to files and folders. The files and folders tables then store metadata, like names and so on.

%Node{id: 1, folder_id: 1, file_id: nil, parent_id: nil} # Root folder node
%Node{id: 2, folder_id: nil, file_id: 1, parent_id: 1} # File node
%Folder{id: 1, name: "elixir stuff"} # Folder metadata
%File{id: 1, name: "elixir.jpg"} # File metadata

This tree only has two nodes, but you could imagine an arbitrary filesystem structure. Now, you could write a recursive query to load the tree from the database. You then might join every Node with its associated Folder or File to get the whole tree.

The annoying bit, though, is that now if the user updates a folder (a rename, say) then the new version will come in over PubSub. And you have to recursively walk the tree, find that Folder node, and update it. This is valid, but there is another approach: use indirection.

Instead, you could avoid joining the files and folders to the nodes in the query, and instead load them separately. Then you can store them in maps (files = %{file.id => file} and so on) and join them at runtime by looking up the File for a file_id on a Node.

Now you can simply update the folders map when a new Folder comes in over PubSub, and then it will appear next render. I think this is in the same vein as what you are talking about, but correct me if I’m wrong.

There is one problem with this approach: LiveView is not smart enough to diff the arbitrary folders lookups, so any change to folders would re-render the entire tree and send the whole thing down the wire. I avoid that problem by materializing the tree myself into a simplified representation which I then pass to the LiveComponents. So the “simplified tree” comes out looking like:

%{
  id: 1, type: :folder, name: "elixir stuff", children: [
    %{id: 2, type: :file, name: "elixir.jpg"},
  ],
}

This might sound like a lot of extra complexity, but my use-case is very complicated because the UI is highly interactive and has a lot of moving parts. I iterated several times before arriving at this design, which I found to be radically simpler and more performant. Your situation may vary of course.

MarthinL · January 31, 2025, 7:31pm

I think we’ve misunderstood each other on this. Map for the flattened content is spot on and I was discussing how one can store the structure as nested lists. I believe you made reference to using maps in the context of the streams discussion before I decided to physically split the structure and content. But it’s no issue, we’re in agreement that the bulky node data can live in maps if you’re going to keep them in memory, or be loaded from the database if you need to save on memory, and if you need it they can go into streams as well. As long as you can “surgically” remote control the DOM, i.e. change whether or not the children are shown in a container depending on how the structure changes, it should work well.

I haven’t yet figured out whether it really will require a LiveComponent per node as per the example, or what the overheads for that are like, but I’ll get there.

garrison · January 31, 2025, 7:50pm

If you want to minimize diffs over the wire you will probably want one LiveComponent per node as discussed previously. The overhead for LiveComponents is not very large, they are really just a vehicle for maintaining the diffs and that overhead is obviously unavoidable. Importantly LiveComponents live in the same process as the parent LiveView so when you pass things to them they will share that memory (no copying).

MarthinL · January 31, 2025, 7:52pm

garrison:

There is one problem with this approach: LiveView is not smart enough to diff the arbitrary folders lookups, so any change to folders would re-render the entire tree and send the whole thing down the wire. I avoid that problem by materializing the tree myself into a simplified representation which I then pass to the LiveComponents. So the “simplified tree” comes out looking like:
%{
  id: 1, type: :folder, name: "elixir stuff", children: [
    %{id: 2, type: :file, name: "elixir.jpg"},
  ],
}
This might sound like a lot of extra complexity, but my use-case is very complicated because the UI is highly interactive and has a lot of moving parts. I iterated several times before arriving at this design, which I found to be radically simpler and more performant. Your situation may vary of course.

Yes, that is a LiveView challenge that to date we’ve only seen one viable solution for. I’m hoping more will come to light.

Our approaches are converging. I’ve taken the simplification of the tree view a whole lot further until it genuinely only contains the IDs by which to find the actual content from wherever it lives, but it’s the same principle. You’re still storing the simplified tree in nested maps; I’m using essentially nested lists, or lists of either integers (meaning it’s a leaf node) or tuples {id, child_id_list} (meaning it’s a branch, i.e. a node with children).

For completeness, I’m actually storing the structure in two parts. Both are list based, but the semantics are slightly different. The first is a straight list of ids representing the path to the node that forms the “root” of the tree data being displayed, like breadcrumbs. All the nodes in that list (must) have children (in order to be part of the path), but I’m not storing any detail about how many other siblings each node might have or suchlike, as that’s not relevant data. The second part is the recursive structure of the part of the data for which HTML has been generated and sent to the client, extracted from the nested associations before they’re flattened (with the recursive association reset to not_loaded).
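As an aside, the same kind of id path can also be derived for any node inside the second part. A rough sketch, assuming the integer / {id, children} form; `Breadcrumbs.path_to/2` is a hypothetical helper:

```elixir
defmodule Breadcrumbs do
  # Hypothetical helper: returns the ids from a top-level node down to
  # target, or nil when target is not in the structure.
  def path_to(nodes, target) do
    Enum.find_value(nodes, fn
      ^target ->
        [target]

      {^target, _kids} ->
        [target]

      {id, kids} ->
        case path_to(kids, target) do
          nil -> nil
          path -> [id | path]
        end

      _leaf ->
        nil
    end)
  end
end

shape = [1, 2, {3, [4, {5, [6, 7]}, 8]}, 9]
path = Breadcrumbs.path_to(shape, 7)
# path == [3, 5, 7]
```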

MarthinL · January 31, 2025, 8:01pm

The jury is still out (in my case) about that (both the diffs part and the unavoidability of the overhead). I don’t have sufficient command over all the tools at my disposal just yet. It is obviously in my interest to keep the load per user session to a minimum, but there are still too many variables and at some point it will have a run-in with the law of diminishing returns.

MarthinL · January 31, 2025, 8:05pm

For reference’s sake, the structure part for that (in the form I currently use, which might change at will) looks like this:

[{1,[2]}]

But that only works if it really is a recursive structure, i.e. if every id refers to a record in the exact same schema. For my data that holds true, but in your example you have separate schemas for files and folders. I guess, though, that it’s merely a consequence of the oversimplification of the example you’ve constructed. I too have very many schemas involved in the node, and I need and load them all for the components to turn into HTML. I count all of that as one node with one id of the recursive schema, which eventually gets to a list of records (with ids) of the same schema again. Those would be the next (child) ids that go into the tree structure. Nothing else.

garrison · January 31, 2025, 8:22pm

My comment had grown long so I was not as clear about this as I should be. I am essentially storing two copies of the tree. The first is a tree of Nodes which comes from the database (loaded via recursive query from Postgres), and then the maps of folders and (in our example) files which go with it. (If you will recall my real app is for RSS feeds, not files, but this is immaterial.)

Then there is a second form, which is stored alongside the first. More specifically, it is recomputed from the first each time either the tree structure or any of the maps (files, folders) changes in response to a PubSub broadcast.

This second form, which again lives alongside the first form, is the “simplified tree” I described (when I came up with this design I called it a “rendered tree”, but I am trying to avoid overloading that term in this conversation).

The purpose of the “simplified tree” is twofold: first, it materializes the changes which come in from the PubSub, which in my case is very important because a change in a single record can actually affect many places in the tree. This may not apply to you, but again my UI is highly interactive.

The second purpose is that the LiveComponents actually need this materialized structure to properly diff their own attributes. If you imagine the alternative - passing the files and folders maps down to each LiveComponent along with the tree - you will find that updating the maps can have performance problems because LiveView is not (or at least was not when I tried) smart enough to diff these arbitrary map accesses at runtime.

So, by passing a “simplified tree” to the LiveComponents instead, they are able to easily see exactly which attributes have changed since the last version, and as a result the diffs are very minimal.
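To illustrate why per-node data makes changes easy to pin down, here is a rough sketch that compares two versions of a materialized tree and reports the ids whose own data differs. It is an illustration only, not how LiveView tracks changes internally, and `TreeDiff` is a hypothetical module:

```elixir
defmodule TreeDiff do
  # Illustration only. Index every node of a materialized tree by id, with
  # children reduced to a list of child ids so each node compares on its
  # own data rather than on its whole subtree.
  def index(node, acc \\ %{}) do
    kids = Map.get(node, :children, [])
    own = Map.put(node, :children, Enum.map(kids, & &1.id))
    Enum.reduce(kids, Map.put(acc, node.id, own), &index/2)
  end

  # Ids of nodes that are new or whose own data differs.
  def changed_ids(old_tree, new_tree) do
    old = index(old_tree)

    for {id, node} <- index(new_tree), Map.get(old, id) != node, do: id
  end
end

old = %{
  id: 1, type: :folder, name: "Elixir stuff",
  children: [
    %{id: 2, type: :file, name: "elixir.jpg"},
    %{id: 3, type: :file, name: "jose.png"}
  ]
}

# Rename the second file (a made-up change for the example).
new = put_in(old, [:children, Access.at(1), :name], "valim.png")
changed = TreeDiff.changed_ids(old, new)
# changed == [3]
```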

Honestly, thinking about it now, I think the best name would be “materialized tree”. I’m afraid that may be the third or fourth name I’ve given it over the course of this discussion…

garrison · January 31, 2025, 8:43pm

This is a lot to comprehend so here’s a purely “visual” example:

# Tree structure
root_node = %Node{
  id: 1, parent_id: nil, folder_id: 1, file_id: nil,
  children: [
    %Node{id: 2, parent_id: 1, folder_id: nil, file_id: 1},
    %Node{id: 3, parent_id: 1, folder_id: nil, file_id: 2},
  ],
}
# Folders
folders = %{
  1 => %Folder{id: 1, name: "Elixir stuff"},
}
# Files
files = %{
  1 => %File{id: 1, name: "elixir.jpg"},
  2 => %File{id: 2, name: "jose.png"},
}

Then we have a recursive function, materialize/3, to “materialize” the tree.

def materialize(%Node{file_id: nil} = node, folders, files) do
  %{
    id: node.id, type: :folder,
    name: Map.fetch!(folders, node.folder_id).name,
    children: Enum.map(node.children, &materialize(&1, folders, files)),
  }
end

# File nodes have no children; folders is unused here, hence the underscore.
def materialize(%Node{folder_id: nil} = node, _folders, files) do
  %{
    id: node.id, type: :file,
    name: Map.fetch!(files, node.file_id).name,
  }
end

Which would produce the following “materialized tree”, which we pass to our LiveComponents.

%{
  id: 1, type: :folder, name: "Elixir stuff",
  children: [
    %{id: 2, type: :file, name: "elixir.jpg"},
    %{id: 3, type: :file, name: "jose.png"},
  ],
}

So now, if we received a new folder over PubSub, we would essentially do this:

def handle_info({:folder_updated, %Folder{} = new_folder}, socket) do
  %{root_node: root_node, folders: folders, files: files} = socket.assigns

  folders = Map.put(folders, new_folder.id, new_folder)
  mtree = materialize(root_node, folders, files)
  socket = assign(socket, folders: folders, materialized_tree: mtree)

  {:noreply, socket}
end

MarthinL · February 1, 2025, 2:48pm

Before I can start processing your constructed example I need to first ask this: You refer to two structures, at least in the simplified form both containing a type and a name with the only difference whether or not it has children. Do you really require two types and two structures in order for the sample to remain relevant to your actual data, or can we exploit the notion that the only difference between the two we care about is whether or not it has any children?

If that would be fair to say, you don’t need to keep a type: you always have a children field which, if it’s nil or empty (your choice), means a type of file, and if there are children means the type is folder.

In case it’s what you’re thinking, no, it’s not nitpicking from my side but an important distinction. See, in my interpretation of recursive trees, if files are their own type then the list of files in a folder would be considered straight node contents and not form part of the recursive structure. The structure exclusively concerns itself with actually recursive content, i.e. folders with folders as children (i.e. subfolders) n levels deep.

Suppose your tree is recursive in multiple dimensions, i.e. on top of folders having some files and some folders, files may also have sub-files and sub-folders. That’s not how your example is set up (although in reality it’s what happens with zip files when a file manager shows them as folders), but if it’s closer to how your actual data is structured, then you are most likely justified in keeping your materialised view in the manner you describe.

I too have built two kinds of recursion into my structure. But I present them differently to the user, only one of which involves an actual tree view, while for the other I have novel (custom) nested representations I’ve come up with as key enabling concepts for my app. I handle them independently even though they all interconnect to form the massive globally distributed dataset. This is all part of the pre-work in design I’ve mentioned whereby the user experiences nothing of the overwhelming volume and complex nature of the underlying data. What the user sees can be extremely simple if they choose, or they can explore and manage as much detail and complexity as they feel comfortable dealing with for a while before they encapsulate the complexity again behind a synopsis of their choosing.

The point being that rather than talking in circles around an irrelevant example, I’d rather first understand whether your data and my data share sufficient structural similarities to allow for a common solution.

I believe we’ve already established one potentially viable option with regards to controlling partial updates by using LiveComponents, and that keeping structural data on its own (as a stripped-down copy or in another form) in session state enables better change detection, propagation and handling without the (memory and/or) bandwidth cost of keeping a fully populated tree in memory.

Unless our datasets follow the same or very similar recursive principles, putting implementation details under a stronger microscope could see us talking at cross-purposes. That would become counter-productive, frustrating and an unnecessary distraction for one or both of us. It’s very useful to feed off each other’s insights, experiences and learning, and where possible to discover and leverage relevant options and opportunities and hopefully follow aligned approaches, but it is not important that we use the same code or structures to do so.

My user interface is also highly interactive but in all likelihood in fundamentally different ways than yours. One key difference I see already is that your data is born externally (as RSS feeds) which you’re helping the user track and make sense of by doing as much interpretation of the data as you can muster on behalf of the users, while my point of departure is that the data is born internally (each user’s own) and connected to external data only through the individual (human) interpretation of users. That might be inaccurate or immaterial as a difference, or it could be the core reason why the datasets we use are and must be structured, presented and manipulated differently. I doubt we have the capacity to learn all about each other’s data and use-cases to find that out, so my suggestion is that we keep discussing general principles and opportunities on the abstract level rather than get too caught up contriving an example which represents both our use-cases without directly being either.

MarthinL · February 1, 2025, 2:59pm

garrison:

Which would produce the following “materialized tree”, which we pass to our LiveComponents.
%{
  id: 1, type: :folder, name: "Elixir stuff",
  children: [
    %{id: 2, type: :file, name: "elixir.jpg"},
    %{id: 3, type: :file, name: "jose.png"},
  ],
}

If we must discuss detail, this would be where your example departs from what we’ve been discussing. If there is a LiveComponent per node it should not (need to) be passed the entire tree but only ever a node’s worth of data at a time. It’s OK to keep the structural part of the tree (which you call the materialised tree) in memory, but only once for a specific user session’s currently visible portion of the data. There should not be a copy of it in every node.

garrison:

So now, if we received a new folder over PubSub, we would essentially do this:

def handle_info({:folder_updated, %Folder{} = new_folder}, socket) do
  %{root_node: root_node, folders: folders, files: files} = socket.assigns

  folders = Map.put(folders, new_folder.id, new_folder)
  mtree = materialize(root_node, folders, files)
  socket = assign(socket, folders: folders, materialized_tree: mtree)

  {:noreply, socket}
end

Doesn’t this translate into exactly what we’ve been trying to avoid, i.e. if a branch of the tree changes then the entire branch is re-rendered and propagated?

garrison · February 1, 2025, 6:40pm

When you render a recursive structure, the function to render that structure is also recursive. For example, you might have the following component to render a tree (note that this is not a LiveComponent, just an example):

def tree(%{node: _} = assigns) do
  ~H"""
  <div>
    <h2><%= @node.name %></h2>
    <div class="children">
      <.tree :for={c <- @node.children} node={c} />
    </div>
  </div>
  """
end

And so, you see, you have to pass the entire tree to the root node, and each successive child gets a subset of the tree to render, recursively.

This question is encouraging to me, as I think you’re starting to build a good intuition for the rendering process. But your understanding is still not quite right.

For the example I just gave (the <tree /> component), you are correct. Because function components can’t diff arbitrary collections (like the children lists), the whole thing would be sent back down the wire.

This is where the LiveComponents come in. If you rewrote that <tree /> component to be a LiveComponent instead, each node in the tree would diff-track its own assigns, so only a minimal diff would be sent down the wire for exactly what changed between the previous and current “materialized tree”.

Note that the entire thing is always “re-rendered” on the server, there’s nothing you can do about that (it’s how LiveView works, and it’s how React works too for the record).

But what you can control is the size of the diff sent down the wire, by using LiveComponents.
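To illustrate what “exactly what changed” means, here is a plain Elixir sketch that compares two materialized trees and returns the ids of the nodes whose own fields differ. This is not how LiveView computes its diffs, and TreeDiffSketch and changed_ids/2 are names I made up; it also assumes ids are unique across folders and files. It’s only meant to show that a one-node change in a rebuilt tree is still a one-node change.

```elixir
defmodule TreeDiffSketch do
  # Returns the ids of nodes that are new or whose own fields (everything
  # except :children) differ between two materialized trees.
  # Illustration only, not LiveView's diffing algorithm.
  def changed_ids(old, new) do
    old_index = index(old)

    new
    |> index()
    |> Enum.filter(fn {id, fields} -> Map.get(old_index, id) != fields end)
    |> Enum.map(fn {id, _fields} -> id end)
    |> Enum.sort()
  end

  # Flattens a tree into %{id => node_without_children}.
  defp index(node) do
    children = Map.get(node, :children, [])
    own = Map.delete(node, :children)

    Enum.reduce(children, %{node.id => own}, fn child, acc ->
      Map.merge(acc, index(child))
    end)
  end
end

old_tree = %{
  id: 1, type: :folder, name: "Elixir stuff",
  children: [
    %{id: 2, type: :file, name: "elixir.jpg"},
    %{id: 3, type: :file, name: "jose.png"}
  ]
}

# Rename the second child; the rest of the tree is rebuilt but identical.
new_tree = put_in(old_tree, [:children, Access.at(1), :name], "valim.png")

changed = TreeDiffSketch.changed_ids(old_tree, new_tree)
IO.inspect(changed)
```

Removed nodes are not reported by this sketch; it only looks at what is present in the new tree.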

garrison · February 1, 2025, 6:50pm

I designed this example to be understood, try not to take it too literally.

If you are wondering, my real trees contain many more fields than just a name. Things like icons, collapsed state, and statistics like unread counts that have to be recursively calculated as I “materialize” the tree. I also deliberately left the door open for more than just two “types” because I may want to put other things in the tree in the future (collections of bookmarks, for instance).

MarthinL · February 2, 2025, 8:35am

I didn’t wonder too much about it, but my silent assumption (ass, u, me) was indeed incorrect. In terms I’ve heard you use, your materialised tree is orthogonal to what I’ve been describing as the recursive portion of the data, which has been reduced to just an id per node and separated from the rest of the informational content. Thanks for pointing that out; at least we can stop talking past each other a little bit better now.

MarthinL · February 2, 2025, 5:30pm

Hi @steffend, I’ve been playing around with your example to understand it better, but my changes keep breaking the “Load Children” button. I then added debug statements to see which of the two update clauses fires when, but even with your original version the first update clause never gets called, only ever the second.

Can I ask you to just talk me through what that first update clause is meant to achieve and how it’s supposed to go about doing that? Specifically, the code

(assigns, %{assigns: %{entry: _}} = socket)
is so far outside my frame of reference I just can’t seem to understand it. The only assign that ever shows up in the socket is the one for :myself.

The worst part is of course that your code seems to work as advertised without ever matching on that first clause, but when I change how the tree is rendered from simply nested ul’s to using a nested

    <details><summary>{@entry.name}</summary>
       <ul>
            ...
       </ul>
    </details>

construct it fails. It seems like the correct HTML gets generated (i.e. a new version of “subfolder 2” without the button but with a single child “more.txt”), but instead of just adding the child, or at worst updating both the parent and the child, the result I get is that the original “subfolder 2” with the button below it is left untouched and a duplicate “subfolder 2” with the “more.txt” child is added next to it. I can “at a stretch” imagine the result I see being caused by the code of the second update clause, but I cannot figure out why that same code works for your test case and not for mine.