Hey all, I’m building my own little app (side project) and I’m looking for a tree data structure. I’m honestly wondering if I’m overlooking something but I couldn’t find a library that’s right for me.
The whole thing should be “generic” (think, like a DOM), i.e., like this:
- Arbitrary number of children per node.
- Children are ordered.
- Easy to traverse. E.g. there should be “get sibling right after this”, “get parent” etc.
- Effiiciently find child by node ID (no zipper; I wanna build a microservice that manages the tree)
I could use a digraph from Erlang but it’s actually too flexible for my use case and would require a wrapper around it to make sure nobody’s creating a loop.
In particular 5. seems hard to satisfy with the libraries I found.
Do you guys agree that there’s currently no library that does this? Any advice on how I would build this without going crazy?
Thanks a lot!
Not sure I’m following the problem here - you can pass :acyclic
to :digraph.new/1
and the library will handle enforcing that restriction.
I only see 4 numbered bullet points, did something get lost in editing?
“Generic” is hard. Different implementations are going to have seriously different performance (both time and space) for specific operations.
For instance, a tree implemented as a closure table can answer queries like “what are all the descendants of this node” very efficiently. But moving a subtree in that implementation is very slow, because it needs to touch many rows for each element of the subtree.
An adjacency-list representation (aka “put a parent_id
column on the record”) has the opposite properties: moving an entire subtree only needs to update the parent_id
on the subtree’s root, but checking for descendants requires recursive CTEs or other SQL trickery.
Hey, thx for the pointers.
Regarding :acyclic
: This would definitely make it better but acyclic digraphs can still have undirected cycles (two nodes pointing to the same child), so it’s not enough. I guess you can get around that by restricting the access methods a bit more (e.g., don’t allow adding edges, just moving nodes).
Oh, and in a digraph, the children of a node (aka. the edges going out of a node) are not ordered, right?
Regarding 5., this should’ve been 4. (no zippers).
Regarding “generic”, of course you’re right. I think for my use case, local operations (what’s the parent / next sibling / children / etc.; move this subtree somewhere else) should be enough. So adjacency lists are probably the way to go.
I suppose all of these things are not fundamentally hard, was just wondering if someone already did it.
Would the Postgres Ltree datatype help?
NB I’ve built something based on IDs (essentially doing the usual pointer rotations in a set of maps). Will publish something not too far out.
2 Likes