LiveQuery - A new way to load data for your live views

AHBruns · March 23, 2023, 3:41pm

Introduction

I’ve begun work on a library I think will dramatically change (and, I hope, improve) how data is loaded in live views. I wanted to get this post up now to facilitate early design feedback.

The Problem

As I work on larger and larger live views, I’ve found that one issue appears every time, without fail: live views are not open to extension. If you want to add a live component that needs live data (data that is updated in response to changes, usually pub sub events) to a live view, you need to touch everything from the live view down to that new component. This is okay for small live views, but it gets worse as live views grow because a) you have to do more work per change (prop drilling) and b) collaborative development becomes increasingly painful, since everyone keeps editing the same code, which raises the risk of merge conflicts (and botched manual conflict resolutions).

This is not a new problem for view engines, though it is exacerbated in our case by live views being the only components capable of directly subscribing to messages (live components and function components don’t have handle_info), meaning all live data needs to be kept at the very top level even if it’s only used in a small, deeply nested sub-tree of the UI. In theory we could use nested live views to store live data for a sub-tree, but that ends up causing more issues than it solves, since they have no way to re-render synchronously in response to new data.

The Solution Space

Anyway, like I said, this is not a new problem. Luckily, as an old problem, it has some established solutions.

The first is component composition. This is great for a bunch of reasons, but it’s not a total solution because composition comes at a cost. You can no longer use the data passed into you for conditional rendering purposes. E.g. you can’t have a component which takes a user struct and only renders the first name when the first + last is > 50 characters. This logic has to be moved into the caller. At the extreme you end up with a live view that not only loads all your live data, but also contains all the logic which works with that data! This obviously doesn’t scale.

The other solution is data dependency injection. Instead of giving up control over rendering to the caller (component composition), components invert their data dependencies. They take data, rather than being given data. In React this might look like a global store, which a component can subscribe to a slice of. The component takes the data it needs from a well known place, rather than being given data by its parent. This relieves the caller of the responsibility to provide data to its child, and it leaves your component tree open to extension: so long as your store has (or can get) the data the new component needs, you can add that component without the rest of your component tree being updated (or even knowing).

This second approach is how all modern large (50k+ LoC) UI applications are built. Different frameworks have their own libraries and flavors (React/Solid/Vue/Angular-Query, Redux, React’s Context API, Vue’s provide/inject, Zustand, Jotai, MobX, etc.), but they all function off of the same principle: let components take the data they need.

LiveQuery’s Approach

LiveQuery does (will do) this for live views. Unfortunately, LiveQuery will be relatively inefficient, at least initially, because live views lack some key requirements for efficient dependency injection (there’s no way for a live component to subscribe to an external store). However, trading rendering efficiency for development speed has historically proven to be a good bet (see the UI = f(state) movement).

The idea is fairly simple. When you add LiveQuery to a live view, all your live data gets moved out of your assigns and into an external store (probably somewhere in the process dictionary, unfortunately). In place of all your data, you get a store reference that you keep in your assigns. This reference contains 2 things: 1) the info required to find your store and 2) a version number which is incremented every time any data is updated in your store. You then pass this store reference as an assign everywhere. It should be ubiquitous.

Data loading then works like this: you take your store reference and either grab the data that’s already there or, if it hasn’t been loaded yet, load the data. When you render your component tree, every component gets the data it needs, and they all share the same data, even different components, since they’re all loading from the same store.

The store is really just a fancy key-value map where keys indicate the query performed to get the data, and values are the data. In addition to just storing data, the store also allows queries to define pub sub subscriptions (which it dedupes) + pub sub event handlers. The idea is that a query not only defines how to load some data, but also how to invalidate/update that data over time.
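To make the idea concrete, here is a minimal sketch of such a store. It is written in Python purely as a neutral illustration and is not LiveQuery’s actual API: the names Query, Store, get, handle_event and subscribe_calls are all hypothetical, and the real pub sub layer is replaced by a plain list that records which topics were subscribed to.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Hashable, List, Set, Tuple


@dataclass
class Query:
    """Hypothetical query definition: how to load data and how to keep it fresh."""
    load: Callable[[], Any]               # loads the initial data
    topics: Tuple[str, ...]               # pub sub topics this query cares about
    on_event: Callable[[Any, Any], Any]   # (current data, event) -> new data


@dataclass
class Store:
    """Hypothetical key-value store; keys identify queries, values are their data."""
    data: Dict[Hashable, Any] = field(default_factory=dict)
    queries: Dict[Hashable, Query] = field(default_factory=dict)
    subscribed: Set[str] = field(default_factory=set)
    # Stand-in for a real pub sub client: records each subscribe call we would make.
    subscribe_calls: List[str] = field(default_factory=list)
    # Bumped whenever an event updates data; this is what the store reference carries.
    version: int = 0

    def get(self, key: Hashable, query: Query) -> Any:
        """Return the data for key, loading it (and subscribing) on first use."""
        if key not in self.data:
            self.data[key] = query.load()
            self.queries[key] = query
            for topic in query.topics:
                # Dedupe: subscribe to each topic once, however many queries use it.
                if topic not in self.subscribed:
                    self.subscribed.add(topic)
                    self.subscribe_calls.append(topic)
        return self.data[key]

    def handle_event(self, topic: str, event: Any) -> None:
        """Route a pub sub event to every loaded query subscribed to the topic."""
        touched = False
        for key, query in self.queries.items():
            if topic in query.topics:
                self.data[key] = query.on_event(self.data[key], event)
                touched = True
        if touched:
            self.version += 1
```

In this sketch, two components that call get with the same key share one load and one subscription, and a single incoming event bumps version once, which is the signal a live view would use to re-render.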

The downside of this approach is obvious: whenever any live data changes, the entire component tree must be re-rendered, since there’s no way to know which sub-trees need which data. However, if this pattern catches on, as I expect it will, LiveView could add some very simple callbacks to allow more granular re-rendering behavior (specifically an unmount callback for live components).

Finally, while I’m initially targeting this library for single process use, one could imagine sharing a store between many different live views on the same node, dramatically reducing memory usage and database load. In effect, this is a bespoke cache. No reason it can’t be used by multiple processes. This is, of course, a down the road idea though.

Development

I’ve already started working on this library and should have a beta release out in a week or so. I’m making this thread now to gather:

Feedback about the design. Can you think of a more efficient implementation given live view’s API’s current limitations?
Feedback about the pattern as a whole. Have any of you built very large live views (50k+ lines)? If so, how did you solve this scaling issue?

tfwright · March 23, 2023, 4:09pm

In my experience with React/Redux, the biggest downside to that design is that all your views become dependent on the store, and thus any changes to the store have the potential to break an indeterminate number of views in subtle and unpredictable ways.

For example, I have seen an app where view A (“edit person”) subscribes to a slice of the store via a selector like getPerson expecting it to have the props first_name, middle_name, and last_name, and view B (“show person”) uses the same selector expecting the prop full_name. I was brought in to work on the JSON REST API primarily, which was suffering from a lot of performance problems, one of them being that it did not adhere to the sparse fieldset spec. But trying to refactor it was a nightmare because every view in the whole app expected all the data that was possible to request was always available. To make matters worse, well-meaning FE devs had sprinkled safe navigation operators everywhere to handle the myriad cases of missing data, but that just meant we got bug reports about missing data that were much more time consuming to diagnose than actual page crashes. As this app had been using React/Redux for a few years this was dozens of views with hundreds of selectors. Tests, of course, were almost non-existent.

Now, you could argue this was a bad design/use of Redux, but that would make it at best a pretty fatal foot cannon. Also, testing tools/culture might be much better with LV, but of course we can’t depend on tests to catch everything. Do you have an idea about how to address this issue in your project? Because my feeling was Redux created way more problems than it solved vs just simple prop drilling and we now only use Redux for “truly global data” (like current user data).

AHBruns · March 23, 2023, 4:42pm

So, I see this as a problem with applying the reducer pattern to global stores, not global stores in general.

Global stores provide state that all components know about and can subscribe to. This is what allows data dependency injection.

The reducer pattern, state = reduce(state, action), is one way you can go about updating a global store, but it is by no means the only one. LiveQuery will not use it, at least externally.

Instead, LiveQuery will have a very, very similar API to React Query. Its store will have a fixed structure, %{key => data}, and it will expose some basic functionality to update that store in predefined ways. Basically just:

getQuery/2 which will get the data for a query, loading it if needed.
invalidateQuery/2 which will delete a query’s data, and trigger a re-render (during which the deleted query may be loaded again if it’s still needed).
updateQuery/2 which will allow updating/setting a query’s data in place (useful for stuff like optimistic updates).
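As a rough illustration of how small that surface is, the three operations could be sketched like this. This is Python used only as a neutral sketch language, not the library’s real API: the QueryStore class, its method signatures and the on_change callback (standing in for “trigger a re-render”) are assumptions, and having update_query also trigger a re-render is a guess on my part.

```python
from typing import Any, Callable, Dict, Hashable


class QueryStore:
    """Hypothetical flat key -> data store with a fixed set of operations."""

    def __init__(self, on_change: Callable[[], None]):
        self._data: Dict[Hashable, Any] = {}
        # Stand-in for "trigger a re-render of the live view".
        self._on_change = on_change

    def get_query(self, key: Hashable, load: Callable[[], Any]) -> Any:
        """Return the data for key, calling load only if it is not cached yet."""
        if key not in self._data:
            self._data[key] = load()
        return self._data[key]

    def invalidate_query(self, key: Hashable) -> None:
        """Drop the cached data; the next get_query for this key reloads it."""
        self._data.pop(key, None)
        self._on_change()

    def update_query(self, key: Hashable, update: Callable[[Any], Any]) -> None:
        """Replace the cached data in place, e.g. for an optimistic update.

        update receives the current value, or None if nothing is cached.
        """
        self._data[key] = update(self._data.get(key))
        self._on_change()


# Usage: an optimistic rename, followed by an invalidation that forces a reload.
store = QueryStore(on_change=lambda: None)
user = store.get_query(("user", 1), lambda: {"name": "Ann"})
store.update_query(("user", 1), lambda u: {**u, "name": "Bob"})
store.invalidate_query(("user", 1))
```

Note that there is no place here to put arbitrary application logic: consumers can only read, drop, or replace the data behind a query key.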

By structuring the store around your queries rather than your application state, we alleviate the issues you discussed because it allows selectors to no longer be tied to your application’s structure. Instead they’re tied to your queries, and it’s up to the consumer to coerce the query data it grabs into a shape it can use. This ends up having better performance characteristics since you think about how you’re getting your data.

AHBruns · March 23, 2023, 4:50pm

It’s also worth pointing out that Redux has itself realized this issue in the last few years, and Redux Toolkit (RTK) now strongly recommends against handling data loading in application code (e.g. load → coerce to application state structure → store). Instead, it now recommends using something like RTK Query, which provides a very similar API to the one I’ve described, where data is loaded and stored in a flat key-value cache-like structure which exposes some generic operations.

The RTK Query docs really sum this up well:

Over the last couple years, the React community has come to realize that “data fetching and caching” is really a different set of concerns than “state management”. While you can use a state management library like Redux to cache data, the use cases are different enough that it’s worth using tools that are purpose-built for the data fetching use case.

I guess you could say I’m banking on the LiveView community realizing that “live data querying and caching” is really a different set of concerns than “state management”.

tfwright · March 24, 2023, 3:14pm

I am not an expert in React development, much less Redux, but I am vaguely aware that they have been making changes, and that makes some sense. It would probably be more useful for someone who is more familiar with that side of things to chime in here. I would be interested in learning more about how what you’re proposing is different from the reducer pattern (I would probably need to see some examples).

But generally speaking from the point of view of someone who has to manage the DB/API side of things, I am wary of FE oriented state management abstractions that grow and “compete” with the abstractions on the BE (the performance issues I mentioned above were partially caused by FE needs being catered to without enough resistance by my predecessor, who in fairness simply had bigger fish to fry). But LV is really an entirely different ballgame in that regard…which is precisely why so many people are excited about it.

To sum up, right now LV’s approach to state management, if you can call it that, is simple, but cumbersome. The line in your write up that scared me a bit was “In effect, this is a bespoke cache”, which is exactly what I feel the problem with Redux’s “reducer” pattern was: the FE built its own cache of the data and then pushed all the well known problems of managing the cache back to the API, because fixing issues on the FE where they should be fixed (changing which data was fetched inside the component being changed) became too difficult.

AHBruns · March 26, 2023, 12:53pm

So, this is on me. I misspoke. When I said this is a “bespoke cache”, I actually meant that this is a “generic cache”.

That is what makes it different from a reducer store. You can’t just do anything (like you normally can in a reducer function). The store has a strict structure (key => query) and a fixed set of operations (get_query/invalidate_query/update_query).

Sorry if that was confusing. I unintentionally said the exact opposite of what I meant!

AHBruns · April 20, 2023, 7:02pm

So, I have a beta release out. I took the time to write up exactly what problem LiveQuery solves and how it solves that problem, here.

I will need to write a lot more documentation and tests before I consider this production ready, but I do have a playground app using it here which you can look at to see how this works from a consumer’s perspective. You’ll notice there is absolutely no “reducer” style code here. You effectively just write a simple GenServer that loads/reloads your data, and LiveQuery handles starting it when you need it, preventing it from crashing when you have buggy code, shutting it down when you’re done, caching reads in your live view to ensure stable data throughout each render, etc. It does quite a bit, but has a surprisingly simple API.

@tfwright, if you have a moment, I’d love to hear your thoughts. I think this avoids the concerns you had, but maybe I’ve missed something. If so, I’d like to know.

AHBruns · April 25, 2023, 5:38pm

Just a little update

I’ve come to the conclusion that :live_query’s core (on-demand live data) is unrelated to its first use-case (live views). So, I’m going to be splitting the package up into :live_view_query, which will depend on a core package called :live_query. Once those packages are relatively stable, I have plans to implement :phoenix_channel_query (consume live data on the client via a Phoenix Channel) and :gen_server_query (consume live data from a regular old GenServer).

This will take a little while to do, but I think separating the pieces up front will improve maintainability long term.

AHBruns · July 24, 2023, 7:21pm

The day has come!

I’m excited to announce that both LiveQuery and its companion live view client library have officially launched.

I’ve tested them both extensively, and written heaps of documentation (and yet somehow there’s still more to write). Please consider these release candidates. They’re still in beta due to a lack of production use, but I feel confident the API surface will stay relatively stable from here until v1.

I plan to produce a lot more documentation and educational material over the next few months, but I believe there’s enough out for an interested observer to get a project using LiveQuery up and running today! If you try it and run into issues, please ping me. I’m happy to provide white glove support to anyone willing to take the time to use my code.

codeanpeace · July 24, 2023, 11:05pm

This is really cool! I like how you separated out :live_query core from :live_view_query and can see how LiveQuery could be a nice abstraction for larger apps that would benefit from a generic cache – especially since it seems like you’re looking into an ETS based approach for better concurrency based on your note here.

That said, I just wanted to point out another option for propagating pubsub events, somewhere in between prop drilling and what you’re proposing with LiveQuery, that could be beneficial for “medium sized” LiveView apps.

By pairing handle_info/2 and send_update/3 in a LiveView, you can propagate pubsub events without prop drilling by asynchronously updating a live component with new assigns via its id. This pattern could be especially “useful for updating a component that entirely manages its own state” as pointed out by the send_update/3 docs.

So while it’s true that only live views can subscribe to messages via handle_info/2, the conclusion drawn here isn’t necessarily true as a LiveView could relay the pubsub message to a nested LiveComponent via send_update/3 without storing it at the top and passing it down via props.

AHBruns · July 24, 2023, 11:25pm

So, I’ve used the approach you’ve described for a long time, and it absolutely works. However, it has an Achilles’ heel:

Live components can’t clean up after themselves.

In other words, while a live component can autonomously subscribe to a topic while mounting via the scheme you’ve described, it can’t clean up its subscription when it unmounts. There’s no callback. This means you’ll end up with orphan subscriptions. For complex live views this could mean a lot of orphan subscriptions.

I tried getting the necessary callbacks added to LiveView, but the suggestion was shot down. Here’s the GitHub issue if you’re interested: Feature request: live component unmount callback · Issue #2454 · phoenixframework/phoenix_live_view

codeanpeace · July 25, 2023, 12:37am

Thanks for linking to the interesting discussion and background there.

It seems like it’s specifically the conditional rendering of LiveComponents that would result in these orphaned subscriptions for unmounted LiveComponents in complex LiveViews. For simpler medium sized LiveViews that don’t have that problem or aren’t sending many updates to begin with, it could still be a good approach from a simplicity standpoint. For others, LiveQuery could well be the right tool for the job.

While not an ideal solution, one possible workaround off the top of my head would be to have LiveComponent acknowledge receipt so the LiveView can assume those that don’t have been unmounted and unsubscribe them. That would curtail the long tail of updates sent to orphan subscriptions, but would also obviously add a good bit of complexity and may well not be worth it. ¯\_(ツ)_/¯

Regardless, it was interesting to see what drove you to create LiveQuery!

AHBruns · July 25, 2023, 7:21pm

Yeah, the acknowledgment approach is interesting. I’m thinking something like:

send the update to the live component
immediately after send a message to yourself
have your consumer live components synchronously write an ack to a known place in the process dictionary when they receive updates

Then, if you receive the message to yourself without an ack having been written to the process dictionary, you know the update was never received and the live component has been unmounted, since the BEAM guarantees that messages sent from one process to another arrive in the order they were sent.
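Here is a toy simulation of those three steps, written in Python only to show the ordering argument. Everything in it is a stand-in: ViewProcess, relay and run are hypothetical names, the deque plays the role of the live view’s mailbox, a dict plays the process dictionary, and “the component is mounted” is modelled as membership in a set.

```python
from collections import deque


class ViewProcess:
    """Toy stand-in for a single live view process with a FIFO mailbox."""

    def __init__(self):
        self.mailbox = deque()
        self.process_dict = {}    # stand-in for the process dictionary
        self.mounted = set()      # ids of components that are currently mounted
        self.subscribers = set()  # ids of components we still relay updates to

    def relay(self, component_id, update_id):
        # Step 1: send the update to the live component.
        self.mailbox.append(("update", component_id, update_id))
        # Step 2: immediately after, send a check message to ourselves.
        self.mailbox.append(("check_ack", component_id, update_id))

    def run(self):
        # Messages are handled strictly in the order they were enqueued.
        while self.mailbox:
            kind, component_id, update_id = self.mailbox.popleft()
            if kind == "update":
                if component_id in self.mounted:
                    # Step 3: a mounted component synchronously writes its ack.
                    self.process_dict[("ack", component_id)] = update_id
                # An unmounted component never sees the update, so no ack is written.
            elif kind == "check_ack":
                acked = self.process_dict.get(("ack", component_id))
                if acked != update_id:
                    # No ack for this update: treat the component as unmounted
                    # and stop relaying to it.
                    self.subscribers.discard(component_id)


view = ViewProcess()
view.mounted = {"chart"}
view.subscribers = {"chart", "sidebar"}
view.relay("chart", 1)
view.relay("sidebar", 1)
view.run()
# "sidebar" was never mounted, so only "chart" remains a subscriber.
```

The check only works because the update is always handled before the check message that follows it, which is the ordering guarantee the scheme relies on.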

I may actually implement this for LiveQuery since it would allow me to target re-renders based on what’s using a query rather than re-rendering the whole live view every time any used query changes. Only downside is that, if two live components are using the same query they will be updated in different renders allowing for tearing where different UI components show conflicting data.

I suppose I could always fall back to rendering the closest common ancestor. It would be complex as hell to implement, but that’s the point of libraries: eating the complexity so consumers don’t have to.

AHBruns · July 25, 2023, 7:23pm

I think I’ll break this out into yet another library, since the concept of reliable sub-process messaging in live views is much broader than LiveQuery.