Advantages and disadvantages of using assign_async for everything

thiagomajesk · March 25, 2024, 5:10pm

Hi everyone!

I took part in a very energetic discussion at work today and left wondering what the advantages/disadvantages would be of using assign_async for everything in a LiveView.

The main points of the discussion were related to avoiding the “double load problem” of LiveViews and avoiding hitting databases twice (and optimizing expensive queries).

Some people have liked that approach so much as to prescribe that any new LiveViews in the project should use this from the get-go. I have some opinions about this, but I’d like to understand more from the community to compile results.

sodapopcan · March 25, 2024, 6:40pm

Ah, the ol’ “Let’s just pick the sledgehammer so we don’t have to worry about those pesky other tools” argument. It’s a great way to get convoluted code no one wants to work on!

I’m a firm believer in “the medium is the message” and that assign_async, connected?, and the double render feature all have their place and act as a signal to the reader. If someone is assign_async-ing {:ok, %{some_string: "some_value"}}, I’d probably waste my time going on a hunt to find whatever magic bottleneck that code is trying to solve, then waste more time reassuring myself there really isn’t one.

thiagomajesk · March 25, 2024, 6:55pm

That is a good take… I wonder though, on a more practical level what would be the impacts of just using assign_async for everything that hits the database (which is quite compelling, to be honest).

As a side note, I know that the BEAM is very capable of handling millions of processes, but if we are going to create a task for each assign for each LiveView, I wonder how quickly this can become a bottleneck (maybe it’s too soon to ask for data around this).

I’m also interested in understanding if there are any specific recommendations from the core team on when we should consider one over the other. It would also be cool if we had more guidelines on the advantages/disadvantages of some approaches, like in this case, where you are trying to solve the load duplication thingy LiveView does.

sodapopcan · March 25, 2024, 7:01pm

That’s a good question. I’m always cognizant of that but I’ve never done any load testing. I just try to keep the number of processes per user to a minimum since it’s usually unnecessary.

Also a good question I don’t have the answer to and would be interested in other takes. Though if the queries are responding sub 200ms or 100ms or whatever your target is, I really don’t see the point, it just becomes confusing as per my examples above (“Wait, what is User.get! doing to be considered so slow???”). But if everyone knows “all db queries use assign_async” that could be ok, I would just be constantly grumbling if I were on that team.

cmo · March 25, 2024, 9:25pm

You’re pushing complexity to the HTML if everything has a loading state. If they’re quick then it will feel jarring to have things pop in at different times very soon after each other.

My rule of thumb is to go async when I can perceive the delay, which is probably half a second or so.

I would imagine that at a certain scale you’re going to be hurting overall performance starting so many processes and at any scale you’re making some things slower than they would have been. Have you ever been surprised that doing Task.async_stream is slower than Enum.map for some things?
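To make the Task.async_stream versus Enum.map point concrete, here is a minimal sketch you can run in a plain Elixir script. The module name, the trivial work function, and the input size are assumptions chosen for illustration, and the timings will vary by machine, so no specific numbers are claimed. The idea is only that when each unit of work is cheaper than starting a process, the concurrent version pays overhead without gaining anything.

```elixir
# Compare sequential and concurrent mapping over very cheap work.
# CheapWorkDemo, work/1, and the input size are illustrative assumptions.
defmodule CheapWorkDemo do
  # Deliberately trivial: far cheaper than starting a process.
  def work(n), do: n * 2

  def sequential(items), do: Enum.map(items, &work/1)

  def concurrent(items) do
    items
    |> Task.async_stream(&work/1)
    |> Enum.map(fn {:ok, result} -> result end)
  end

  # Returns the elapsed microseconds for each approach and checks
  # that both produce the same list.
  def compare(items) do
    {seq_us, seq} = :timer.tc(fn -> sequential(items) end)
    {con_us, con} = :timer.tc(fn -> concurrent(items) end)
    %{sequential_us: seq_us, concurrent_us: con_us, same_result?: seq == con}
  end
end

IO.inspect(CheapWorkDemo.compare(1..10_000))
```

Swapping work/1 for something genuinely slow (for example a call that sleeps) is a quick way to see where the balance tips the other way.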

thiagomajesk · March 25, 2024, 10:16pm

Yes, this is an excellent point… I must admit that I don’t think this applies 100% to the case where you are simply querying the database, but I understand how that could be a problem in some cases. Also, you have a good point on the UX problem that overusing this approach can generate (SEO can also become an issue since you won’t have anything on the first page load).

benwilson512 · March 25, 2024, 11:00pm

Yeah I think the key thing folks need to keep in mind is that the goal here is to improve UX not just have the initial render be as fast as possible. Lots of small changes, and the associated shuffling of content on the page, can be a very jarring experience. Slow loading can also be a bad experience. For the things that load slowly (> 250ms) assign_async is great. But to do it for every DB hit? I sure hope that your DB hits aren’t taking more than 250ms!

thiagomajesk · March 26, 2024, 12:04pm

Other than UX, is there anything else that you would take into consideration before choosing one approach over the other?

nmk · March 26, 2024, 12:18pm

If you are loading data from a constrained, shared resource, e.g. database or a rate-limited HTTP API, beware of exhausting the connection pool.

thiagomajesk · March 26, 2024, 1:26pm

Yeah, I think this is a good rule of thumb for any type of async work, but in this particular case, a traditional assign would hit the database twice for each opened LiveView, so I think you are safe unless you are spawning too many tasks/assigns that hit the database at the same time.

LostKobrakai · March 26, 2024, 1:42pm

Can you expand how fetching data async relates to not doing work multiple times? Moving fetching to be async generally doesn’t change the amount of fetches happening.

sodapopcan · March 26, 2024, 2:04pm

As I understand it, the function passed to assign_async only fires once the socket is connected, so it avoids the double render.

You can still use connected? here if you only have one expensive query and the page needs that data to be useful. I still do, as then I avoid the extra markup and spawning a task. I actually haven’t used assign_async in an actual project yet.
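For readers comparing the two options side by side, here is a rough sketch of each. It assumes a Phoenix project with phoenix_live_view 0.20 or later (where assign_async and the async_result component are available). The module names, Demo.Orgs.get_org!/1, and the name field are hypothetical and stand in for whatever your app uses.

```elixir
# Assumes phoenix_live_view 0.20 or later is available in the project.
# DemoWeb.OrgLive, DemoWeb.OrgConnectedLive, and Demo.Orgs.get_org!/1
# are hypothetical names used only for illustration.
defmodule DemoWeb.OrgLive do
  use Phoenix.LiveView

  # Option 1: assign_async. The function runs in a separate task, which
  # is only started once the socket is connected.
  def mount(%{"id" => id}, _session, socket) do
    {:ok, assign_async(socket, :org, fn -> {:ok, %{org: Demo.Orgs.get_org!(id)}} end)}
  end

  # The template has to handle the loading and failed states explicitly.
  def render(assigns) do
    ~H"""
    <.async_result :let={org} assign={@org}>
      <:loading>Loading...</:loading>
      <:failed :let={_failure}>Could not load the organization.</:failed>
      <%= org.name %>
    </.async_result>
    """
  end
end

defmodule DemoWeb.OrgConnectedLive do
  use Phoenix.LiveView

  # Option 2: connected?/1. No extra task is started. The query is skipped
  # on the dead render and runs inline once the socket connects.
  def mount(%{"id" => id}, _session, socket) do
    org = if connected?(socket), do: Demo.Orgs.get_org!(id)
    {:ok, assign(socket, :org, org)}
  end

  def render(assigns) do
    ~H"""
    <p :if={@org}><%= @org.name %></p>
    <p :if={is_nil(@org)}>Loading...</p>
    """
  end
end
```

Both versions skip the query on the dead render. The first adds a task plus loading/failed markup, the second keeps the query inline in the LiveView process.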

chrismccord · March 26, 2024, 2:53pm

This is a silly premise for the same reason that simply wrapping all your Elixir code in Task.async because “concurrency good” is silly.

The team should refocus on conversations that matter.

As with any code when load/performance is concerned, first make it work, then make it fast, if necessary. Are you measuring load in any of the discussed scenarios? Without measurements everything is a guess, and it’s almost definitely fine as is, given the team’s overly broad stance. Keep in mind the “double render” scenario is only for the initial visit or hard refresh. Live navigation after the initial visit will not incur additional dead renders. LiveView being stateful also reduces DB load for all the interactions because you don’t have to hit the DB to rebuild/reauth for every interaction like a traditional app. So the double render trade is nuanced because you’re reducing load in other scenarios.

tldr; If you have data fetching in mount that strains the DB or is latent enough to harm UX, then absolutely defer loading with assign_async, but doing so as a matter of course is silly.

chrismccord · March 26, 2024, 3:06pm

I’ll also add an advantage of assign_async not discussed yet here, which should be part of your decision process: error isolation. We use processes in Elixir for concurrency and/or isolation. assign_async is the same. It’s great for concurrent async ops, but also equally great for isolating operations that may fail, where you want to reflect the failure in the UI while allowing the rest of the UI to remain functional. For example, communicating with an external resource that can be overloaded, offline, etc. In the discussed case, your primary DB itself being down or overloaded is unlikely to be a graceful failure mode.
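The isolation idea can be seen with plain OTP tasks, outside of LiveView. The sketch below is only an illustration of process-level isolation in general and is not a description of how assign_async is implemented. IsolationDemo and flaky_fetch/0 are hypothetical names, with flaky_fetch/0 standing in for a call to an external service that is down.

```elixir
# Process-level isolation with plain OTP tasks. This shows the general
# idea only; it does not describe assign_async internals.
defmodule IsolationDemo do
  # Hypothetical stand-in for a call to an unavailable external service.
  def flaky_fetch, do: raise("service unavailable")

  def fetch_isolated do
    {:ok, sup} = Task.Supervisor.start_link()

    # async_nolink: a crash inside the task does not take the caller down.
    task = Task.Supervisor.async_nolink(sup, &flaky_fetch/0)

    # Wait up to one second, then shut the task down if it is still running.
    case Task.yield(task, 1_000) || Task.shutdown(task) do
      {:ok, value} -> {:ok, value}
      {:exit, reason} -> {:error, reason}
      nil -> {:error, :timeout}
    end
  end
end

case IsolationDemo.fetch_isolated() do
  {:ok, value} -> IO.inspect(value, label: "loaded")
  {:error, _reason} -> IO.puts("fetch failed, caller still running")
end
```

The caller gets an error tuple it can turn into a "failed" state in the UI, which is the same shape of outcome the post describes for assign_async.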

sodapopcan · March 26, 2024, 3:08pm

Not to get too off topic—although that is my calling card—but this comes up often enough around here. Not on first render but after everything is loaded, people still get really worried about making some additional db calls that are literally just grabbing some records in isolation and not rebuilding the world and re-authenticating.

nmk · March 26, 2024, 3:17pm

That’s exactly what I was thinking about. If you are issuing 10 queries in async tasks with a pool of 10 connections there’s your whole pool being checked out at the same time.
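A small simulation can help picture this. There is no real database or pool here: a short sleep stands in for the time a query holds a connection, and the sketch only counts how many fake queries are in flight at once. PoolPressureDemo and its function names are assumptions for illustration, and the max_concurrency option of Task.async_stream is used purely as a knob to vary how many run at the same time.

```elixir
# Simulates concurrent "queries" and records the peak number in flight.
# No real pool is involved; the sleep stands in for holding a connection.
defmodule PoolPressureDemo do
  # Runs query_count fake queries with at most max_concurrency at once
  # and returns the highest number observed running simultaneously.
  def peak_in_flight(query_count, max_concurrency) do
    in_flight = :atomics.new(1, [])
    peak = :atomics.new(1, [])

    1..query_count
    |> Task.async_stream(
      fn _ ->
        current = :atomics.add_get(in_flight, 1, 1)
        bump_peak(peak, current)
        # Stand-in for the time a query keeps a connection checked out.
        Process.sleep(50)
        :atomics.sub(in_flight, 1, 1)
      end,
      max_concurrency: max_concurrency
    )
    |> Stream.run()

    :atomics.get(peak, 1)
  end

  # Raises the stored peak to current if current is larger, retrying
  # when another task updated the value in between.
  defp bump_peak(peak, current) do
    old = :atomics.get(peak, 1)

    if current > old and :atomics.compare_exchange(peak, 1, old, current) != :ok do
      bump_peak(peak, current)
    end
  end
end

IO.inspect(PoolPressureDemo.peak_in_flight(10, 10), label: "peak with 10 at once")
IO.inspect(PoolPressureDemo.peak_in_flight(10, 3), label: "peak capped at 3")
```

With ten tasks allowed at once, the peak will typically match a pool of ten, which is the situation described above. With a lower cap the peak can never exceed the cap, at the cost of the queries finishing later.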

thiagomajesk · March 26, 2024, 5:58pm

Yes, that was what I was referring to

Couldn’t agree more, but I think there’s probably a better way to articulate this (I’m not saying that there’s anything wrong with what you said, but I’m currently lacking the necessary elements to help people create the same intuition).

One of my objectives with this thread is to understand the pros/cons of both approaches in different scenarios so it becomes easier to provide useful insights to other people. The feeling that I have (from key people driving this conversation at work) is that assign_async is being treated like a silver bullet (which I don’t quite agree with) and should be used as the default option since in their minds we can reap all the benefits “for free”. I remember someone mentioning that this is even the recommended approach to load data (if this is true I can’t find it in the docs).

Also, thanks for reminding me of that, I was under the impression that this was only the case for live sessions. Good to know that’s not the case!

I wish it was that easy… But I hope this thread serves as a good reference for this type of discussion. Thanks a lot for your input by the way.

chrismccord · March 26, 2024, 6:59pm

Yes, sorry if I came across a little harsh, but I am also going for a bit of “come on folks, let’s focus on what matters” wrt the team discussions. My follow up on the considerations for assign_async is really what folks should focus on. In Elixir when we consider a task, we should ask:

Do I need concurrency?
Do I need isolation?

If the answer isn’t obviously yes to either of these, we move on. It’s the same consideration for assign_async. You want concurrency when you don’t want to block on concurrent or potentially long operations, and you want isolation when you don’t want a crash in one process to take you out. Sometimes you need one of the two, and sometimes both.

The issue of double render is way down on the list of reasons you’d consider offloading async work. Also not discussed: if you have queries that are so expensive that an occasional double mount can overload the app, you’re likely already reaching for caching solutions, in which case the cost of the double mount is lessened by the cache. And caching has its own set of tradeoffs, so like my first post, make it work, then make it fast (cache) if necessary. Only measuring will tell you if it’s necessary.