LiveView InfiniteScroll misses events during event storms

Hi all,

Central to Phoenix and LiveView’s charm is that most users, including myself, prefer to leave the bits of javascript coding it requires to the professionals. While that’s a wonderful thing, it also has the side-effect that when you do need to step into that world you have an extremely small set of very busy individuals involved enough to offer any insights or shared experiences. That’s where I’m finding myself today and I guess it means I’m reaching out to our javascript unicorns here.

I’m getting tripped up by the intricacies of implementing infinite scroll and I’m looking for others who may have written parts of, tested or used or tried to use the stock implementation that exists to support infinite scroll and perhaps encountered the same issues.

Since my pagination does not align with the offset and limit principle, my DOM structure doesn’t align with the semantics of firstChild and lastChild, and I need infinite scrolling in both the horizontal and the vertical directions simultaneously, I can’t directly use the provisions made for infinite scroll in hook.js and thus had to implement my own hook and support functions.

Before starting my own implementation I followed the example from the guide, and though I got it working it never worked in a way I’d describe as robust. I could get it to scroll up or down one line at a time if done slowly, but when making big jumps or repeating even small jumps too quickly it seemed to lose track of things and either just stop scrolling or detect an overrun and revert to the first page. I thought at the time that I’d seen enough of how the hook and sample went about their business and started with my own implementation.

Though doing my own implementation was rough for many reasons, the JavaScript I ended up with has little in common with the original except for the notion of a pendingOp. I even used a completely independent implementation of debounce rather than the stock code’s throttle implementation. I copied the pendingOp concept in an attempt to clean up the rather messy alternative of adding and removing event listeners to avoid the infinite loop created by calling element.scrollIntoView while handling a scroll event. I couldn’t faithfully copy the pendingOp concept since my code was structured differently, so I just did something similar instead.

All of which points towards my code and the stock code having less and less in common. Yet, and this is what baffles me, I ended up with JavaScript scroll event handling that gets overwhelmed in what appears to be the exact same way: if I scroll in small steps, not too quickly, it behaves perfectly, but when I try to move around too quickly I get to a point where the content is physically scrolled to one of the ends without the listener getting triggered for those events.

My working hypothesis is that the issue must lie in the debouncing or throttling mechanism being oblivious to the pendingOp semantics, in both the stock and my own implementation, so it ends up swallowing events that should fire and firing events that have no effect.

Any advances on that? If you have any experience with or insights into how the stock code is meant to work or under what conditions it fails I’d love to take your accounts of that on board before I choose which rabbit hole to try next.

To be clear, this is about things that happen with the stock standard InfiniteScroll hook as called upon by the example code in the LiveView guide. Posting my version of code that goes about its mildly different business in a slightly different way will just complicate the discussion. I believe that if I can gain an understanding of what trips up the stock code, it’s likely the same thing that’s tripping up my code. If I’m wrong about that it’s on me, but then at least we’d have managed to identify and hopefully address an issue with the standard LiveView code.

P.S. A potentially key issue I’ve encountered here stems from the sample code in the LiveView guide under Client-side Integration / Bindings / Scroll events and infinite pagination being given as code snippets, leaving a lot of guesswork in setting up a test bed for the facility. It might help if someone who genuinely understands how it is meant to work, including what the requisite environment would be, compiled a complete reference implementation of infinite pagination / scrolling as a gist or sub-project. It’s just a suggestion, in acknowledgment of the possibility that the errant behaviour I found is the consequence of my incorrectly interpreting what the guide meant to convey.

I had to write my own, I had issues with the stock hook (I don’t remember what they were anymore) which I think is essentially deprecated in favor of phx-viewport-top/bottom now anyway. Unfortunately I can’t use the latter because I believe they are designed only for Streams (and, well, you know). Scrolling down works okay, but when scrolling up you have to do some clever tricks to realign the scroll position synchronously or the prepended content will push the content which was at the top down, maintaining the scroll position, and causing a loop which successively preloads you back to the top (since the scroll position is never pushed back down).

I suspect the code to correct the scroll position lives somewhere in the Streams prepend implementation, though I never found it…

There are a few different things that can cause this. For one, if you debounce rather than throttle and then scroll quickly you can overrun the “buffer” before another debounced event fires and then get stuck at the bottom. If you throttle (and make sure to check the scroll position lazily) this won’t happen.
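As a rough illustration of the "throttle and check lazily" point, here is a minimal sketch. The names `throttleLazy` and `nearEnd` are hypothetical and do not come from LiveView; the idea is only that the scroll position is read when the throttled callback actually runs, not when the event arrived.

```javascript
// Hypothetical helper: a trailing-edge throttle. The callback takes no
// arguments on purpose, so it has to read the scroll position itself at the
// moment it runs instead of reusing a value captured when the event arrived.
function throttleLazy(fn, waitMs) {
  let timer = null;
  return function throttled() {
    if (timer !== null) return; // a run is already scheduled; drop this event
    timer = setTimeout(() => {
      timer = null;
      fn(); // the position is read inside fn, i.e. lazily
    }, waitMs);
  };
}

// Hypothetical check: true when the viewport is within `buffer` pixels of
// either end of the scrollable content.
function nearEnd(el, buffer) {
  const distanceToBottom = el.scrollHeight - el.clientHeight - el.scrollTop;
  return el.scrollTop <= buffer || distanceToBottom <= buffer;
}

// Possible wiring in a browser (container and loadMore are assumptions):
// container.addEventListener("scroll", throttleLazy(() => {
//   if (nearEnd(container, 200)) loadMore();
// }, 100));
```

Unlike a debounce, this always runs once per wait window while events keep arriving, so a long fast scroll still gets checked along the way rather than only after it stops.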

Another problem is if you overrun the buffer while fetching the next page. Fetching a page needs to take some sort of lock so that you don’t fetch multiple times (I think pendingOp serves this purpose in that hook, not certain). If you overrun the buffer during the fetch the scroll will lock up.

There is only so much you can do about that one, I think my implementation still has that bug. Making the scroll buffer larger makes it much less noticeable, and since most infinite scrolls are bugged in this way I think users probably expect it when they scroll fast enough. You could probably fix it for good by checking the scroll position after a new page has loaded. Which I just thought of, actually…
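The "check the scroll position after a new page has loaded" idea could look roughly like the sketch below. `createPager`, `fetchPage` and `isNearEnd` are hypothetical names; `fetchPage` is assumed to accept a callback that it invokes once the new page is in the DOM.

```javascript
// Hypothetical pager: one fetch in flight at a time, plus a re-check of the
// scroll position once each page has arrived.
function createPager(fetchPage, isNearEnd) {
  let loading = false; // the lock

  function maybeLoad() {
    if (loading || !isNearEnd()) return;
    loading = true;
    fetchPage(function onPageLoaded() {
      loading = false;
      // If the user overran the buffer while the fetch was in flight, no
      // further scroll event may arrive to trigger the next load, so the
      // position is checked again here.
      maybeLoad();
    });
  }

  return maybeLoad;
}
```

`maybeLoad` would be called from the (throttled) scroll handler; the re-check stops on its own as soon as `isNearEnd()` returns false.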

In general, infinite scrolling is surprisingly tricky to get right. It takes a lot of tuning, I’m still not totally happy with mine. Doing it in X and Y sounds like a further challenge.

The hooks.js and sample code I refer to are exactly the version based on phx-viewport-top/bottom (which I can’t use either), but the bug is present in that combination nevertheless.

Not hard to spot that in the current code - calls made to scrollIntoView. I use it too and found it to be the main reason why intervention is required to prevent an event loop.

Maybe I wasn’t clear enough. The bad behaviour happens in the original version that uses throttle as well as in my version using debounce. To me that reads as the root cause cannot be the difference between throttling and debouncing, but might yet have something to do with using any type of mechanism that discards some events while allowing others through. It seems likely that thinning out event storms (by either method) without taking into account whatever mechanism is in use to break the infinite loop created by calling scrollIntoView in the course of handling a scroll event is at the heart of the problem.

I’d love to understand exactly what you, and the original authors of the hook, mean by overrunning the buffer. A fair portion of the original hook code is spent on detecting and recovering from overrun, but I’ve not implemented any such logic simply because I don’t yet know for real what problem it solves and how. While it would be easy to lay the blame for my code not working as expected on the lack of buffer overrun detection and handling, it cannot possibly be the mysterious thing my code and the original code have in common that causes the same buggy behaviour in both cases.

Oh boy, my core being is not compatible with the notion of a bug that cannot be solved. Simply does not compute for me. Users expecting and accepting something that breaks is just as unthinkable to me. Sorry, it’s a technical issue and unlike soft issues all technical issues have solutions.

As for post-update checking, I’ll give that a thought, but my initial assessment is that my code, at least, is already doing something like that. The problem is that the underlying DOM and hook variables are left in a state that doesn’t correspond with expectations because we missed the notification of the user expressing their expectation. So ensuring internal consistency might be exactly counterproductive in this case. But thanks, I’ll continue to ponder the opportunity.

I think what has surfaced here implies the opposite of tuning being required. It seems more likely to me that the original code and mine which had been affected by the approach taken in the original code but implemented with entirely different code and mechanisms are both victims of the same fundamental thinking error which no amount of tuning could ever get rid of.

Doing X and Y scrolling together isn’t really hard at all, it’s just that there is no provision for the second direction in the standard implementation. If I did pure scrolling in both directions the implementation would in fact be very close to trivial. The thing I do for the recursive data is to combine scrolling and zooming in one of the directions so that you effectively zoom into the tree as you scroll into its branches and out when you scroll the other way. That complicates matters a tad, but not really much at the JavaScript and LiveView levels, which work with these rectangular grids of data. Most of the tricky bits are dealt with at the business logic and database level.

Re-reading this part it made even less sense to me the second time round. If that’s an accurate description of what happens I’m entirely lost for it doesn’t correspond with what I see happening in the browser. It’s not that I don’t know what you’re talking about or know nothing about what you’re talking about but what I know about that what you’re talking about doesn’t tie up with what you describe as happening to any discernible extent. Sorry, I don’t mean disrespect or to put you down. I’m just unable to process that passage in my frame of reference.

Having gone through that code myself, and recently too, I can confirm that I could see absolutely no dependency between phx-viewport-top/bottom and streams. They’re events that fire when the scroll container’s firstChild or lastChild (element property) falls within / becomes visible inside the scroll container.

If streams are in use, exactly which elements will be the first/lastChild will depend on correct and accurate handling of the streamed content, but that is an entirely separate concern.

I was aware of that code, but I always thought it was some legacy infinite scroll implementation. But I see that you’re right, the hook is implicitly attached to the element when you use phx-viewport-*. I don’t think this is actually documented anywhere.

I did not find any explicit dependency either, I figured it was implicit somehow in the way Stream prepends are applied. I guess not?

Either way, when I tried to use phx-viewport-* without Streams it did not work properly, and now I have no idea why. There is no documentation to use the feature without phx-update="stream" so I just assumed it was a requirement.

I can try to be more thorough if you want, it was a digression.

When a child is added to an element its height grows but the scroll position remains the same. So if you append to an element, everything looks right. But if you prepend there is a layout shift downward, which looks buggy.

But it’s worse than that. As you know the scroll event has to check if you are within a certain distance from the top/bottom of the element’s true height in order to trigger pageloads in either direction. If you append, the bottom of the element is moved away from the scroll position and it’s fine. If you prepend, the scroll position stays the same distance from the top, and so it tries to prepend again, and so on, which is very broken.

You may be correct, I did not spend any time reverse-engineering that hook because I didn’t think it had anything to do with the phx-viewport-* bindings (although I now know it does). It’s not immediately clear to me how calling scrollIntoView() in that way solves this problem, so if you’ve already done that work feel free to save me some time by explaining it :slight_smile:

My own solution was to measure the height difference before and after the DOM is patched using a hook, which allows me to offset the scroll position perfectly. It’s very smooth.
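A minimal sketch of that height-difference idea, not the poster’s actual code: the function and hook names are hypothetical, and it assumes every patch that grows the container added its content above the viewport (a real hook would need to know whether the update was a prepend or an append). `beforeUpdate` and `updated` are the LiveView hook callbacks that run around a DOM patch.

```javascript
// Hypothetical helper: shift scrollTop by however much the content grew, so
// whatever was on screen before the patch stays where it was.
function restoreScrollAfterPrepend(el, heightBefore) {
  const added = el.scrollHeight - heightBefore;
  if (added > 0) el.scrollTop += added;
  return added;
}

// Hypothetical hook object; `this.el` is the scroll container.
const KeepPositionOnPrepend = {
  beforeUpdate() {
    // Measure before LiveView patches the DOM.
    this.heightBefore = this.el.scrollHeight;
  },
  updated() {
    // Measure again after the patch and compensate synchronously.
    restoreScrollAfterPrepend(this.el, this.heightBefore);
  },
};
```

Because the scroll position moves away from the top by exactly the height that was added, the "still near the top, prepend again" loop described above does not start.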

It almost certainly does, yes.

There are many possible causes of this bug, I was only listing some of the ways it can happen. Note that my wording was very careful: “check the scroll position lazily”. If you were to check the scroll position before setting the timeout for the throttle you would have another way to trigger such a bug.

Of course I cannot debug code I can’t see, so you’re on your own there.

Buffer referring to the padding applied to the top/bottom of the element, the purpose of which is to give the user room to scroll while the network loads the next page. Overrun meaning reach the end of it: scroll all the way to the top or bottom.

They make special mention of it because they implement a feature which takes you all the way to the top if you hit the end of the buffer, I think the idea being to ensure the home/end keys (or very fast scrolling) just sends you all the way back to page 1. I did not implement this either (I would probably detect a home keypress explicitly instead).

I agree. The pageload locking up can be fixed with more careful checks, but the scroll momentum getting interrupted (on macs or mobile devices) can only be mitigated with more padding as far as I am aware. That’s what I meant.

In theory the problem would be solved with a perfectly virtualized list, where the padding space is exactly the height of the real content. In reality even the best virtualized lists I have seen (Apple’s) often jump around a bit, and doing this on the web is effectively impossible.

Ah, that old chestnut. Well, that’s something else I didn’t adopt out of the original code, so it’s probably not the common cause for the same symptoms in both implementations. To clarify, I didn’t quite understand where the padding would have to get applied in my case, because in the simplistic up/down case it’s associated with the first and last child except when those are marked as being the actual first or beyond the last element. Once again, because I don’t have such a linear first and last element, that wasn’t a good option for me. Though I use a standard UI element with a standard linear list of inner elements for the content, I have a separate way of controlling the display order explicitly to create the tree effect. That meant I didn’t need to be concerned at all about the actual order of children in the DOM, whether they’re there in some arbitrary order because of scrolling back and forth or because of updates coming in from PubSub notifications. I can append or prepend the additional content any way I prefer because it won’t make any difference to the layout and doesn’t need to be consistent between clients either. That’s an enormous amount of weight off one’s shoulders, and I’m certain it will ultimately make the big difference in being able to limit what data gets updated by using streams containing only the deltas from one navigational rectangle to the next, whether they differ by one row/column or come from completely disjunct parts of the tree.

Yet I digress. I’ve chosen a different mechanism to serve a similar purpose to the padding on the first and last element. What I am doing is loading content that is wider and taller than the rectangular data window that forms the logical equivalent of the page. It means I can let the native scrolling do its built-in thing and show the buffered data, which is laid out correctly to simply continue the tree display. That can be done instantaneously on the client, just the same as if the user had been scrolling around within the window. But as soon as the user’s scrolling has revealed any portion of these added buffer areas, I detect that in the scroll event handler (which fires after the scrolling has taken place) and “rectify” the buffering by loading more data to extend the window and buffering in the direction of the scrolling. I had to bring in temporary visual cues for when this is happening in order to confirm it’s happening correctly, because without them the scrolling is so smooth that you cannot detect any difference between the two types. The tricky part was (and still is) making sure that when the page update with the additional data arrives it slots into the visible data perfectly, i.e. without any additional movement on the screen. Without scroll snapping it would have been 10 times harder to get that right, but still doable because scroll position is sub-pixel accurate for that very reason. What I’m really trying to say is that the whole thing about manipulating the padding is a foreign concept to me. That’s probably not where the stock code is getting it wrong.
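To make the "buffer area revealed" check concrete, here is a small sketch under assumptions that are not from the post: the buffered rows and columns are assumed to occupy fixed pixel bands along each edge of the scrollable content, and `revealedBuffers` is a hypothetical name.

```javascript
// Hypothetical check for a container that scrolls in both directions.
// `view` exposes the usual scroll metrics; `buffer` gives the thickness in
// pixels of the pre-loaded band along each edge of the content.
function revealedBuffers(view, buffer) {
  const rightEdge = view.scrollLeft + view.clientWidth;
  const bottomEdge = view.scrollTop + view.clientHeight;
  return {
    up: view.scrollTop < buffer.top,
    down: bottomEdge > view.scrollHeight - buffer.bottom,
    left: view.scrollLeft < buffer.left,
    right: rightEdge > view.scrollWidth - buffer.right,
  };
}
```

A scroll handler could call this after each scroll and request an extension of the data window for every direction that comes back `true`.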

I’ve tried rather hard to make it clear that this is not about debugging my code. I’ll do that myself and hope to be done before I upload the project for others to see. The stock code is available for all to see, documented as far as deemed viable, and yet seems to have the exact same shortcoming, presumably because of the same fundamental error in the thinking or approach, which is the only common thing between my code and the stock code. The focus should be squarely on the code that sits in the LiveView repository today: somewhere in there hides a design flaw or reasoning error which, once we identify and understand it, would perfectly explain the errant behaviour we see under certain circumstances. My gut tells me it’s the disconnect between using a pendingOp to mitigate the endless looping and any sort of event thinning going on. But I don’t understand the original implementation’s pendingOp well enough to confirm that. Specifically, pendingOp is a local variable inside the mounted callback, the same as the onScroll closure. So when pendingOp is given a value by another closure, including one that had been assigned to pendingOp at some stage, it is beyond my command of JavaScript variable scope rules to say with absolute certainty whether the original variable declared in the mounted callback is being modified or some other automatic variable. That might be what breaks it, or just as well what allows it to work when it does. Since I use my version of pendingOp not only in the context of the mounted callback but also in other callbacks like changed, and in closures that live outside the scope of the mounted callback, I use a pendingOp that’s a property of the hook object (this), and I use bind on the event handler function to ensure this-consistency, so that I always address the same variable. I can deduce from the observation that the buggy behaviour survives that rather significant change that, though I cannot explain what scope rules would make it so, in the stock code there is also just one variable named pendingOp which is seen and modified from several different places.
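On the scope question: JavaScript closures capture the variable binding itself, not a copy of its value, so every function defined inside one call to `mounted` reads and writes the same `pendingOp`, including a function that was itself stored in `pendingOp`. The sketch below is a stripped-down illustration of that rule, not the LiveView source; the `log` array and `hasPending` helper exist only to make the behaviour observable.

```javascript
function mounted() {
  let pendingOp = null; // exactly one binding per call to mounted()
  const log = [];

  function onScroll() {
    if (pendingOp) {
      log.push("run pending");
      pendingOp();
      return;
    }
    log.push("schedule");
    // This arrow function is later called through pendingOp, yet the
    // assignment inside it still targets the binding declared above.
    pendingOp = () => {
      log.push("clear");
      pendingOp = null;
    };
  }

  return { onScroll, log, hasPending: () => pendingOp !== null };
}
```

Calling `mounted()` a second time creates a separate, independent `pendingOp`; within one call there is only ever the one.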

On mobile or I’d find you the link to the previous forum thread, but it’s worth asking if you’re using a table/thead?

(Edit: found it, My JS hook stopped working after i added phx-viewport-top - #3 by felix-starman)

It does not have a height so the phx-viewport-* attributes will always fail in fun ways.

I’m sorry if you covered this earlier, but one option is an element at the bottom of your page with initial CSS that pushes it outside of the initial viewport (e.g. a top margin of 100vh or something, I don’t remember) and then using IntersectionObserver. I think in another thread I posted a working example. I don’t know if phx-viewport-* would rely on streams, but I wouldn’t imagine it would? IntersectionObserver should get you basically the same thing though, and then you get to control it and can tweak it so you don’t get a flood of callbacks on scroll.
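For reference, a sentinel watched with IntersectionObserver might be set up along these lines. This is a sketch, not the example from the other thread: `watchSentinel` and its `options` are hypothetical, and the `Observer` option exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: call onVisible whenever the sentinel element scrolls
// into view. Returns a function that stops observing.
function watchSentinel(sentinel, onVisible, options = {}) {
  const Observer = options.Observer || globalThis.IntersectionObserver;
  const observer = new Observer(
    (entries) => {
      for (const entry of entries) {
        if (entry.isIntersecting) onVisible();
      }
    },
    {
      root: options.root || null, // null means the browser viewport
      rootMargin: options.rootMargin || "0px", // e.g. "200px" to fire early
    }
  );
  observer.observe(sentinel);
  return () => observer.disconnect();
}
```

In a hook, `onVisible` could push the "load next page" event, and the returned function could be called from `destroyed()`. The callback only fires when the intersection state changes, which is what avoids the flood of per-scroll callbacks.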

Though I can well imagine the fun ways for that to fail, we’ve basically ruled out everything about the phx-viewport-* ecosystem as potential causes of the elusive problem: the margins set for and detection of the first/lastChild those events attach to, and how users scrolling “past” the first item is detected as overrun or as the trigger for the next page load. But your feedback certainly does hint towards there being more than one problem lurking in the stock code around InfiniteScroll/Pagination. I recall that the first time I saw the writeup about it in the guide or some article, the author made specific reference to how easy and straightforward the chosen approach makes setting up infinite scrolling, but that might have been something of an overstatement.

Perhaps the time has come to raise an issue on GitHub so that:
a) We can get some help from the originators in setting up and assessing a test environment for infinite scroll which reflects their intent of how it was meant to be used, and which people can either replicate when they wish to make use of the facility or use to point out in what ways their use case is at odds with the opinions of infinite scroll as implemented.
b) We can test if the various problems people report with infinite scrolling are present in the reference implementation or are the consequence of missing something in their applications.
c) We can eventually identify root causes for the persistent issues, make small or wholesale improvements, do regression testing and ultimately know if and when the underlying issues have been dealt with.

The issue of course is that the errant behaviour we are (well, that I am, at least) trying to chase down here will be very difficult to replicate in Elixir’s standard testing approach, where all the events happen in exactly the sequence they’re programmed to happen, with at best fixed timings between them. In real life, users’ actions are timed and sequenced inconsistently, and the issues we’ve seen all seem to relate to unanticipated permutations of sequences and timing. The best we can hope for is to manually test until we can spot some pattern of when it fails. We can then attempt to replicate the conditions in a programmed test until we can get such a test to reliably trigger the failure. It’s going to be a massive amount of time and effort, but I suppose it could be doable. The alternative remains a meeting of the minds that created the solution and those who’ve seen the symptoms, comparing notes in order to spot the assumption which silently fails. It will involve core developers more but could solve the problem with less overall effort.

I firmly believe that, in computing and every other aspect of life as well, you can never consider a problem solved unless you’ve come to understand what the actual problem had been. Problems don’t go away by themselves, and something no longer happening for one or another reason does not mean the problem is solved, only that you can’t see it from that vantage point.

Would someone please suggest a practical way forward on this?

I’ll register an issue if I must, but I feel awkward about presenting my attempted implementation of the sample as correct. I really only did it to get the lay of the land about how infinite scroll was approached in standard LiveView and not as anything akin to a reference implementation. I abandoned it as soon as I could see what was being attempted and completely ignored the narrow band of conditions under which it worked as expected. I’m not even sure how many people have been or still are impacted by this. Few have responded to my question and I’m yet to find anyone that’s actually using the stock implementation. It seems plausible that people try it, fail to get it to work as expected and move on to implementing their own version, avoid implementing infinite scroll or find another tool in the UI space to help them get to where they need to get to.

Just a general comment. I’m not a fan of the approach to pad the first and last child of a scroll container to coerce the container into putting up scroll bars and allowing scrolling beyond its actual first or last child.

I prefer the alternative where pages are loaded with some degree of overlap which isn’t shown to the user until they scroll into the overlapping region. This is as easily achieved by using scrollIntoView as the current code is using to adjust the view following a scroll event which triggered a page update. But it has the benefit of allowing the original scroll event past the supposedly visible page to succeed in showing additional content while the loading of the next page happens in the background.

I haven’t been successful in making an implementation of the above without running into the same problem as the original code, in that unusual permutations of events and timing result in the event handling getting out of sync with the content. If I were somehow able to confirm my suspicion that this specific problem is caused by two competing concepts, one used to avoid event loops and one to thin out the usual flood of scroll events browsers produce, I’d be happy to contribute a re-implementation of InfiniteScroll as a PR, still via the phx-viewport-* bindings but based on overlapping pages rather than first/lastChild padding.

Until I understand what really causes the aberrant behaviour, and thus what’s required to circumvent the problems it causes, I’d be repeating the same mistakes expecting different results, which we know to be insane.

I think a first step could be to have a reproduction of this live in a minimal example where it can be manually triggered reasonably well.
I don’t think it’s necessary to start with an exact replication in tests.
Having a reproduction that can (easily?) be triggered manually would IMO be enough for a bug report.

I agree in principle, but you’ll have to pardon my ignorance about what that would look like in practical terms?

To clarify my uncertainty: at the start of all this I used phx.new to create a barebones project solely to experiment with native InfiniteScroll. Though I struggled to make enough sense of the snippets of code from the guide to ever be confident about having done it reasonably well, I’m willing to risk exposing my inadequacies as a Phoenix programmer and let the whole community review what I’ve done, but how? It’s not a gist, it’s a project. It’s not a simple piece of code I can copy and paste into a forum post but involves bits and pieces of many files. Exactly the issues that plague the sample as given in the guide make it not straightforward to “have a reproduction of this live in a minimal example” we can discuss. (Add to it that at least one of the frequent contributors on this topic isn’t keen on collaborating via GitHub but only via the forum.) One of my previous questions was very graciously and efficiently answered using a single-file gist implementation of a LiveView app, which is well and truly beyond my capabilities to come up with for this case, which now also would have to involve an actual database-backed schema seeded with sufficient data to create conditions for infinite scroll.

If a public repository on GitHub is the way forward, are there any samples of such minimal showcase repositories to illustrate what you’re expecting mine to look like?

Getting sucked into this issue was the furthest thing from my mind when I went to have a peek at what LiveView considered to be “the right way” to implement infinite scroll. I never attempted to perfect how I applied the sample code in any way. My project was messy and failed easily, but once I managed to see the sequence of events unfold when I kicked off one scroll event at a time I had seen what I needed to. I abandoned the project straight away, kept the code for the hook open on the side as a guide and set about implementing my own scroll handling with the altered scope I needed. In the process I did almost if not exactly everything differently to the original, not to be different but just because my needs, style and approach were different. Imagine my surprise when, after all that, my code proved to be exactly as vulnerable to events coming in faster than they get handled as the version I ended up with using the stock code and example.

My exploits to date have given me some insights into the dynamics and issues around implementing scrolling, but they haven’t made me an expert on how infinite scrolling was conceptualised in stock LiveView nor on how the facilities it offers are meant to be strung together in an app. I’m not unwilling to reveal my feeble attempt, but I’m genuinely not sure how to go about it in the context of the forum and GitHub.

Is it becoming clearer what I meant with a practical way forward?

I spent some time reading the code.

The pendingOp functions as both a lock on the event (to prevent it being requested multiple times) and a lock on the scroll position during the pageload.

When the first child hits the top/bottom of the scroll container, this is detected and an event is sent. While waiting for a reply, pendingOp is assigned a callback which resets the scroll position such that the first child is forced to stay at the top (or last child at the bottom, though that doesn’t matter as much as discussed previously).

During subsequent scroll events, the handler short-circuits on the pendingOp, executes it, and returns. This both prevents another event from being sent and effectively cancels the scroll event (locks the scroll position) until the next page is loaded by the server.
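The behaviour described in the last three paragraphs can be modelled in a few lines. This is a simplified sketch of that description, not the LiveView source: `createScrollLock`, `pushEvent` and `pinScroll` are hypothetical names, `pushEvent` is assumed to take a callback that runs when the server reply has been applied, and `pinScroll` stands in for whatever resets the scroll position.

```javascript
// Hypothetical model of a pendingOp-style lock.
function createScrollLock(pushEvent, pinScroll) {
  let pendingOp = null;

  // `atEdge` says whether the first/last child has reached the edge of the
  // scroll container for this scroll event.
  return function onScroll(atEdge) {
    if (pendingOp) {
      pendingOp(); // re-pin the scroll position while the page loads
      return; // short-circuit: no second event is sent
    }
    if (!atEdge) return;
    pendingOp = pinScroll;
    pushEvent(function onReply() {
      pendingOp = null; // page loaded: release both locks
    });
  };
}
```

In this model, every scroll event that arrives while `pendingOp` is set is consumed by the re-pin and never reaches the edge check.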

Now I remember why I originally wrote my own infinite scroll (when I was using streams): I don’t like that it locks up the scroll position, I want it to move freely during the pageload. And I use some extra padding in the math to ensure that the page is loaded before the scroll hits the last item so that it stays smooth. Honestly I’m not really sure why the LiveView hook doesn’t do that.

Anyway, I’m still not sure of the reason for your bug, but I see how the hook works now. As I mentioned before, I also observed buggy behavior with the hook, though I wasn’t using streams, so I thought that was why. I’ll have to do some testing later, maybe it’s just broken!

My own hook works smoothly though, as I said, so it’s definitely possible to do this correctly with LiveView.