Getting a specific page from Scrivener Ecto total_entries

I’m wondering if scrivener_ecto has a specific way of getting the entries for a specific page after a %Scrivener.Page{} has already been loaded.

For example if I run a query:

Record 
|> my_query
|> Repo.paginate()

This will return all records. But then if I have pagination at the bottom of my page and I hit page two, I’m effectively running:

Record 
|> my_query
|> Repo.paginate(page: 2)

It seems silly to run the query twice when I already have all the records (especially if they are assigned in a conn or socket). I realize I could divide the total records by the page_size and work out the pages that way, but I’m wondering whether this is the preferred approach or if there is a better way.
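Roughly what I have in mind for the in-memory approach (page_entries/3 is just a hypothetical helper, not part of scrivener_ecto):

# Slice one page out of an already-loaded list; only sensible for small sets.
defp page_entries(entries, page, page_size) do
  Enum.slice(entries, (page - 1) * page_size, page_size)
end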

Sorry to XY you, but what are you trying to do here? The use case of loading all entries and then paging anyway is not one I’ve ever come across. I only know scrivener_ecto by name, but it seems your examples are missing some sort of per_page param. If Repo.paginate(query) returns everything yet Repo.paginate(query, page: 2) returns page 2, what is dictating how many results are returned for the latter?


I may be mistaken, but I believe Repo.paginate(my_query) should be equivalent to Repo.paginate(my_query, page: 1), returning at most page_size records with an offset of 0. You can verify this by inspecting length(page.entries) on the returned %Scrivener.Page{} struct.
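For example, a quick sanity check (reusing your hypothetical my_query):

page = Record |> my_query |> Repo.paginate()
IO.inspect(length(page.entries), label: "entries on page 1")
IO.inspect(page.total_entries, label: "total matching records")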

To avoid all this, isn’t it simpler to always pass the correct parameters to the pagination function?

validated_params = validate_params(params)

Record 
|> my_query
|> Repo.paginate(validated_params)

You would need to define a validate_params/1 function because, if we let the user choose the page_size when they request a specific page, it is better to check that the specified page_size falls within a range of reasonable values; if it exceeds a certain threshold we can simply fall back to a default. Even if we do not expose this parameter, it is prudent to force it to the default value, since users could still specify it themselves in the URL. Alternatively, we can enforce a maximum through the max_page_size option.
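A minimal sketch of such a validate_params/1 (the default and maximum values here are illustrative assumptions):

@default_page_size 15
@max_page_size 25

# Clamp user-supplied pagination params to safe values.
defp validate_params(params) do
  page =
    case Integer.parse(to_string(params["page"] || 1)) do
      {n, _} when n > 0 -> n
      _ -> 1
    end

  page_size =
    case Integer.parse(to_string(params["page_size"] || "")) do
      {n, _} when n in 1..@max_page_size -> n
      _ -> @default_page_size
    end

  [page: page, page_size: page_size]
end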

Out of curiosity, could you share your usual solution for offset-based pagination? scrivener_ecto seems to be in low-maintenance mode; the last commit dates from 2021. I know this is not necessarily a problem, but I am not too comfortable with it. I was even wondering whether to implement my own solution.

To get back to the subject, I believe the page_size parameter is the equivalent of per_page here. You can give it a default value in MyApp.Repo. See the example below.

defmodule MyApp.Repo do
  use Ecto.Repo,
    otp_app: :my_app,
    adapter: Ecto.Adapters.Postgres

  use Scrivener, page_size: 15, max_page_size: 25
end
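With that configuration, a user-supplied page_size above the maximum should be capped (the values here match the hypothetical config above):

Record
|> my_query
|> Repo.paginate(page: 1, page_size: 100)
# => %Scrivener.Page{page_size: 25, ...}, clamped by max_page_size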

That would make sense, but the OP said it is returning all results, which is why I was asking for clarification.

If you aren’t using Ash (which I am not yet either), then I say look no further than Flop and its cousin Flop Phoenix. It has a more functional API in that it returns a nested tuple instead of a struct. It took me a hot minute to get comfy with it, since I was coming from Ruby pagination libs, which are incredibly simple, but it’s worth it as it’s very powerful. Not only does it do cursor-based pagination, it does filtering and sorting too!
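A rough sketch of the shape of that API, assuming a Record schema already configured for Flop (the @derive Flop.Schema setup is omitted; see the Flop docs):

# validate_and_run casts the params and runs the query in one step.
case Flop.validate_and_run(Record, params, for: Record) do
  # records is the current page; meta holds pagination info
  {:ok, {records, meta}} -> {records, meta}
  # the params were invalid; meta carries the errors
  {:error, meta} -> {:error, meta}
end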

I swear I’m not the author, just a big fan!


Thank you for sharing your experience! At first glance, Flop looks really full-featured. I will definitely give it a try!


I think you are right in that it makes more sense to load page: 1 and then load page by page from there.

The thinking, however, was: if the record set is not too large (I’m loading between 0 and 50 records at the moment) and unlikely to grow very large, does it not make more sense to query the DB once? Get all, say, 30 records, possibly caching them, and then paginate over the entire list rather than querying the DB every time a user changes the page. That said, this is an internal system that few people are using, so querying the DB isn’t a performance hit per se; I was just wondering.

This isn’t something you would ever want to do without thinking it through carefully. For example, if you’re going to cache pages on the client then you have to worry about invalidating them. And if you’re talking about caching per request on the client, then I could turn your worry around on you and say: “What if they never navigate past page 1? Now you’ve gone and fetched too much data!” At any rate, such queries are relatively cheap and I wouldn’t give optimizing them a second thought until your server is on fire :slight_smile:

In the past I’ve used server-side caching (Varnish) for a product catalog and invalidated the whole cache any time we published new products. It was a minimal, but noticeable, difference in speed and was really just masking a poor database design that required far too many joins just to get some rows of available products.

In any event, you have an internal system so I would really not give this a second thought! One query per page is absolutely fine.


Thanks @sodapopcan. You’re right.
I suspect we could go down a sort of academic rabbit hole about why this may or may not be a worthwhile exercise in some cases, but I can’t think of any practical uses. It’s certainly not applicable to my use case and seems outside the scope of what I’m trying to do.


Just get a job working on an enterprise app and you’ll start to get jaded about query optimization :sweat_smile: The one I worked on had tens of queries per user flying around for a seemingly simple action. It would have been benefiting somewhat from Rails’ built-in query caching, of course, but lots of things were still slow. Not saying we shouldn’t strive for better, but I believe that sort of thing is rampant out there!