How do I solve pagination (Repo.aggregate(:count)) on query with group_by in the best way?

la_219 · April 3, 2023, 3:20pm

Hello all

Disclaimer: Since this is a work thing, I changed the name of the schemas. The structure is the same.

I’m currently working on a dashboard that shows me all the projects I have. Each project can have x tasks. I work with Ecto.Repo and have an association of has_many in the project schema to map to the tasks. The task schema has the field belongs_to project. That works well, I can add tasks and read them and everything.

On my dashboard I want to have an overview of all my active projects and the count of tasks. So basically a list with

my first project (5 tasks)
my second project (2 tasks)
…

So far so good. My code looks like this

      Project
      |> select([p, t], %{
        id: p.id,
        title: p.title,
        state: p.state,
        task_count: count(t.id),
      })
      |> join(:left, [p], t in assoc(p, :tasks))
      |> where(^filter_by(filter))
      |> group_by([p], p.id)
      |> order_by(asc: :title)
      |> Repo.all()

The filter_by is a dynamic function that allows me to filter by term, state (and more to come). That works so I will skip the code for that. I can include it, if you feel it is necessary.

Now I want to paginate everything. So far, I’ve written a generic module to do that for me (because I will need it all the time for other stuff). So in the above code, I want to use that instead of Repo.all(). The call looks like this

[... all the query stuff from above except for the Repo.all() ]
|> Pagination.page(%{number: page_number, limit: page_limit})

And my module looks like this:

defmodule Infrastrucutre.Repo.Pagination do
  import Ecto.Query

def page(query, %{number: number, limit: page_limit}) do
    query
    |> limit([r], ^page_limit)
    |> offset([r], ^((number - 1) * page_limit))

    %{
      page: Repo.all(query),
      stats: %{
        count: count(query, %{page_limit: page_limit}),
        number: number
      }
    }
  end

  defp count(query, %{page_limit: page_limit}) do
    ceil(Repo.aggregate(query, :count) / page_limit)
  end
end

The alert reader will probably already see the problem. While this module worked fine for me so far, I am now for the first time trying to aggregate a query that already has an aggregation (= count(t.id)) in it. And that does not work at all because I can’t aggregate an aggregation. The excat problem being

** (Ecto.QueryError) cannot aggregate on query with group_by in query:

To solve this, I can have two queries in my repository, one for all my projects that I hand to the method count (I would have to make this public then) that gives me the count. And one for all my projects including the task count that I return in the pagination as “page”. Then I would prepare my pagination in my repository and return that.

Or I can have two methods in pagination. One with the limit and offset and one without where I can use Repo.all on and then get the length of all the entries without the limit and offset.

Both do not feel like clean code. So I am wondering if there is another way to do it that I haven’t thought of yet. What’s the most elixir way to do it?

Thanks for your time and input! Have a great day

JohnnyCurran · April 3, 2023, 3:46pm

Could you use exclude? Ecto.Query — Ecto v3.9.5

This will let you reset the :group_by in your query (and other fields) and you can then hopefully get the count

I have only just skimmed your post and this might not be applicable to your situation. You will still need to make two queries but maybe not two functions. You could maybe do:

{with_count, without_count} = {Repo.all(exclude(query, :group_by)), Repo.all(query)}

Or something like that

la_219 · April 3, 2023, 4:13pm

Oh wow, I did not know about that function. Thank you very much, it worked instantly!

kayvon · September 25, 2024, 7:03pm

I ran across this issue as well and decided to wrap my query in a subquery to be able to get the count. That seemed to work as well.