Aggregate fields in Absinthe

Given the following Ecto schemas…

defmodule User do
  use Ecto.Schema
  schema "users" do
    has_many :reviews, Review
  end
end

defmodule Review do
  use Ecto.Schema
  schema "reviews" do
    field :score, :integer
    belongs_to :user, User
  end
end

What is the best way to efficiently serve GraphQL queries like:

{
  reviews(ids: [1, 2, 3]) {
    id
    user {
      reviews {
        count
        avg
        min
        max
      }
    }
  }
}

Notice that the root objects are a list of reviews, then it goes through user, then back through reviews. Ugh.

Is this even possible? To have aggregate fields like this?

Should I make a custom Dataloader? Trying to prevent a bunch of n+1 queries.

Thanks for the help.


You can use Absinthe.Middleware.Dataloader if you want. It is covered in chapter 9 of the Absinthe book.

I don’t want to spoil the book, but it can help solve n+1 queries.
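For plain Ecto associations, the standard setup looks roughly like this — a sketch, assuming a `MyApp.Repo` repo module, following the usual Absinthe + Dataloader wiring:

```elixir
# Rough sketch of the standard Dataloader setup in an Absinthe schema module.
# MyApp.Repo is an assumed repo module name.
def context(ctx) do
  loader =
    Dataloader.new()
    |> Dataloader.add_source(:db, Dataloader.Ecto.new(MyApp.Repo))

  Map.put(ctx, :loader, loader)
end

# Register the Dataloader middleware so batched loads are resolved.
def plugins do
  [Absinthe.Middleware.Dataloader] ++ Absinthe.Plugin.defaults()
end
```

With that in place, association fields can use `resolve: dataloader(:db)` and get batched automatically.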


But count, avg, min, max are not columns on the reviews table. They are determined by SQL with a GROUP BY clause and aggregate functions…

Each attribute can have a specific resolver, like likes, which could come from another service.

I’m not understanding how to share data across resolvers.

For example, this SQL can get the data for all 4 attributes, for all the given users in a single call:

  SELECT user_id, count(*), min(score), max(score), avg(score)
    FROM reviews
   WHERE user_id IN (1, 2, 3, 4, 5)
GROUP BY user_id

So how do I run that SQL once, then use it to populate:

review[0].user.reviews.{count,min,max,avg}
review[1].user.reviews.{count,min,max,avg}
review[2].user.reviews.{count,min,max,avg}
...
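For reference, here is a hedged sketch of that same GROUP BY aggregate as a single Ecto query, using the `Review` schema from the question; `Repo` is an assumed repo module:

```elixir
# Sketch: one query for all four aggregates, returned as a map keyed
# by user_id so each user's stats can be looked up cheaply.
import Ecto.Query

def review_stats(user_ids) do
  Review
  |> where([r], r.user_id in ^user_ids)
  |> group_by([r], r.user_id)
  |> select([r], {r.user_id, %{
       count: count(r.id),
       min: min(r.score),
       max: max(r.score),
       avg: avg(r.score)
     }})
  |> Repo.all()
  |> Map.new()
end
```

The remaining question is where to call this once per request, which is exactly what a Dataloader batch function gives you.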

Thanks again!

Can you show your review schema part? That might help…

In my case, I use a schema, which points to a resolver, which points to a context.

In the resolver, I build a Relay Connection from a context query.

I define something like this…

  ...
  alias Absinthe.Relay.Connection
  
  def player_games(_, args, %{source: player}) do
    Gaming.list_games_query(player, args)
    |> Connection.from_query(&Gaming.process_repo/1, args)
  end
  
  def list_games(_, args, _) do
    Gaming.list_games_query(args)
    |> Connection.from_query(&Gaming.process_repo/1, args)
  end

In your terms, that would be in the reviews resolver…

user_reviews
list_reviews

Those resolvers use context functions to load data.

How do you load your data?

This is a custom example…

I can use a specific query to forge a virtual attribute.

    # has_many
    @desc "List of games"
    connection field :games, node_type: :game do
      arg :order, type: :sort_order, default_value: :desc
      resolve &GameResolver.player_games/3
    end

In your case, I would probably create a review info object, with avg, count, etc. as fields. Then I would link this object to a specific query. Just a rough idea…
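That rough idea might look something like this — the object and resolver names (`:review_info`, `ReviewResolver.user_review_info/3`) are hypothetical:

```elixir
# Sketch: a dedicated object holding the aggregates, hung off :user.
object :review_info do
  field :count, :integer
  field :avg, :float
  field :min, :integer
  field :max, :integer
end

object :user do
  field :reviews, :review_info do
    # One resolver fills all four fields from a single grouped query.
    resolve &ReviewResolver.user_review_info/3
  end
end
```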

I ended up using Dataloader.KV and it worked great.

Here’s my query:

{
  applications(ids: [1, 2]) {
    id
    review {
      total_count
      assigned_count
      submitted_count
    }
  }
}

Object schemas:

defmodule Document.Schema.Application do
  use Absinthe.Schema.Notation

  import Absinthe.Resolution.Helpers, only: [dataloader: 1]

  object :application do
    field :id, :id
    field :review, :review, resolve: dataloader(:review)
  end
end

defmodule Document.Schema.Review do
  use Absinthe.Schema.Notation

  object :review do
    field :total_count,     :integer
    field :assigned_count,  :integer
    field :submitted_count, :integer
  end
end

And dataloader:

defmodule Document.Dataloader.Review do

  def new do
    Dataloader.KV.new(&load/2)
  end

  def load(_batch_key, applications) do
    user_ids = Enum.map(applications, &(&1.user_id))

    total_counts = total_counts(user_ids)
    assigned_counts = started_counts(user_ids)
    submitted_counts = submitted_counts(user_ids)

    Enum.reduce applications, %{}, fn application, acc ->
      Map.put acc, application, %{
        total_count: total_counts[application.user_id] || 0,
        assigned_count: assigned_counts[application.user_id] || 0,
        submitted_count: submitted_counts[application.user_id] || 0
      }
    end
  end

  defp total_counts(user_ids) do
    import Ecto.Query

    Schema.Reviewer.Review
      |> select([r], {r.reviewer_user_id, count(r.id)})
      |> where([r], r.reviewer_user_id in ^user_ids)
      |> group_by([r], r.reviewer_user_id)
      |> Datastores.Shard.all
      |> Map.new
  end

  defp started_counts(user_ids) do
    import Ecto.Query

    Schema.Reviewer.Review
      |> select([r], {r.reviewer_user_id, count(r.id)})
      |> where([r], r.reviewer_user_id in ^user_ids)
      |> where([r], r.state == 2)
      |> group_by([r], r.reviewer_user_id)
      |> Datastores.Shard.all
      |> Map.new
  end

  defp submitted_counts(user_ids) do
    import Ecto.Query

    Schema.Reviewer.Review
      |> select([r], {r.reviewer_user_id, count(r.id)})
      |> join(:inner, [r], a in Schema.Application, on: r.reviewer_application_id == a.id)
      |> join(:inner, [r, a], c in Schema.Category, on: a.category_id == c.id)
      |> where([r, a, c], r.reviewer_user_id in ^user_ids)
      |> where([r, a, c], c.state == "submitted")
      |> group_by([r], r.reviewer_user_id)
      |> Datastores.Shard.all
      |> Map.new
  end

end
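For completeness, here is a hedged sketch of how the KV source above might be wired into the schema so that `dataloader(:review)` in the `:application` object can find it. The module names follow the posts above; the exact `context/1` and `plugins/0` setup is an assumption based on the standard Absinthe pattern:

```elixir
defmodule Document.Schema do
  use Absinthe.Schema

  import_types Document.Schema.Application
  import_types Document.Schema.Review

  # Register the KV source under the :review key used by dataloader(:review).
  def context(ctx) do
    loader =
      Dataloader.new()
      |> Dataloader.add_source(:review, Document.Dataloader.Review.new())

    Map.put(ctx, :loader, loader)
  end

  def plugins do
    [Absinthe.Middleware.Dataloader] ++ Absinthe.Plugin.defaults()
  end

  query do
    # applications(ids: ...) root field goes here
  end
end
```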

Works great, no n+1 queries!

P.S. A little different from the distilled down example from previous posts, but this is actual working code!
