Leaderboards query for multiple fields

Codball · October 5, 2023, 6:08am

I’m trying to find the top 3 players for multiple different metrics. I have a query that works well for one field:

from(c in Character,
  join: ci in CharacterInfo,
  on: c.id == ci.character_id,
  limit: 3,
  select: %{
    character_name: c.name,
    player_kills: %{
      rank: rank() |> over(order_by: [desc: ci.player_kills]), 
      amount: ci.player_kills
    },
  }
)
|> Repo.all()

and this returns an array:

  [
    %{character_name: "name_1", player_kills: %{amount: 67, rank: 1}},
    %{character_name: "name_2", player_kills: %{amount: 59, rank: 2}},
    %{character_name: "name_3", player_kills: %{amount: 56, rank: 3}}
  ]

The issue is I need more than just player_kills, ie monster_kills.

How can I write this query so that it returns something like:

%{
  player_kills: [
    %{character_name: "name_1", amount: 90, rank: 1},
    %{character_name: "name_2", amount: 80, rank: 2},
    %{character_name: "name_3", amount: 70, rank: 3}
  ],
 monster_kills: [
    %{character_name: "name_7", amount: 40, rank: 1},
    %{character_name: "name_1", amount: 30, rank: 2},
    %{character_name: "name_3", amount: 20, rank: 3}
  ]
}

Codball · October 5, 2023, 7:59am

I’ve come up with something using array_agg:

player_kills =
  from(ci in CharacterInfo,
    order_by: [desc: ci.player_kills],
    limit: 3
  )

monster_kills =
  from(ci in CharacterInfo,
    order_by: [desc: ci.monster_kills],
    limit: 3
  )

from(ci in CharacterInfo,
  left_join: pk in subquery(player_kills),
  on: pk.id == ci.id,
  left_join: mk in subquery(monster_kills),
  on: mk.id == ci.id,
  select: %{
    player_kills: fragment("array_agg(?)", pk.player_kills),
    monster_kills: fragment("array_agg(?)", mk.monster_kills)
  }
)
|> Repo.all()

Which returns:

[
  %{
    monster_kills: [nil, nil, 45, 98, nil, 45],
    player_kills: [45, nil, 54, nil, 49, nil]
  }
]

Which does contain the correct records, though unordered and with the 3 nil values.

Am I approaching this correctly?
Perhaps a couple Multi.all to separate the queries and run in a single transaction would make more sense?
ie:

Ecto.Multi.new()
|> Ecto.Multi.all(:player_kills, player_kills_query)
|> Ecto.Multi.all(:monster_kills, monster_kills_query)
|> Repo.transaction()

Any advice?

ibarch · October 5, 2023, 11:54am

Perhaps a combination of union_all/2 and group_by/3 may lead you to the result you’re after.
Here’s some pseudo-code for reference:

player_kills =
  from(c in Character,
    join: ci in CharacterInfo,
    on: c.id == ci.character_id,
    select: %{
      field: "player_kills",
      character_name: c.name,
      rank: rank() |> over(order_by: [desc: ci.player_kills]),
      amount: ci.player_kills
    }
  )

monster_kills =
  from(c in Character,
    join: ci in CharacterInfo,
    on: c.id == ci.character_id,
    select: %{
      field: "monster_kills",
      character_name: c.name,
      rank: rank() |> over(order_by: [desc: ci.monster_kills]),
      amount: ci.monster_kills
    }
  )

player_kills
|> union_all(^monster_kills)
|> order_by(asc: :rank)
|> Repo.all()
|> Enum.group_by(& &1.field, &Map.delete(&1, :field))

Codball · October 5, 2023, 6:13pm

I would prefer not to have to iterate through each result after the query. Interesting idea with adding the field key though, I’ll play around with it.

Is there a way to group by the field key? group_by(:field) returns an (undefined_column) column c0.field does not exist error

ibarch · October 5, 2023, 7:17pm

Try to wrap the preceding query in a subquery/2 before applying grouping. If that doesn’t help, then maybe Ecto is unable to properly cast the dynamic :field and you need to provide it as fragment("field").

Codball · October 5, 2023, 9:08pm

I tried a couple different ways of aliasing the select key but couldn’t get it to work. I went back and wrote the raw sql to remove a layer of brain conversion and ended up with:

SELECT pk.field, pk.pos, pk.amount
FROM (
    SELECT 'player_kills' field, RANK () OVER (ORDER BY ci.player_kills DESC) pos, player_kills amount
    FROM character_infos ci
    LIMIT 3
  ) as pk

UNION ALL

(
  SELECT 'monster_kills' field, RANK () OVER (ORDER BY ci.monster_kills DESC) pos, monster_kills amount
  FROM character_infos ci
  LIMIT 3
)

UNION ALL

(
  SELECT 'any_valid_field' field, RANK () OVER (ORDER BY ci.any_valid_field DESC) pos, any_valid_field amount
  FROM character_infos ci
  LIMIT 3
)

Which gets condensed down to:

def metric_subquery(metric) do
  from(ci in CharacterInfo,
    join: c in Character,
    on: c.id == ci.character_id,
    limit: 3,
    select: %{
      field: ^Atom.to_string(metric),
      character_name: c.name,
      rank: rank() |> over(order_by: [desc: field(ci, ^metric)]),
      amount: field(ci, ^metric)
    }
  )
end

def get_leaderboards() do
  from(s in subquery(metric_subquery(:player_kills)))
  |> union_all(^metric_subquery(:monster_kills))
  |> union_all(^metric_subquery(:any_valid_field))
  |> Repo.all()
end

It’s not ideal but I’m actually ok with them not being grouped by the field name, I have to iterate through each row to display them anyways, though I’d still be interested in how that would work. I couldn’t get group by to do anything even in the raw sql version, it seemed to have no effect.

I’m assuming I would want an index on all of the fields I want to query by (ie player_kills)?