Order By Association Count within a Dynamically Built Query in Phoenix / Ecto

Hello! I need to update a dynamically built query such that items with photos order before items without photos. A Photo belongs to an Item.

What I would do in SQL is left join on the association table (“photos”), select a named count via COUNT(photos.id) AS photos_count, and then either order by it directly, or order by a case statement where photos_count > 0 (or IS NOT NULL …).

I am having a very difficult time expressing this with Ecto. I need to define this in a small helper function that is piped into an existing list of query building functions, so I dont have much control over how tables are aliased. Furthermore, all of my efforts to define a named COUNT result in an error like:

undefined function escape_count/0

Ive tried via: query |> select_merge([item, ..., photo], %{photos_count: count(photo.id)}

Or: `query |> select_merge([item, …, photo], fragment(“COUNT(?) AS photos_count”, photo.id))

Or (guessing here) select_merge([item, ..., photo], %{has_photos: fragment("1 CASE (?) IS NULL THEN 0 ELSE 1", photo.id)})

I couldnt find a way of doing this directly in the order_by clause, though that feels unintuitive anyway.

I would just do this in raw SQL, but being part of a dynamically built query, I cant control the table names and such. I also cant do an in-memory sort with elixir code, since other parts of the query control pagination, etc.

How can I append to an existing query, some clauses that will allow me to order by the presence of associations?

Not sure if this will help in your particular situation but starting simply with this raw SQL:

SELECT COUNT(t.id), a.title FROM albums AS a
  JOIN tracks AS t ON a.id = t.album_id
  GROUP BY a.id
  ORDER BY COUNT(t.id) DESC;

which can be expressed in keywords form as

query = from(a in Album, [
  join: t in Track, on: a.id == t.album_id,
  group_by: a.id,
  select: [count(t.id), a.title],
  order_by: [desc: count(t.id)]
])
iex(1)> alias MusicDB.{Repo,Album,Track}
[MusicDB.Repo, MusicDB.Album, MusicDB.Track]
iex(2)> import Ecto.Query
Ecto.Query
iex(3)> query = from(a in Album, [
...(3)>   join: t in Track, on: a.id == t.album_id,
...(3)>   group_by: a.id,
...(3)>   select: [count(t.id), a.title],
...(3)>   order_by: [desc: count(t.id)]
...(3)> ])
#Ecto.Query<from a in MusicDB.Album, join: t in MusicDB.Track,
 on: a.id == t.album_id, group_by: [a.id], order_by: [desc: count(t.id)],
 select: [count(t.id), a.title]>
iex(4)> rows = Repo.all(query)

21:54:28.302 [debug] QUERY OK source="albums" db=2.4ms
SELECT count(t1."id"), a0."title" FROM "albums" AS a0 INNER JOIN "tracks" AS t1 ON a0."id" = t1."album_id" GROUP BY a0."id" ORDER BY count(t1."id") DESC []
[
  [10, "You Must Believe In Spring"],
  [9, "Portrait In Jazz"],
  [5, "Cookin' At The Plugged Nickel"],
  [5, "Kind Of Blue"],
  [4, "Live At Montreaux"]
]
iex(5)> 

or in expression form as

from(a in Album)
|> join(:inner, [a], t in Track, a.id == t.album_id)
|> group_by([a,t], a.id)
|> select([a,t], [count(t.id), a.title])
|> order_by([a,t], desc: count(t.id))
iex(1)> alias MusicDB.{Repo,Album,Track}
[MusicDB.Repo, MusicDB.Album, MusicDB.Track]
iex(2)> import Ecto.Query
Ecto.Query
iex(3)> make_query =
...(3)>   fn ->
...(3)>     from(a in Album)
...(3)>     |> join(:inner, [a], t in Track, a.id == t.album_id)
...(3)>     |> group_by([a,t], a.id)
...(3)>     |> select([a,t], [count(t.id), a.title])
...(3)>     |> order_by([a,t], desc: count(t.id))
...(3)>   end
#Function<20.127694169/0 in :erl_eval.expr/5>
iex(4)> rows = Repo.all(make_query.())

21:56:14.978 [debug] QUERY OK source="albums" db=2.6ms
SELECT count(t1."id"), a0."title" FROM "albums" AS a0 INNER JOIN "tracks" AS t1 ON a0."id" = t1."album_id" GROUP BY a0."id" ORDER BY count(t1."id") DESC []
[
  [10, "You Must Believe In Spring"],
  [9, "Portrait In Jazz"],
  [5, "Cookin' At The Plugged Nickel"],
  [5, "Kind Of Blue"],
  [4, "Live At Montreaux"]
]
iex(5)> 

Whether or not you can successfully compose such a query in your particular situation depends entirely on the details …

3 Likes

Thanks for the reply. I am very clear on how to do this in raw SQL, and I am familiar with the keyword based approach. Ill see if there is a way I can perhaps merge a new keyword query into an existing Ecto query.

What I cant seem to figure out is:

  • how to name a column that is the result of an aggregate COUNT (e.g, COUNT(id) AS some_count)
  • how to reference that named count column in an order-by clause

There may simply be no such functionality with Ecto, and I may need to approach this by building an entirely new query and merging it into a dynamically built query.

The standard approach, even in SQL, is to simply repeat the COUNT expression in the ORDER BY as I have already shown:

 ORDER BY COUNT(t.id) DESC;
|> order_by([a,t], desc: count(t.id))

The RDBMS engine is supposed to correlate that expression to the one in the SELECT list without evaluating it twice - so there should be no need to alias the column.

Adding the GROUP BY is the real trick - but it may not be possible to adapt your existing query to tolerating the GROUP BY.

2 Likes

Yes I had just arrived at this realization, and was about to update the thread accordingly -

I was missing the bigger problem by assuming the “aggregate column” had to be named so that it could be matched by the query builder. While this did uncover an additional problem (discussed in my original post), I dont need to do this at all, as you demonstrated.

The real problem is handling the GROUP BY clause, which, given that this is a dynamically built query, might not be possible. At least, Id have to make this function, which produces the query fragment in question, aware of every column selected by the time it is reached.

[edit]

I suppose I dont have the required permission/role for editing a thread or thread title after a certain time period.

More accurately, this thread is about “Order by Association Count within a Dynamically Built Query in Phoenix / Ecto”, and the core of the problem is with forming the GROUP BY clause when COUNT is used in ORDER BY.

In the simplest case it is possible as Ecto queries are composable. Example:

alias MusicDB.{Repo,Album,Track}
import Ecto.Query
base_query = fn ->
  from(a in Album)
  |> select([a], %{title: a.title})
end
count_query = fn q ->
  q
  |> join(:inner, [q], t in Track, q.id == t.album_id)
  |> group_by([q,t], q.id)
  |> select_merge([q,t], %{track_count: count(t.id)})
  |> order_by([q,t], desc: count(t.id))
end
query = base_query.()
albums = Repo.all(query)
query_with_counts = count_query.(query)
albums_with_counts = Repo.all(query_with_counts)
iex(5)> query = base_query.()
#Ecto.Query<from a in MusicDB.Album, select: %{title: a.title}>
iex(6)> albums = Repo.all(query)

15:53:02.602 [debug] QUERY OK source="albums" db=1.4ms queue=0.1ms
SELECT a0."title" FROM "albums" AS a0 []
[
  %{title: "Kind Of Blue"},
  %{title: "Cookin' At The Plugged Nickel"},
  %{title: "You Must Believe In Spring"},
  %{title: "Portrait In Jazz"},
  %{title: "Live At Montreaux"}
]
iex(7)> query_with_counts = count_query.(query)
#Ecto.Query<from a in MusicDB.Album, join: t in MusicDB.Track,
 on: a.id == t.album_id, group_by: [a.id], order_by: [desc: count(t.id)],
 select: %{title: a.title, track_count: count(t.id)}>
iex(8)> albums_with_counts = Repo.all(query_with_counts)

15:53:02.611 [debug] QUERY OK source="albums" db=2.4ms
SELECT a0."title", count(t1."id") FROM "albums" AS a0 INNER JOIN "tracks" AS t1 ON a0."id" = t1."album_id" GROUP BY a0."id" ORDER BY count(t1."id") DESC []
[
  %{title: "You Must Believe In Spring", track_count: 10},
  %{title: "Portrait In Jazz", track_count: 9},
  %{title: "Cookin' At The Plugged Nickel", track_count: 5},
  %{title: "Kind Of Blue", track_count: 5},
  %{title: "Live At Montreaux", track_count: 4}
]
iex(9)>

In your case, the manner in which you are composing the query may be getting in the way at this point. So you may have to restructure your query composition flow to somehow make the GROUP BY fit when it is needed.

2 Likes

Yeap, I think you are correct. The problem is not with Ecto lacking any such functionality, but rather a problem any ORM exhibits when dynamically building queries. Looks like the system in which my problem case is exhibited needs redesigned - this is simply the first aggregate query introduced, and thus the first time its showing its limitation.

Refactoring seems to be the answer, plus I may need to upgrade Ecto to get newer features to name/alias JOIN statements (so I can later reference each table in a GROUP BY).