Ecto preloading associations for my subquery

Hi I am trying to preload associations for a subquery like such:

from(p in Post, distinct: [p.group_id], order_by: [asc: p.published_date], preload: [:comments])
|> subquery()
|> limit(some_limit)
|> order_by([q], asc: q.title)

but Ecto is throwing a ** (Ecto.QueryError) cannot preload associations in subquery in query.

Is there another way to get around this error? My use case specifically require the use of subquery because I am using Postgres DISTINCT ON clause and Postgres dictates that the ORDER BY must then follow the columns specified in the DISTINCT ON clause.

To shed more light on the use case, imagine I have a table (Post) with a column called group_id (and many Posts will share the same group_id), and my query will returns only 1 Post per group_id where the Post with the earliest published date among the group will be returned. This is easily achievable by distinct: [p.group_id], order_by: [asc: p.published_date] but for the full result sets I actually want them sorted by Post’s title, hence the use of subquery but using subquery somehow eliminate the possibility of preloads. I know there are other ways (that are more SQL-oriented as DISTINCT ON is a Postgres feature like window function) but I find DISTINCT ON the simplest to reason with.

Hence my question is, is there a way to still use preload for subquery?

1 Like

Let’s not get fixated on how you figure this must be solved for now.

Can you give an example of what must the result set look like exactly?

1 Like

Hi @steven7 I’ve come across the exact same challenge and wondering what solution or workaround you ended up finding?

Did you try putting the preload on the outer query instead of the sub query?

I will try that, thanks

Hello guys. Is it possible to use named bindings from inner query when I need to preload in parent query?
For me it fail with error like: ** (Ecto.QueryError) unknown bind name ':foo' in query:

Can you give an example of query you are running, to get more context?

@jmnda Something like that:

Post
|> join(:inner, [p], a in assoc(p, :author), as: :author)
|> where([author: author], author.reputation > 100)
|> select([p], %{p | rn: row_number() |> over(order_by: p.id)})
|> subquery()
|> select([p], %{p | comments_count_total: sum(p.comments_count) |> over(order_by: a.rn)})
|> where([p], p.rn > 10)
|> subquery()
|> where([p], p.comments_count_total < 1000)
|> preload([author: author], author: author)

It tells me : ** (Ecto.QueryError) unknown bind name ':author' in query:
I don’t want to join authors again, can I reuse them somehow from subquery?

Yeah because of |> subquery(). The query passed into preload has no :author binding. When using subquery all the returned columns become part of the binding you use subquery for.

Subqueries can refer to a named parent using parent_as: Ecto.Query.API — Ecto v3.9.5

Example from docs:

child_query = from c in Comment, where: parent_as(:posts).id == c.post_id
from p in Post, as: :posts, inner_lateral_join: c in subquery(child_query)
1 Like

I don’t think that parent_as is suitable for what I want to achieve because it is used when we need to access binding from parent query in subquery, but I need to do vice versa - access binding in the parent query. Or do I miss something?

Does it mean that I don’t have other choice except joining with authors table again or just preload without using named binding (e.g separate query)?

I’m not sure how ecto does does map columns to fields when preloading. You could try using the binding of the subquery to preload (essentially [x] in your case). It should hold the necessary columns still.