How do you preload multiple (non nested) associations?

Let’s A, B and C be three schemas with A having one-to-many assocs :bs and :cs towards B and C (A has many Bs and Cs, and B and C belongs to one A).

What do you use to do to preload both assocs?

First option might be:

query = from(a in A, preload: [:bs, :cs])

This code will generate 3 SQL queries, efficiency depends on the number of A rows and the length of the query strings depends also on the number of A rows.

Second option might be

query = from(a in A, left_join: b in assoc(a, :bs), left_join: c in assoc(a, :cs), preload: [bs: b, cs: c])

I this case we get just one SQL query, but the efficiency will depends on the product of B rows x C rows since the SQL query contains two left joins that are unrelated (efficiency has nothing to with indexes):

SELECT a..., b..., c... FROM "as" AS a LEFT JOIN "bs" AS b ON b.a_id = a.id LEFT JOIN "cs" AS c ON c.a_id = a.id

I think none of the options are acceptable so I am fetching entries for two queries (introducing the same :order_by) and then I am merging the preloads with a function add_preload (implemented with Enum.zip):

q1 = from(a in A, left_join: b in assoc(a, :bs), preload: [bs: b])
q2 = from(a in A, left_join: c in assoc(a, :cs), preload: [cs: c])
Repo.all(q1) |> add_preload(Repo.all(q2), :cs)

I wrote a macro that takes a query (from a in A) and the names of the assocs and expand the code to the previous pattern but the problem with this approach is that it is not very composable with other preloads (nested ones in particular).

What do you use to do in these cases? Am I missing anything magical in Ecto?

You can also preload by using Repo.preload/3, which is how you usually preload relations:

Repo.all(MySchema)
|> Repo.preload([:bs, :cs])
1 Like

Yes, but that generates the same 3 SQL queries than the first option with the preload in the query.

Preloads are designed to work this way, if your aim is to do a single query, then the assoc option you posted is the way to go.

If we are talking about performance, then that is another question entirely. Generally speaking you should not care whether you use joins or preloads, unless we are talking about huge databases, and even then this goes case to case depending on the structure of the tables and amount of data.

but what is the problem with multiple queries?
as a rule of thumb, if you dont have filterning to do, is better to get associations in different queries. joins are prone to nested loops that can lead to a single bad performing query. multiple queries has the advantage of:

  • avoid nested loops
  • split queries into several very small transactions
1 Like

I have no problem with running multiple queries but I would like to write less :slight_smile:

personally i’d worry less about it. keep in mind that this is not a N+1 problem. since what ecto does is 1 query with all ids it needs to fetch.

other than that, going to your initial post…

i’d avoid those type of solutions. the macro one for the reason that you said, it won’t compose very well. and the first one because it’s actually worse than the preload. you’re doing 2 queries but traversing as twice, that is worse then preloading… while preloading do 3 queries, it only traverse as once.

1 Like

I think one important concern here is the cardinality of the relationship. A one-to-many relationship has different concerns than a one-to-one relationship. Just today I used a “join” plus “preload” in the query for a one-to-one relationship in order to avoid a 2nd database hit. But I wouldn’t do that for a one-to-many (unless the root entity had few fields and I knew there would be few of the child entity).

Are you familiar with preload queries? Those should get you something close to your custom add_preload approach, but use Ecto logic to do the deserialization work for you.

2 Likes

Multiplicity of the associations is important and I agree you that the a one-one association can be managed in a different form (even a many-many one must be managed differently).

I am familiar with preload queries, in fact both examples I show rely on preload queries.

Thanks for your comments!

Really? They don’t look like the example I linked to

comments_query = from c in Comment, order_by: c.published_at
Repo.all from p in Post, preload: [comments: ^comments_query]

In that example the value in the preload keyword list is a query, not a named binding.

You are right, I was a bit unobservant.

Nevertheless, the SQL queries generated by Ecto are the same generated for the first example (from(a in A, preload: [:bs, :cs])). Anyway, I’ll explore this option as a source of expressiveness. Thanks!