I had a discussion about this on IRC, and it seems it is often better not to join manually and to just preload the data (letting Ecto run multiple queries).
This would issue one request to the DB:
    post_query =
      from p in Post,
        where: p.id == ^id,
        join: c in assoc(p, :comments),
        join: u in assoc(c, :user),
        preload: [comments: {c, user: u}]
But it would return a lot of redundant data, because the join causes the Post data to be sent again for each comment (that is how SQL joins work). Ecto hides this, though, by discarding the duplicates.
You can see this if you run something like the query I wrote before:
    post_query =
      from p in Post,
        where: p.id == ^id,
        join: c in assoc(p, :comments),
        select: %{id: p.id, comments: c}
It will return one map for each Comment, so if a Post has two comments you get:
    [
      %{id: 1, comments: %{id: 123, text: "foo"}},
      %{id: 1, comments: %{id: 124, text: "Next comment"}}
    ]
You don’t see this behaviour in the first query, which uses preload instead of a select, because Ecto takes care of mapping the rows back onto a single Post.
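To make that mapping concrete, here is a minimal sketch in plain Elixir (no Ecto involved) of how flat join rows like the two above can be folded back into one post with a list of comments. The module name `JoinRows` and the `rows` data are made up for illustration; this is not how Ecto is implemented internally, only the general idea.

```elixir
defmodule JoinRows do
  # Hypothetical helper: folds flat join rows (one per comment) into
  # one map per post, collecting that post's comments into a list.
  def nest(rows) do
    rows
    |> Enum.group_by(& &1.id, & &1.comments)
    |> Enum.map(fn {post_id, comments} -> %{id: post_id, comments: comments} end)
  end
end

# The two rows from the example above: the post id is repeated in each.
rows = [
  %{id: 1, comments: %{id: 123, text: "foo"}},
  %{id: 1, comments: %{id: 124, text: "Next comment"}}
]

nested = JoinRows.nest(rows)
IO.inspect(nested)
```

The repeated post id collapses into a single entry, which is roughly the shape you get back when the join is combined with a preload.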
When just preloading:
    post_query =
      from p in Post,
        where: p.id == ^id,
        preload: [comments: [:user]]
This would issue three requests (one each for the post, its comments, and the comments' users) but return no redundant data, so in many cases it is actually more efficient (Ecto even does the requests in parallel when possible).
For example, if the same user has made all the comments, the user would only be fetched once. When joining, the user would be returned for each comment.
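A rough way to see the difference, again in plain Elixir with made-up in-memory data (the `comments` and `users` values are hypothetical stand-ins for tables, not Ecto calls): a join carries one user per comment row, while the preload approach collects the distinct user ids first and looks each one up once.

```elixir
# Hypothetical in-memory "tables": three comments, all by the same user.
comments = [
  %{id: 123, post_id: 1, user_id: 7},
  %{id: 124, post_id: 1, user_id: 7},
  %{id: 125, post_id: 1, user_id: 7}
]

users = %{7 => %{id: 7, name: "alice"}}

# Join style: every result row carries its own copy of the user.
join_rows =
  Enum.map(comments, fn c -> %{comment: c, user: Map.fetch!(users, c.user_id)} end)

# Preload style: collect the distinct user ids, then fetch each user once
# and attach it to the comments afterwards.
user_ids = comments |> Enum.map(& &1.user_id) |> Enum.uniq()
fetched = Map.take(users, user_ids)

preloaded =
  Enum.map(comments, fn c -> Map.put(c, :user, Map.fetch!(fetched, c.user_id)) end)

IO.puts("users sent with join: #{length(join_rows)}")
IO.puts("users fetched with preload: #{map_size(fetched)}")
```

With three comments by one user, the join-style result contains the user three times, while the preload-style lookup touches that user only once.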
So, unless you know you have a bottleneck here, the manual join may not be worth it…
(I’m quite new to Ecto, so if someone sees something wrong here, please let me know!)