Hello everyone, I was working on a project and was faced with this problem and I’d like some input from you on the subject. Imagine this scenario:
- Posts can be published and reviewed.
- A Post has many Authors associated with it and vice-versa.
- Authors can review Posts (identified by the
reviewer
column in the association). - Post(s) can be listed with its authors which must have the virtual
reviewer
flag filled.
schema "posts" do
field :title, :string
field :body, :string
many_to_many :authors, Author, join_through: AuthorPost, on_replace: :delete)
end
schema "authors" do
field :name, :string
field :email, :string
many_to_many :posts, Post, join_through: AuthorPost, on_replace: :delete)
field :reviewer, virtual: true
end
@primary_key false
schema "authors_posts" do
field :author_id, :integer, primary_key: true
field :post_id, :integer, primary_key: true
field :reviewer, :boolean
end
The way I see there are three ways that we could approach this at first:
- Use a preload-join to load authors associated with posts in a single query. This way I can’t fill the
reviewer
field. Also, I’m aware that it may not be the best option to do a preload-join forhas_many
andmany_to_many
associations as explained here:
- https://twitter.com/michalmuskala/status/1118119934758928385
- Advantages of Ecto preloads with/without joins
- https://www.amberbit.com/blog/2019/4/16/composing-ecto-queries-filters-and-preloads
Repo.all(
from p in Post,
join: a in assoc(p, :authors),
preload: [authors: a]
)
- Use a “simple” preload without a join, to load authors associated with posts issuing two queries. This suffers the same problem as the first option because I can’t customize how it’s being fetched from the database. It seems though that the query Ecto outputs for this is a little bit more optimized than whatever I’m doing right now:
Repo.all(
from p in Post,
preload: [:authors]
)
SELECT a0.*, p1."id"
FROM "authors" AS a0
INNER JOIN "posts" AS p1
ON p1."id" = ANY($1)
INNER JOIN "authors_posts" AS a2
ON a2."post_id" = p1."id"
WHERE (a2."author_id" = a0."id")
ORDER BY p1."id" [[1, 2, 3, 4, 5, 6, 7, 8, 9]]
- Use a custom preload query, to load authors associated with posts issuing two queries. This allows me to better configure how I want to return the Authors, but the query generated seems a little bit too complex if you consider the second option:
query =
from a in Author,
join: ap in AuthorPost,
on: a.id == ap.author_id,
select_merge: %{reviewer: ap.reviewer}
Repo.all(
from p in Post,
preload: [authors: ^query]
)
SELECT a0.*, p2."id"
FROM "authors" AS a0
INNER JOIN "authors_posts" AS a1
ON a0."id" = a1."author_id"
INNER JOIN "posts" AS p2
ON p2."id" = ANY($1)
INNER JOIN "authors_posts" AS a3
ON a3."post_id" = p2."id"
WHERE (a3."author_id" = a0."id")
ORDER BY p2."id" [[1, 2, 3, 4, 5, 6, 7, 8, 9]]
So, how would you approach doing this, considering that you must retrieve the records through a Post and not loading the associations directly from the database? I figured that this might be a very common use case, in the end, I went with this last option, but I’d like to explore how to better use Ecto for this purpose.
Some things that I’ve learned fiddling with Ecto SQL this week:
- It seems that Ecto does not support preloading from a subquery (bummer)
- It seems that Ecto does not let you customize your join queries (only
where
's are aloud) - Sometimes it may be better to do two queries to preload instead of a preload-join (not so clear in my mind yet tbh)