A very common task for me is to count the number of entries that match certain criterias.
And I usualy do so by doing the following:
from(o in Organization,
join: w in Workspace,
where: o.id == w.org_id)
|> Repo.aggregate(:count)
But recently I took a look at the source code of Ecto to try to understand which option would be better, if it’s to continue to count it that way, or instead:
from(o in Organization,
join: w in Workspace,
where: o.id == w.org_id),
select: count(o.id)
Could someone enlighten me in the differences for the two options and which one should I choose?
The only difference seems to be that you are able to choose whether to apply it to an existing query, as your 1st example, or build it in to your query, as your 2nd example.
Ecto.Repo.Queryable.aggregate/4 builds a %SelectQuery{} and updates the query map %{query | select: select}. See in line 486
SELECT count(*) FROM "organization" AS o
INNER JOIN "workspace" AS w
ON TRUE WHERE (o."id" = w."org_id")
With select: count() query, scope for composing it with queries will be limited. You can compose second query from first.
query = from(o in Organization, join: w in Workspace, where: o.id == w.org_id)
Repo.all(query)
Repo.aggregate(query, :count)
# second query can be composed from first without duplicating
count_query = from(query, select: count())
Repo.one(count_query)
When writing queries in Ecto, i will give preference to composability (else there will be duplication of queries).