Ecto query on empty association

So I have this entity with association. I can sort those by association length:

import Ecto.Query

from(e in MyApp.Entity,
  left_join: a in assoc(e, :association),
  group_by: e.id,
  order_by: [desc: count(a.id)]

But I’d like to filter only those where association is empty:

from(e in MyApp.Entity,
  left_join: a in assoc(e, :association),
  group_by: e.id,
  where: count(a.id) == ^0
)

but I get: ** (Postgrex.Error) ERROR 42803 (grouping_error) aggregate functions are not allowed in WHERE :frowning:

Can anyone help on how to write this?

Thanks

1 Like

With group by, you need to use having instead of where. where works on table rows, while having will work on the entire group filtered by group by by means of aggregate functions (count, sum, avg, min, max .etc).

1 Like

@sfusato gave you an answer, however your query still can be slow. Mostly because COUNT in Postgres can sometimes be slow. Instead it would be better to use EXISTS query (which will remove need for GROUP BY as well). The problem is that Ecto right now do not support such queries in Ecto.Query so you need to use fragment as a workaround:

from e in MyApp.Entity,
  where: fragment("NOT EXISTS (SELECT * FROM associations a WHERE a.entity_id = ?", e.id)

Check plan and see which one is more performant (and do not forget about indices).

3 Likes

Trying multiple queries and comparing the execution plans is the best idea in this case (well, in most of the cases that require a non-trivial query).

The result can also be achieved with a left/right join + group by + a where clause on null id as illustrated by the below image (later edit: Venn diagrams are used in conjunction with sets and here it’s a cross product between tables, so keep this distinction in mind…thanks @hauleth for clarifying this).

I would be curious which is the most efficient query as well.

2 Likes
8 Likes

Say NO to Venn Diagrams When Explaining JOINs is amazing - instantly understandable - thx @hauleth

1 Like

Thanks for your feedback.

@hauleth since performance is not a criteria for me here (I’m doing a maintenance mix task), I’ll go for the having syntax, as I did not succeed into writing the fragmented version:

    from(c in Entity,
      where: fragment("NOT EXISTS (SELECT * FROM assoc cc WHERE cc.entity_id = ?", c.id)
    )

# gives me:

** (Postgrex.Error) ERROR 42601 (syntax_error) syntax error at end of input

    query: SELECT c0."id", c0."slug", c0."name", c0."inserted_at", c0."updated_at" FROM "entities" AS c0 WHERE (NOT EXISTS (SELECT * FROM assoc cc WHERE cc.certifier_id = c0."id")
    (ecto_sql) lib/ecto/adapters/sql.ex:612: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto_sql) lib/ecto/adapters/sql.ex:545: Ecto.Adapters.SQL.execute/5
    (ecto) lib/ecto/repo/queryable.ex:192: Ecto.Repo.Queryable.execute/4
    (ecto) lib/ecto/repo/queryable.ex:17: Ecto.Repo.Queryable.all/3
    (vae) lib/mix/tasks/certification/rncp_update.ex:49: Mix.Tasks.RncpUpdate.clean_avril_data/0
    (mix) lib/mix/task.ex:331: Mix.Task.run_task/3
    (mix) lib/mix/cli.ex:79: Mix.CLI.run_task/2
    (elixir) lib/code.ex:767: Code.require_file/2

I forgot about ), it should be:

from(c in Entity,
      where: fragment("NOT EXISTS (SELECT * FROM assoc cc WHERE cc.entity_id = ?)", c.id)
    )
1 Like