How to find duplicate records in Ecto.Query

Hi OGs,
I have a postgresQL query as follow

select * from users ou
where (select count(*) from users inr
where inr.email = ou.email and inr.status_id = 3 and ou.status_id = 3) > 1 order by email

How can I transform this into Ecto.Query style ?
Cheers

Hey @docjazither, what have you tried so far?

Hi @benwilson512
I have tried this

User
    |> where(
      [a, b],
      a.email == b.email and a.status_id == 3 and b.status_id == 3
    )
    |> Repo.all()

But I dont know how to make the element count(*) > 1 involved

From Ecto v3.8.3 (literally the last release at the time of writing), you can have subqueries anywhere in where. The only trick is to use parent_as to access the parent user:

matching_parent_user =
  from u in User,
    where: u.email == parent_as(:user).email and u.status_id == 3 and parent_as(:user).status_id == 3,
    select: count()

query =
  from User,
    as: :user,
    where: subquery(matching_parent_user) > 1,
    order_by: :email

If this doesn’t work, then a fragment should:

query =
  from u in User,
    where: fragment("(select count(*) from users inr where inr.email = ? and inr.status_id = 3 and ? = 3)", u.email, u.status_id) > 1,
    order_by: :email
4 Likes

It is also possible to write the query with exists/1 function like:

from users_out in User,
  as: :users_outter,
  where:
    exists(
      from users_inner in Users,
        where: parent_as(:users_outter).email == users_inner.email,
        where: parent_as(:users_outter).status_id == 3,
        where: users_inner.status_id == 3,
        group_by: users_inner.email,
        having: count(users_inner.email) > 1,
        select: 1
    ),
  order_by: :email

2 Likes

This is the solution as well, thank you :pray:

This is the solution as well, thank you :pray: