How to translate a union query with limit and order by to Ecto

I accidentally deleted the last post, I don’t mean to spam, sorry.

I came up with the following query that works best with my schema (simplified):

select *
from ((select *
       from token_transfers
       where from_address_hash = $1
       order by block_number desc, log_index desc
       limit 10)
      union all
      (select *
       from token_transfers
       where from_address_hash = $2
       order by block_number desc, log_index desc
       limit 10)
      union all
      (select *
       from token_transfers
       where from_address_hash = $3
       order by block_number desc, log_index desc
       limit 10)) tt
order by tt.block_number desc, tt.log_index desc
limit 10;

but with Ecto this is the most similar query that I could get:

  def test do
    query1 =
      from(t in TokenTransfer,
        where: t.from_address_hash == ^"0x0000000000000000000000000000000000000001",
        order_by: [desc: t.block_number, desc: t.log_index],
        limit: 10
      )

    query2 =
      from(t in TokenTransfer,
        where: t.from_address_hash == ^"0x0000000000000000000000000000000000000002",
        order_by: [desc: t.block_number, desc: t.log_index],
        limit: 10
      )

    query3 =
      from(t in TokenTransfer,
        where: t.from_address_hash == ^"0x0000000000000000000000000000000000000003",
        order_by: [desc: t.block_number, desc: t.log_index],
        limit: 10
      )

    queries = [query1, query2, query3]

    united_query = queries |> Enum.reduce(fn query, acc -> union_all(acc, ^query) end) |> exclude(:order_by)

    from(t in subquery(united_query),
      order_by: [desc: t.block_number, desc: t.log_index],
      limit: 10
    )
    |> Explorer.Repo.all()
  end

that translates to (simplified for readability):

SELECT *
FROM (SELECT *
      FROM "token_transfers" AS st0
      WHERE (st0."from_address_hash" = $1)
      UNION ALL
      (SELECT *
       FROM "token_transfers" AS st0
       WHERE (st0."from_address_hash" = $2)
       ORDER BY st0."block_number" DESC, st0."log_index" DESC
       LIMIT 10)
      UNION ALL
      (SELECT *
       FROM "token_transfers" AS st0
       WHERE (st0."from_address_hash" = $3)
       ORDER BY st0."block_number" DESC, st0."log_index" DESC
       LIMIT 10)
      LIMIT 10) AS s0
ORDER BY s0."block_number" DESC, s0."log_index" DESC
LIMIT 10;

and these queries are semantically different from what I can say, following this simple example:

the query similar to the first one:

select *
from ((select *
       from generate_series(1, 50) a
       order by a desc
       limit 10)
      union all
      (select *
       from generate_series(50, 100) b
       order by b desc
       limit 10)
      union all
      (select *
       from generate_series(100, 150) c
       order by c desc
       limit 10)) x
order by x desc
limit 10;

result:

+---+
|a  |
+---+
|150|
|149|
|148|
|147|
|146|
|145|
|144|
|143|
|142|
|141|
+---+

the query similar to the second one:

SELECT *
FROM (SELECT *
      FROM generate_series(1, 50) a
      UNION ALL
      (SELECT *
       FROM generate_series(50, 100) b
       ORDER BY b DESC
       LIMIT 10)
      UNION ALL
      (SELECT *
       FROM generate_series(100, 150) c
       ORDER BY c DESC
       LIMIT 10)
      LIMIT 10) AS x
ORDER BY x DESC
LIMIT 10;

result:

+--+
|a |
+--+
|10|
|9 |
|8 |
|7 |
|6 |
|5 |
|4 |
|3 |
|2 |
|1 |
+--+

How I can achieve first query behaviour with Ecto?

The problem here is that the Ecto union binds the limit to the outer union query, not the inner query that you want. This is documented behavior, and is consistent with how postgres parses the syntax.

E.g.

SELECT * FROM a LIMIT 10
UNION
SELECT * FROM b;

The above is not valid, because the first limit belongs at the end. Wrapping the first query in parentheses produces the expected result, I believe because it effectively creates a subselect/subquery:

(SELECT * FROM a LIMIT 10)
UNION
SELECT * FROM b;

I’m not sure how to get that SQL syntax in Ecto, though. Perhaps this might have the desired result (untested):

union_all(from(subquery(query1)), from(subquery(query2)))

Maybe someone else knows of a better solution :slight_smile:

2 Likes

Thank for the explanation and suggestion, your answer inspired me to do this:

  def test do
    query1 =
      from(
        subquery(
          from(t in TokenTransfer,
            where: t.from_address_hash == ^"0x0000000000000000000000000000000000000001",
            order_by: [desc: t.block_number, desc: t.log_index],
            limit: 10
          )
        )
      )

    query2 =
      from(t in TokenTransfer,
        where: t.from_address_hash == ^"0x0000000000000000000000000000000000000002",
        order_by: [desc: t.block_number, desc: t.log_index],
        limit: 10
      )

    query3 =
      from(t in TokenTransfer,
        where: t.from_address_hash == ^"0x0000000000000000000000000000000000000003",
        order_by: [desc: t.block_number, desc: t.log_index],
        limit: 10
      )

    queries = [query1, query2, query3]

    united_query = queries |> Enum.reduce(fn query, acc -> union_all(acc, ^query) end)

    from(t in subquery(united_query),
      order_by: [desc: t.block_number, desc: t.log_index],
      limit: 10
    )
    |> Repo.all()
  end

and this does somewhat what I want, e.g.:

explain analyse
SELECT *
FROM (SELECT *
      FROM (SELECT *
            FROM "token_transfers" AS sst0
            WHERE (sst0."from_address_hash" = $1)
            ORDER BY sst0."block_number" DESC, sst0."log_index" DESC
            LIMIT 10)
      UNION ALL
      (SELECT *
       FROM "token_transfers" AS st0
       WHERE (st0."from_address_hash" = $2)
       ORDER BY st0."block_number" DESC, st0."log_index" DESC
       LIMIT 10)
      UNION ALL
      (SELECT *
       FROM "token_transfers" AS st0
       WHERE (st0."from_address_hash" = $3)
       ORDER BY st0."block_number" DESC, st0."log_index" DESC
       LIMIT 10)) AS s0
ORDER BY s0."block_number" DESC, s0."log_index" DESC
LIMIT 10;
1 Like