Help with CTE query in Ecto

I’m trying to convert this SQL to Ecto:

WITH CTE AS (
    SELECT 
        *,
        SUM(amount) OVER (ORDER BY inserted_at DESC) AS running_total
    FROM transactions
)
SELECT *
FROM CTE
WHERE running_total < 1000; 

It’s supposed to find the top N rows where their total amount is less than 1000.

I can’t figure out how to reference the columns in the CTE query…

transactions_with_sum = Ecto.Query.from(t in Transaction,
  select: %{
    t: t,
    running_total: over(sum(t.amount), order_by: [desc: t.inserted_at])
  }
)

Transaction
|> Ecto.Query.with_cte("cte", as: ^transactions_with_sum)
|> Ecto.Query.where([t], t.running_total < 1000)
|> Repo.all()

I get this error:

** (Ecto.QueryError) lib/foo/transaction.ex:27: field `running_total` in `where` does not exist in schema Transaction in query:

from t0 in Transaction,
  where: t0.running_total < ^1000,
  select: t0

Any ideas? Thanks for the help!

Why use CTE here at all? Wouldn’t just subquery do the job?

I’m pretty unfamiliar with CTE… I assumed it was somehow more performant than using subquery.

How would you do it with subquery?

CTE and subquery in this case are the same. It’s more about what you think looks more readable. To use it in ecto just wrap your query in subquery(your_query_here) and use it in joins or froms.

In general, CTEs can have better, worse, or the same performance as subqueries. CTEs have the option to be materialized, which means the result is evaluated once, independently of the rest of the query, and then referenced wherever needed.

The downside of this is that you can’t push optimizations from outer queries into the CTE. The upside is that expensive operations/side effects will only be done once. See here: for some examples: PostgreSQL: Documentation: 15: 7.8. WITH Queries (Common Table Expressions).

2 Likes

The CTE example in the docs uses the name passed to with_cte as a table name in the subsequent join:

Product
|> recursive_ctes(true)
|> with_cte("category_tree", as: ^category_tree_query)
|> join(:left, [p], c in "category_tree", on: c.id == p.category_id)

Re: the subquery angle - here’s an example of filtering by a window function using a subquery, from the Postgres docs:

SELECT depname, empno, salary, enroll_date
FROM
  (SELECT depname, empno, salary, enroll_date,
          rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno) AS pos
     FROM empsalary
  ) AS ss
WHERE pos < 3;
1 Like

If you are after an example of something working:

This is the format I like to use

@comment_tree """
    SELECT id, parent_id, ARRAY[id] AS path
    FROM comments
    WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, c.parent_id, ct.path || c.id
    FROM comments AS c
    JOIN comment_tree AS ct ON ct.id = c.parent_id
  """

  def list_replies(comments) do
    ancestor_ids = Enum.map(comments, & &1.id)
  
    Comment
    |> recursive_ctes(true)
    |> with_cte("comment_tree", as: fragment(@comment_tree))
    |> join(:inner, [c], ct in "comment_tree", on: ct.id == c.id)
    |> select([c, ct], %{c | id: c.id})
    |> where([c, ct], c.depth >= 1 and c.ancestor_id in ^ancestor_ids)
    |> order_by([c, ct], ct.path)
    |> preload([:user, :parent, [post: :user]])
    |> Repo.all()
  end
1 Like