Recursive Query Depth (or Level) with Ecto

TL;DR:

Is there a way, without resorting to large fragments for a CTE to add a depth or level field to a recursive CTE as described here?

Longer story:

If you want to know why I want to do this without a fragment, the answer is I want to take two Ecto queries that can be filtered independently and then use union_all/2 to put them together.

I restructured the fragment to work entirely with Ecto queries using this with_cte/3 and it works great!

Well, it works great with one exception. A lot of recursive queries use a depth or level field as a strategy for for limited recursion and preventing infinite recursion. See this post for an example.

The trouble comes from needing to SELECT a thing I don’t know how to select in SQL. In the example you have these:

WITH RECURSIVE t(item_id, json, level)

I don’t see any way to get named columns for a CTE. It appears to only work through inference in Ecto.

And:

SELECT parent.item_id, to_jsonb(parent) || jsonb_build_object( 'children', t.json ), level + 1

I cannot see a way to add an arbitrary select to an existing query, like level + 1.

I see that with_cte/3 has an example of where I can pass in a tuple like so:

{"category_tree", Category}
|> recursive_ctes(true)
|> with_cte("category_tree", as: ^category_tree_query)
|> join(:left, [c], p in assoc(c, :products))
|> group_by([c], c.id)
|> select([c, p], %{c | products_count: count(p.id)})

I feel like the answer might be in here but I’m missing something.

Admittedly, CTEs and recursion are two things I use so rarely in SQL it’s a situation of “find an example and mess with it until it works as expected.” So it’s quite possible there’s something fundamental I’m missing in all of this. (SELECT level + 1 in the example kinda blows my mind.)

Will a select_merge help you in this scenario? I think you can project more columns using a base select merge that has the arithmetic expression.

The name is a bit misleading on second glance, it isn’t specific to MERGE statements :slight_smile:

This is one of the things I was messing with and it sorta works and the answer is… it might.

I can do something like this:

select_merge: %{col: fragment("depth + 1")}

The problem is that depth here can’t point to anything. I need to figure out some way to accomplish this:

select_merge: %{depth: fragment("prefix.depth + 1")}

This works in the sense that I end up with the SELECT I want. However, getting prefix is… its own problem.

The biggest issue here, is I see no way to produce this SQL from Ecto:

WITH RECURSIVE cte_name(col_1, col_2, col_3)

That’s really the missing piece. You kinda need that for the self-referencing incrementing column in the CTE.

After a bunch of messing around, I got it. As I said before, this is not an area of SQL I use a bunch so, if someone sees a better or more efficient way to handle this, let me know. With that said, it can be done!

The two main things I wanted to achieve were to:

  1. Select arbitrary SQL.
  2. Reference a CTE within itself.

Since I can’t really share work code, let’s consider the following contrived schemas:

defmodule Node do
  schema "nodes" do
    field :name, :string

    has_one :parent_node, __MODULE__,
      join_keys: [child_node_id: :id, parent_node_id: :id],
      join_through: "node_links"

    many_to_many :child_nodes, __MODULE__,
      join_keys: [parent_node_id: :id, child_node_id: :id],
      join_through: "node_links"
  end
end

defmodule NodeLink do
  schema "node_links" do
    field :active, :boolean, default: true

    belongs_to :parent_node, Node
    belongs_to :child_node, Node
  end
end

defmodule Node.Queries do
  def descendants(node, opts \\ []) do
    root_query =
      from link in NodeLink,
        select: [:active],
        select_merge: %{depth: fragment("0")},
        where: link.parent_node_id == ^node.id

    branch_query =
      from link in NodeLink,
        join: branch in "node_tree",
        on: branch.child_node_id == link.parent_node_id,
        select: [:active],
        select_merge: %{depth: fragment("? + 1", branch.depth)},
        where: fragment("? + 1", branch.depth) < ^Keyword.get(opts, :max_depth, 10)

    cte_union = union_all(root_query, ^branch_query)

    node_tree_query =
      "node_tree"
      |> recursive_ctes(true)
      |> with_cte("node_tree", as: ^cte_union)
      |> select([:active, :depth])

    from node in __MODULE__,
      inner_join: link in subquery(node_tree_query),
      on: link.child_node_id == node.id
  end
end

Just as a warning, I adapted this from other code and had to do a lot of renaming. I think it’s correct but I didn’t bother testing it. (The real stuff works.)

Aside from the use of fragment in select_merge the key is this line:

        join: branch in "node_tree",

That lets me use the CTE “before” it even exists. I feel like there’s a better way to do this, but it’s possible some of the examples I have looked at do some kind of implicit joining.

One slight annoyance is this:

    node_tree_query =
      "node_tree"
      |> recursive_ctes(true)
      |> with_cte("node_tree", as: ^cte_union)
      |> select([:active, :depth])

Note select/2 at the end. In raw SQL, I don’t need that. But in order to use a query, even as a subquery, without a schema, select/2 is required. However, when using it as a join, I didn’t actually need that. (In SQL, it seems to infer name/order from the root_query.)