Support :insert_all in data-modifying CTE / WITH statements

Ecto currently supports some data-modifying WITH statements / CTEs for Postgres:

Options: […]
:operation - one of :all , :update_all , or :delete_all indicating the operation type of the CTE query. If blank, it defaults to :all , making the CTE query a SELECT query.
[…]
For Postgres built-in adapter, it is possible to define data-modifying CTE queries:

update_categories_query =
  Category
  |> where([c], is_nil(c.parent_id))
  |> update([c], set: [name: "Root category"])
  |> select([c], c)

{"update_categories", Category}
|> with_cte("update_categories", as: ^update_categories_query, operation: :update_all)
|> select([c], c)

Postgres, however, also supports INSERT statements inside CTEs:

-- EXAMPLE
with expensive_calc as (
   select customer_id, sum(amount) as total 
   from orders 
   group by customer_id
),
new_segments as (
   insert into segments (customer_id, total)
   select customer_id, total 
   from expensive_calc
   returning id, customer_id
)
insert into segment_history (segment_id, customer_id)
select id, customer_id 
from new_segments;

This allows multiple INSERTs to access the same source data, as well as the newly inserted data without crossing the database boundary.

The closest solutions in Ecto are:

  1. Create temp table for expensive_calc (needs fragment)
  2. Repeat expensive_calc as subquery (likely does the calculation twice)

… then use two sequential Repo.insert_all statements. However, this needs serialization of the RETURNING data and unnecessarily crosses the database ↔︎ app boundary.

Other examples:

Ecto syntax

I assume this would require a new macro insert/3 similar to update/3 which allows adding returning: [p.id, p.title, ...]? Haven’t given this too much thought.

Judging by the ecto_sql source and associated PR, this was initially planned but not implemented yet:

Maybe now is the right time?

1 Like

Bump to gather comments from top contributors — wdyt @josevalim @ericmj @michalmuskala?