Upsert with where clause

I have a soft delete column in my table (deleted_at). I am trying to work out how to write an “upsert” using Repo.insert/2 with the on_conflict and conflict_target options that respects the deleted_at column (i.e., treats a row with a non-null deleted_at value as if the row didn’t exist for purposes of upserting).

In other words, I want the upsert to insert if there is no record at all or if there is no record where deleted_at is NULL, and update if there is a record and deleted_at is NULL.

For a concrete example, let’s say I have an association table that records relationships between Companies and Programs:

  schema "company_programs" do
    field :leader, :string
    field :deleted_at, :utc_datetime
    timestamps(type: :utc_datetime)
    belongs_to :company, Company
    belongs_to :program, Program
  end

Further, my migration establishes a uniqueness constraint on the combination of company_id and program_id.

And let’s say I have the following record in that table:

id | company_id | program_id | leader |     inserted_at     |      updated_at     |     deleted_at
---|------------|------------|--------|---------------------|---------------------|---------------------
 1 |     1      |     5      |   Mary | 2021-07-13 12:00:00 | 2021-07-15 12:00:00 | 2021-08-01 12:00:00

I thought the upsert code should look something like:

on_conflict = from(r in CompanyPrograms, where: is_nil(r.deleted_at), update: [set: [leader: ^leader]])
Repo.insert(
  conflict_target: [:company_id, :program_id],
  on_conflict: on_conflict,
  returning: true
)

but using this code, if I run an upsert using the same company_id and program_id as show above, the result is the error “(Ecto.StaleEntryError) attempted to insert a stale struct” since the set of records matching the conflict criteria is the empty set (company_id matches, program_id matches but deleted_at is not null).

Is it possible to structure this upsert so that, for the record above, doing a subsequent upsert with the same company_id and program_id would insert a NEW record in the table with a different PK but the same company_id and program_id as the above record?

Thanks,

Dave

What about a conditional unique index?

Thanks! I will try that out and follow up here with results for posterity.

2 Likes

@qhwa is right. The unique index on company_id and program_id needed to have a where clause excluding soft deleted rows - a partial index.

This makes total sense (anyway) since I do want to allow the possibility of another record with the same combination of company_id and program_id to be created after one is soft deleted.

So now, in my contrived example, the index looks like this:

CREATE UNIQUE INDEX company_programs_company_id_program_id_index
    ON public.company_programs USING btree
    (company_id ASC NULLS LAST, program_id ASC NULLS LAST)
    TABLESPACE pg_default
    WHERE deleted_at IS NULL;

and the upsert logic looks like this:

Repo.insert(
  entity,
  conflict_target: {:unsafe_fragment, "(company_id, program_id) WHERE deleted_at is null"},
  on_conflict: {:replace, [:leader]},
  returning: true
)

I did verify that:

  1. After soft deleting a record (i.e., entering a date into the deleted_at column of the row), this upsert logic will insert a new record with the same company_id and program_id as the soft deleted row.
  2. That upserting for a row that is not soft deleted does update the row.
  3. Removing the where deleted_at is null in the conflict target expression breaks the upsert and results in an error (Postgrex.Error) ERROR 42P10 (invalid_column_reference) there is no unique or exclusion constraint matching the ON CONFLICT specification

Thanks again @qhwa

6 Likes

Awesome! I haven’t tried it before so I learned it today from you. Thank you for sharing the full solution!

1 Like