Upsert with insert_all and on_conflict causing null violation on unspecified field

tfwright · December 11, 2020, 12:17am

I need to update a bunch of records at once (as part of a single request) and I decided to try out using Repo.insert_all with the on_conflict option set to {:replace, fields} because I know the specific list of fields I want to add to the existing records:

Repo.insert_all(MySchema, [%{id: 1, my_field: "test"}], on_conflict: {:replace, [:my_field]}, conflict_target: :id)

I wrote a simple test that asserts my_field on row with id 1 is now “test”, but it failed complaining about null value for inserted_at. My understanding is the replace option I’m using should not update the timestamps, which are already set to non-null values. If I specify the timestamps, the test passes. Additionally, no other fields are being updated. I can see in the test they have the same values I used during the test setup that inserted the record with id 1.

Any suggestions what I could be doing wrong?

akash-akya · December 11, 2020, 10:56am

Hard to guess without more details, such as schema, db etc.

Maybe you dont have any uniqueness constraint on id? or there is no row to update.

I’m guessing that your insert_all query is trying to insert a new row (not updating). Note that insert_all does not set timestamp fields such as inserted_at or updated_at) like the usual insert/update function.

You can enable logs and check the queries it’s making to db and confirm.

tfwright · December 11, 2020, 2:55pm

I’m on postgres 12. id is an auto-incrementing primary key added by Ecto in a create table migration.

I’m pretty sure the issue has to do with timestamps, which I don’t want to update. If I include them, the existing row updates as expected so I think the conflict setting is working. This is the SQL:

INSERT INTO "my_table" ("my_field","id") VALUES ($1,$2) ON CONFLICT ("id") DO UPDATE SET "tax" = EXCLUDED."tax","ship_from_zip" = EXCLUDED."ship_from_zip"... ["UPS", 37]

The timestamp columns seem omitted from the list of fields, and they are the only ones that are being erroneously updated.

tfwright · December 11, 2020, 5:19pm

From what I can tell this is a limitation of postgres. It applies constraints as if a new row were being inserted, so even if the existing row + update clause are valid, the latter query will still error if it doesn’t include all not null columns. Perhaps someone with more knowledge of postgres can confirm this and hopefully provide some insight as to why that is the case.

akash-akya · December 12, 2020, 7:18pm

I tried this locally and got the same null value error from postgres

Looking back at the spec

ON CONFLICT can be used to specify an alternative action to raising a unique constraint or exclusion constraint violation error.

As per the spec ON CONFLICT will be used when insert raise unique constraint violation error, but in this case insert raises null value error. Likely it checks this error before it checks for uniqueness (anyway I don’t think ordering matters).

In other words, “there is no conflict here because insert is invalid”

tfwright · December 12, 2020, 7:42pm

Yep this is how I understand as well, although the postgres docs could certainly be clearer on this point!

benzene73 · February 26, 2025, 8:51am

Summary for those searching for this error like I did:

PostgreSQL requires all NOT NULL columns to be provided in an INSERT, even if the row already exists.
If any NOT NULL column is missing, the INSERT is rejected before checking for conflicts.

ON CONFLICT DO NOTHING does NOT bypass this

Normally, ON CONFLICT DO NOTHING should ignore conflicts, but it still fails because PostgreSQL never reaches the conflict resolution step.
Instead, the INSERT itself is invalid, so the query never proceeds to the ON CONFLICT clause.

spicychickensauce · April 29, 2025, 1:51pm

I’m kinda baffled that there is no easier solution in Ecto for doing bulk updates like this, e.g. when we can’t derive the values from other values in the DB and update_all won’t do.

I ended up here after hours of searching and thanks to this thread found a good enough workaround. I hope this helps anyone else ending up here:

Repo.insert_all(
  MySchema,
  [%{id: 42, a_field: "value", inserted_at: {:placeholder, :now}}],
  on_conflict: {:replace, [:a_field]},
  conflict_target: [:id],
  placeholders: %{
    now: NaiveDateTime.utc_now() |> NaiveDateTime.truncate(:second)
  }
)

Note that we need to include a value for each non null field, even if they are not used!
The inserted_at field is not actually updated, it just needs to be there in case it actually does an insert.

Also note that I used the placeholders feature for this, such that we do not waste any bandwidth for fields that are ignored anyway.

If anyone knows of an even better solution, please let us know!