We are using
Ecto with a Postgres DB. Occasionally we get errors like below in the Postgres log:
2022-09-17 12:36:59.478 EDT  ERROR: could not serialize access due to concurrent update 2022-09-17 12:36:59.478 EDT  CONTEXT: SQL statement "update collections set scoped_seq = scoped_seq + 1 where id = collection_id returning scoped_seq" PL/pgSQL function advance_seq(uuid) line 5 at SQL statement SQL statement "select advance_seq(new.collection_id)" PL/pgSQL function set_rev_id() line 3 at SQL statement
90% of the time this results in a
Postgrex.Error exception with a code of
:serialization_failure which permits us to retry the transaction.
However, in a small percentage of cases we get the same error in the Postgres log, but instead a
DBConnection.ConnectionError exception with a message of:
12:17:13.217 [error] Postgrex.Protocol (#PID<0.2447.0>) disconnected: ** (DBConnection.ConnectionError) transaction rolling back
My questions are:
- Why do we sometimes get a Postgrex error and sometimes a DBConnection disconnect for what appears to be the same error?
- Why should it disconnect?
- Can we stop it from disconnecting (or force a reconnect)?
Now I can retry the transaction further up the stack, but that involves re-doing lots of data processing and doesn’t always work.
We are running latest of db_connection, ecto and postgrex.