I recently upgraded Oban and Oban Pro to the latest versions, 2.19.2 and 1.5.2 respectively.
I ran into a problem with the unique constraint that crashed a queue and caused hundreds of producers to be started for that same queue.
There is a worker with the following unique configuration:
[
  fields: [:queue, :worker, :args],
  keys: [:conversation_id],
  states: [:scheduled, :executing, :retryable]
]
As you can see, the available state was not included in the states list.
There was one job in the available state, and then another job with the same args was scheduled. The insert worked, but I think that when Oban tried to move the scheduled job to executing, it crashed everything. The fix was to manually delete the two jobs. The error was:
GenServer {Oban.Registry, {Oban, {:producer, "my_queue"}}} terminating
** (Postgrex.Error) ERROR 23505 (unique_violation) duplicate key value violates unique constraint "oban_jobs_unique_index"
table: oban_jobs
constraint: oban_jobs_unique_index
Key (uniq_key)=(KcFMKL8Lc5Yhu9w58TM27eZNdPcftdhRYWHIXYNaygM) already exists.
(ecto_sql 3.12.1) lib/ecto/adapters/sql.ex:1096: Ecto.Adapters.SQL.raise_sql_call_error/1
(ecto_sql 3.12.1) lib/ecto/adapters/sql.ex:994: Ecto.Adapters.SQL.execute/6
(ecto 3.12.5) lib/ecto/repo/queryable.ex:232: Ecto.Repo.Queryable.execute/4
(oban_pro 1.5.2) lib/oban/pro/engines/smart.ex:1113: Oban.Pro.Engines.Smart.fetch_jobs/2
(ecto 3.12.5) lib/ecto/multi.ex:897: Ecto.Multi.apply_operation/5
(elixir 1.18.2) lib/enum.ex:2546: Enum."-reduce/3-lists^foldl/2-0-"/3
(ecto 3.12.5) lib/ecto/multi.ex:870: anonymous fn/5 in Ecto.Multi.apply_operations/5
(ecto_sql 3.12.1) lib/ecto/adapters/sql.ex:1400: anonymous fn/3 in Ecto.Adapters.SQL.checkout_or_transaction/4
I believe the same could happen if there was an executing job and I tried to re-run a canceled/discarded job with the same args.
Is that expected?
For now, I changed my workers to include the available state in the unique configuration.
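For reference, this is roughly what the changed worker looks like. It is only a sketch: the module name and the body of perform/1 are placeholders, the queue name is the one from the error above, and I'm showing a plain Oban.Worker here for simplicity. The only real change is the extra :available entry in states.

```elixir
defmodule MyApp.ConversationWorker do
  # Hypothetical module name, used only for illustration.
  use Oban.Worker,
    queue: :my_queue,
    unique: [
      fields: [:queue, :worker, :args],
      keys: [:conversation_id],
      # :available is now listed alongside the original three states
      states: [:available, :scheduled, :executing, :retryable]
    ]

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"conversation_id" => _conversation_id}}) do
    # Placeholder body; the real work is omitted.
    :ok
  end
end
```

With this in place, a second insert with the same conversation_id should be treated as a duplicate while the first job is still available, rather than only once it is scheduled, executing, or retryable. I have not confirmed that this rules out the crash in every case.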
Checking the unique index, it is on uniq_key with the condition (uniq_key IS NOT NULL). Interestingly, of the two jobs in question, only one had uniq_key set; the other's was empty, although both had it in meta.
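In case it helps anyone compare their own data, here is a rough sketch of a query for listing the unfinished jobs in a queue together with the key stored in meta. The module and function names are made up, the repo is whatever Ecto repo Oban is configured with, and I'm assuming the key is stored under "uniq_key" in meta, which is what I saw on my jobs.

```elixir
defmodule MyApp.UniqueDebug do
  # Hypothetical helper module, not part of Oban or Oban Pro.
  import Ecto.Query

  # Lists jobs in the given queue that have not finished yet,
  # along with the uniq_key value found in each job's meta.
  def jobs_with_meta_key(repo, queue) when is_binary(queue) do
    states = ["available", "scheduled", "executing", "retryable"]

    query =
      from j in Oban.Job,
        where: j.queue == ^queue,
        where: j.state in ^states,
        order_by: [asc: j.id],
        select: %{
          id: j.id,
          state: j.state,
          args: j.args,
          # Assumes the key lives under "uniq_key" in the meta map
          meta_uniq_key: fragment("?->>'uniq_key'", j.meta)
        }

    repo.all(query)
  end
end
```

Calling something like MyApp.UniqueDebug.jobs_with_meta_key(MyApp.Repo, "my_queue") and looking for rows that share the same meta_uniq_key is how two conflicting jobs like mine would show up.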