Ecto transaction rolling back with no exceptions? Weird behavior or not missing docs!?

thiagomajesk · July 14, 2022, 7:36pm

Hi everyone, it seems I can’t get out of this forum this week (you guys are awesome btw).

So, I got surprised by this behavior while using Ecto and I’m still banging my head against the wall with some conflicting information.

A coworker opened this issue: Updating a record in a Repo transaction failure · Issue #3944 · elixir-ecto/ecto · GitHub yesterday where he describes something unexpected we found with how transactions work.

The context of the whole thing is in the issue, but to summarize… The docs about transaction/2 say:

If an unhandled error occurs the transaction will be rolled back and the error will bubble up from the transaction function. If no error occurred the transaction will be committed when the function returns. A transaction can be explicitly rolled back by calling rollback/1, this will immediately leave the function and return the value given to rollback as {:error, value}

This is what we are trying to do… We always have a “tag” persisted after trying to save a “post”. If we are unable to persist the “tag”, we also don’t care about saving the “post”:

Repo.transaction(fn repo ->
  case repo.insert(changeset) do
    {:ok, post} ->
     # Great! Please also save the tag confirming it
     # If we can't just explode and rollback...
      tag
      |> Ecto.Changeset.change(%{name: "success"})
      |> repo.update!()
    {:error, changeset} ->
     # Oh noes! Please save some information about it
     # If we can't save just explode and rollback...      
      tag
      |> Ecto.Changeset.change(%{name: "failure"})
      |> repo.update!()
    end
end)

What surprised us that is that this operation fails, even though we are handling the errors in the first insert. This is the exception raised when the code gets to the repo.update! (this is from the adapter, so it also happens with the safe alternative repo.update):

** (Postgrex.Error) ERROR 25P02 (in_failed_sql_transaction) current transaction is aborted, commands ignored until end of transaction block

Ok, apparently there’s something preventing the transaction from succeeding. After conducting some tests it seems that repo.insert is putting the transaction in a bad state and repo.update! can’t succeed because of it. We had that confirmed by jose and as it seems, this error is happening because an operation already failed inside the transaction. He also seems to agree that the docs are talking about exceptions and not just “errors”. So, what exactly are we missing here?

If the docs are right about the transaction rolling back on exceptions, why can’t this code succeed? If not, and we agree that there’s something to be improved in the docs, what should the correct behavior be?

I’m certain that there’s some important information I’m missing, but I’m feeling a bit misguided by the docs and we want to improve it for posterity.

joey_the_snake · July 14, 2022, 8:04pm

I think this is normal database behaviour. Whatever database you are using shouldn’t let you continue with the same transaction if a previous command caused an error. This is an error from the database not from Ecto:

** (Postgrex.Error) ERROR 25P02 (in_failed_sql_transaction)

thiagomajesk · July 14, 2022, 8:20pm

Yep, we found that out ourselves afterward, but still unexpected considering the docs description. I guess it’s perhaps one of those “Ecto is not your ORM” moments .

Still, there are some improvements to be made to the docs and we were already feeling kinda stressed out by apparently contradictory information. I’m honestly kinda drained at the point of having things spelled out to me.

BTW, just to vent a little bit: It’s not immediately obvious to more experienced people that sometimes questions like this one are a symptom of some sort of cognitive dissonance or lack of broader understanding about a subject. So answering those questions too objectively, without taking into account the reason for such misalignment of information might worsen the experience even more and in the end, make people more stressed overall. I know this is a technical community, but even so, we are still only people .

trisolaran · July 14, 2022, 8:23pm

What I guess it’s going on is that the first insert failed because it was violating some DB constraints, but this DB error was caught by your changeset (perhaps using unique_constraint/3 or similar) and converted into a changeset error. So the exception has been handled in the code, but the DB transaction is still tainted by the error and unable to continue (this is what the DB error is telling you).

I guess the documentation may be technically correct in that it says nothing about what happens when an error is handled That said, it would make sense to describe this possible situation.

All of this of course assuming that my above assumption is correct.

thiagomajesk · July 14, 2022, 8:28pm

Hey, @trisolaran this was intuitively what we thought at first, but look at this repro over here: ecto-transaction-flow-sample/script.exs at master · rafaeliga/ecto-transaction-flow-sample · GitHub. There is a constraint check in the changeset but it doesn’t seem to be enough to prevent this from happening.

Technically correct can be the best kind of correct , but unfortunately not enough to help our thick skulls I’m afraid!? Jokes aside, We will try to help with that if we can so no one else goes through the same experience ever again.

trisolaran · July 14, 2022, 8:38pm

Yeah that’s what I meant. unique_constraint doesn’t prevent the DB error, it works by waiting for the DB error to happen on insert, catching it and putting it into the changeset. What you’re seeing is expected. When you’re calling insert here you’re triggering a DB error. unique_constraint is simply catching the error for you and turning it into a nice human friendly error for your changeset, but the DB transaction is still in an error state.

thiagomajesk · July 14, 2022, 8:59pm

Although a little unintuitive at first; considering that Ecto doesn’t do much for those cases it makes sense that it behaves like that. Thanks for the clarifications @trisolaran .

So as it seems, it simply isn’t possible to wrap two operations in the same transaction if one of them relies on checking any kind of constraints.

That question that remains is if there’s another way to solve this without having to manually check for the existence of the record in the then to prevent tainting the transaction. Any particular suggestions!?

joey_the_snake · July 14, 2022, 9:53pm

You could do an insert with on_conflict: :nothing. This will not cause an error.

thiagomajesk · July 14, 2022, 10:15pm

Wow!!!

I have never thought about using the :on_conflict option with intent other than for upserts but this makes complete sense . Excellent, excellent observation @joey_the_snake!!!

al2o3cr · July 15, 2022, 12:12am

For whatever it’s worth, this issue has been confounding people for a long time (note the date!):

trisolaran · July 15, 2022, 5:17am

I’m afraid that won’t help. on_conflict: :nothing will prevent Ecto from raising, but the DB doesn’t care about it: the conflict violation will still have occurred, your post will not be inserted and the transaction will be in an error state. Plus you will not get any feedback on whether you were able to successfully insert your post or not, and you want to know that, right?

My suggestion: rollback your transaction at the first error and handle error reporting and logging outside of it.

joey_the_snake · July 15, 2022, 11:07am

Why do you think this? ON CONFLICT DO NOTHING is a database command that avoids raising the error. I have used it in production. PostgreSQL: Documentation: 16: INSERT

The optional ON CONFLICT clause specifies an alternative action to raising a unique violation or exclusion constraint violation error

trisolaran · July 15, 2022, 11:33am

You’re right, thanks for the clarification. Sure, no error will be raised, but the DB won’t insert anything in the DB yet insert will return {:ok, post} as if something was inserted.

joey_the_snake · July 15, 2022, 11:34am

Whether or not that’s fine is up to the developer though, no?

trisolaran · July 15, 2022, 11:40am

I personally wouldn’t want that in my code and I would find it confusing if I found it in other codebases. In any case, in this specific example, it’s an issue because @thiagomajesk wans to distinguish between {:ok, post} and {:error, changeset} and be able to take different actions. My point was that he can simplify things and avoid this entire issue by moving his error management code outside of the transaction

thiagomajesk · July 15, 2022, 1:20pm

That’s right @trisolaran! Just noticed this behavior testing today… Here’s the log if someone else is curious about it (from the repro I linked in the issue):

10:05:40.814 [debug] QUERY OK db=2.3ms queue=0.3ms
INSERT INTO "tags" VALUES (DEFAULT) RETURNING "id" []

10:05:40.819 [debug] QUERY OK db=2.4ms queue=0.2ms
INSERT INTO "posts" ("title") VALUES ($1) RETURNING "id" ["123"]

10:05:40.819 [debug] QUERY OK db=0.1ms
begin []

10:05:40.819 [debug] QUERY OK db=0.1ms
INSERT INTO "posts" ("title") VALUES ($1) ON CONFLICT DO NOTHING RETURNING "id" ["123"]
POST: %Post{__meta__: #Ecto.Schema.Metadata<:loaded, "posts">, id: nil, title: "123"}

10:05:40.822 [debug] QUERY OK db=0.1ms
commit []

I can see that the error was suppressed correctly but the “post” not persisted (id == nil).
I see how this is unintuitive that we are returning :ok for a failed operation and it wouldn’t be my first choice either. Since this is a corner case, perhaps we could solve it by pattern-matching and just forget about it:

Repo.transaction(fn repo ->
  case repo.insert(changeset) do
    {:ok, %{id: nil}} ->
       # failed first, we don't care anymore
       Repo.rollback(:post)
    {:ok, post} ->    
      tag
      |> Ecto.Changeset.change(%{name: "success"})
      |> repo.update!()
    {:error, changeset} -> 
      tag
      |> Ecto.Changeset.change(%{name: "failure"})
      |> repo.update!()
    end
end)

The real problem with this approach is that we don’t have any information whatsoever about why this has failed and in this specific scenario with no information about the constraint to report to users.

How would you do that if both operations are dependent on each other? I mean, they somehow have to be inside the same transaction to guarantee that we save at least one of them, right!?

BTW I see someone else suggesting the usage of savepoints, but I haven’t used those in Ecto before. I’m not sure how would the fit in this scenario.

trisolaran · July 15, 2022, 2:10pm

I wonder why you absolutely need to work with these “error” and “failure” tags in the DB and can’t instead log the error or show it to the user?

However, putting this aside, how about this:

write a “failure” tag before starting the transaction

Start the transaction:

insert the post, if insertion fails, roll back transaction
update the “failure” tag to be a “success” tag
commit

This way, if anything goes wrong with insertion, you’ll have a failure tag.
Wouldn’t this work?

thiagomajesk · July 15, 2022, 3:35pm

Yes, this is absolutely required!

This flow would work, but I’d have to clean all “tags” not referencing “posts” in the DB at the end, which is an operation that could fail on its own and leave us with loose “tags”. Remember that the “tag” in this case, can only exist on two occasions: if a “post” is persisted and if a “post” has errors. It’s almost like a logging mechanism, if I create the log before and there’s nothing worth logging I have to remove it somehow.

BTW, It seems someone else already had this exact same problem: How to not rollback transaction on failed insert because of unique constraint - #8 by benwilson512. I conducted some tests using one of the post’s suggestions and the savepoint feature seems to work without much effort. The only difference is that no additional match on the case was required:

Repo.transaction(fn repo ->   
  case repo.insert(changeset, mode: :savepoint) do
    {:ok, post} ->
      IO.inspect(post, label: "POST")

      tag
      |> Ecto.Changeset.change(%{name: "success"})
      |> repo.update!()

      {:error, changeset} ->
        IO.inspect(changeset, label: "CHANGESET")

        tag
        |> Ecto.Changeset.change(%{name: "failed"})
        |> repo.update!()
    end
end)

This returns the tuple with the changeset containing the constraint errors and commits the transaction as expected:

12:31:46.616 [debug] QUERY OK db=1.6ms queue=0.2ms
INSERT INTO "tags" VALUES (DEFAULT) RETURNING "id" []

12:31:46.620 [debug] QUERY OK db=1.5ms queue=0.1ms
INSERT INTO "posts" ("title") VALUES ($1) RETURNING "id" ["123"]

12:31:46.620 [debug] QUERY OK db=0.1ms
begin []

12:31:46.622 [debug] QUERY ERROR db=1.5ms
INSERT INTO "posts" ("title") VALUES ($1) RETURNING "id" ["123"]
CHANGESET: #Ecto.Changeset<
  action: :insert,
  changes: %{title: "123"},
  errors: [
    title: {"has already been taken",
     [constraint: :unique, constraint_name: "posts_title_index"]}
  ],
  data: #Post<>,
  valid?: false
>

12:31:46.624 [debug] QUERY OK db=0.2ms
UPDATE "tags" SET "name" = $1 WHERE "id" = $2 ["failed", 1]

12:31:46.626 [debug] QUERY OK db=1.8ms
commit []

joey_the_snake · July 16, 2022, 12:33am

~~What happens if an exception is thrown when using mode: :savepoint? For instance could this happen:~~
~~1. insert succeeds~~
~~2. update fails~~
~~3. transaction goes to savepoint directly before the update~~
~~4. transaction completes successfully~~

edit: i didn’t notice the savepoint is per statement and not on the transaction. pls ignore.

Alvinkariuki · September 18, 2024, 11:41am

Intuitive I like it