Ecto transaction rolling back with no exceptions? Weird behavior or not missing docs!?

Hi everyone, it seems I can’t get out of this forum this week :sweat_smile: (you guys are awesome btw).

So, I got surprised by this behavior while using Ecto and I’m still banging my head against the wall with some conflicting information.

A coworker opened this issue: Updating a record in a Repo transaction failure · Issue #3944 · elixir-ecto/ecto · GitHub yesterday where he describes something unexpected we found with how transactions work.

The context of the whole thing is in the issue, but to summarize… The docs about transaction/2 say:

If an unhandled error occurs the transaction will be rolled back and the error will bubble up from the transaction function. If no error occurred the transaction will be committed when the function returns. A transaction can be explicitly rolled back by calling rollback/1, this will immediately leave the function and return the value given to rollback as {:error, value}

This is what we are trying to do… We always have a “tag” persisted after trying to save a “post”. If we are unable to persist the “tag”, we also don’t care about saving the “post”:

Repo.transaction(fn repo ->
  case repo.insert(changeset) do
    {:ok, post} ->
     # Great! Please also save the tag confirming it
     # If we can't just explode and rollback...
      tag
      |> Ecto.Changeset.change(%{name: "success"})
      |> repo.update!()
    {:error, changeset} ->
     # Oh noes! Please save some information about it
     # If we can't save just explode and rollback...      
      tag
      |> Ecto.Changeset.change(%{name: "failure"})
      |> repo.update!()
    end
end)

What surprised us that is that this operation fails, even though we are handling the errors in the first insert. This is the exception raised when the code gets to the repo.update! (this is from the adapter, so it also happens with the safe alternative repo.update):

** (Postgrex.Error) ERROR 25P02 (in_failed_sql_transaction) current transaction is aborted, commands ignored until end of transaction block

Ok, apparently there’s something preventing the transaction from succeeding. After conducting some tests it seems that repo.insert is putting the transaction in a bad state and repo.update! can’t succeed because of it. We had that confirmed by jose and as it seems, this error is happening because an operation already failed inside the transaction. He also seems to agree that the docs are talking about exceptions and not just “errors”. So, what exactly are we missing here?

If the docs are right about the transaction rolling back on exceptions, why can’t this code succeed? If not, and we agree that there’s something to be improved in the docs, what should the correct behavior be?

I’m certain that there’s some important information I’m missing, but I’m feeling a bit misguided by the docs and we want to improve it for posterity.

I think this is normal database behaviour. Whatever database you are using shouldn’t let you continue with the same transaction if a previous command caused an error. This is an error from the database not from Ecto:

** (Postgrex.Error) ERROR 25P02 (in_failed_sql_transaction)

Yep, we found that out ourselves afterward, but still unexpected considering the docs description. I guess it’s perhaps one of those “Ecto is not your ORM” moments :blush:.

Still, there are some improvements to be made to the docs and we were already feeling kinda stressed out by apparently contradictory information. I’m honestly kinda drained at the point of having things spelled out to me.


BTW, just to vent a little bit: It’s not immediately obvious to more experienced people that sometimes questions like this one are a symptom of some sort of cognitive dissonance or lack of broader understanding about a subject. So answering those questions too objectively, without taking into account the reason for such misalignment of information might worsen the experience even more and in the end, make people more stressed overall. I know this is a technical community, but even so, we are still only people :slightly_smiling_face:.

What I guess it’s going on is that the first insert failed because it was violating some DB constraints, but this DB error was caught by your changeset (perhaps using unique_constraint/3 or similar) and converted into a changeset error. So the exception has been handled in the code, but the DB transaction is still tainted by the error and unable to continue (this is what the DB error is telling you).

I guess the documentation may be technically correct in that it says nothing about what happens when an error is handled :slight_smile: That said, it would make sense to describe this possible situation.

All of this of course assuming that my above assumption is correct.

Hey, @trisolaran this was intuitively what we thought at first, but look at this repro over here: ecto-transaction-flow-sample/script.exs at master · rafaeliga/ecto-transaction-flow-sample · GitHub. There is a constraint check in the changeset but it doesn’t seem to be enough to prevent this from happening.

Technically correct can be the best kind of correct :sweat_smile:, but unfortunately not enough to help our thick skulls I’m afraid!? Jokes aside, We will try to help with that if we can so no one else goes through the same experience ever again.

Yeah that’s what I meant. unique_constraint doesn’t prevent the DB error, it works by waiting for the DB error to happen on insert, catching it and putting it into the changeset. What you’re seeing is expected. When you’re calling insert here you’re triggering a DB error. unique_constraint is simply catching the error for you and turning it into a nice human friendly error for your changeset, but the DB transaction is still in an error state.

2 Likes

Although a little unintuitive at first; considering that Ecto doesn’t do much for those cases it makes sense that it behaves like that. Thanks for the clarifications @trisolaran :fist_right::fist_left:.

So as it seems, it simply isn’t possible to wrap two operations in the same transaction if one of them relies on checking any kind of constraints.

That question that remains is if there’s another way to solve this without having to manually check for the existence of the record in the then to prevent tainting the transaction. Any particular suggestions!?

You could do an insert with on_conflict: :nothing. This will not cause an error.

1 Like

Wow!!! :clap::clap::clap:

I have never thought about using the :on_conflict option with intent other than for upserts but this makes complete sense :see_no_evil:. Excellent, excellent observation @joey_the_snake!!! :fist_right::boom::fist_left:

For whatever it’s worth, this issue has been confounding people for a long time (note the date!):

2 Likes

I’m afraid that won’t help. on_conflict: :nothing will prevent Ecto from raising, but the DB doesn’t care about it: the conflict violation will still have occurred, your post will not be inserted and the transaction will be in an error state. Plus you will not get any feedback on whether you were able to successfully insert your post or not, and you want to know that, right?

My suggestion: rollback your transaction at the first error and handle error reporting and logging outside of it.

Why do you think this? ON CONFLICT DO NOTHING is a database command that avoids raising the error. I have used it in production. PostgreSQL: Documentation: 16: INSERT

The optional ON CONFLICT clause specifies an alternative action to raising a unique violation or exclusion constraint violation error

You’re right, thanks for the clarification. Sure, no error will be raised, but the DB won’t insert anything in the DB yet insert will return {:ok, post} as if something was inserted.

Whether or not that’s fine is up to the developer though, no?

I personally wouldn’t want that in my code and I would find it confusing if I found it in other codebases. In any case, in this specific example, it’s an issue because @thiagomajesk wans to distinguish between {:ok, post} and {:error, changeset} and be able to take different actions. My point was that he can simplify things and avoid this entire issue by moving his error management code outside of the transaction

That’s right @trisolaran! Just noticed this behavior testing today… Here’s the log if someone else is curious about it (from the repro I linked in the issue):

10:05:40.814 [debug] QUERY OK db=2.3ms queue=0.3ms
INSERT INTO "tags" VALUES (DEFAULT) RETURNING "id" []

10:05:40.819 [debug] QUERY OK db=2.4ms queue=0.2ms
INSERT INTO "posts" ("title") VALUES ($1) RETURNING "id" ["123"]

10:05:40.819 [debug] QUERY OK db=0.1ms
begin []

10:05:40.819 [debug] QUERY OK db=0.1ms
INSERT INTO "posts" ("title") VALUES ($1) ON CONFLICT DO NOTHING RETURNING "id" ["123"]
POST: %Post{__meta__: #Ecto.Schema.Metadata<:loaded, "posts">, id: nil, title: "123"}

10:05:40.822 [debug] QUERY OK db=0.1ms
commit []

I can see that the error was suppressed correctly but the “post” not persisted (id == nil).
I see how this is unintuitive that we are returning :ok for a failed operation and it wouldn’t be my first choice either. Since this is a corner case, perhaps we could solve it by pattern-matching and just forget about it:

Repo.transaction(fn repo ->
  case repo.insert(changeset) do
    {:ok, %{id: nil}} ->
       # failed first, we don't care anymore
       Repo.rollback(:post)
    {:ok, post} ->    
      tag
      |> Ecto.Changeset.change(%{name: "success"})
      |> repo.update!()
    {:error, changeset} -> 
      tag
      |> Ecto.Changeset.change(%{name: "failure"})
      |> repo.update!()
    end
end)

The real problem with this approach is that we don’t have any information whatsoever about why this has failed and in this specific scenario with no information about the constraint to report to users.

How would you do that if both operations are dependent on each other? I mean, they somehow have to be inside the same transaction to guarantee that we save at least one of them, right!?

BTW I see someone else suggesting the usage of savepoints, but I haven’t used those in Ecto before. I’m not sure how would the fit in this scenario.

I wonder why you absolutely need to work with these “error” and “failure” tags in the DB and can’t instead log the error or show it to the user?

However, putting this aside, how about this:

  • write a “failure” tag before starting the transaction

Start the transaction:

  • insert the post, if insertion fails, roll back transaction
  • update the “failure” tag to be a “success” tag
  • commit

This way, if anything goes wrong with insertion, you’ll have a failure tag.
Wouldn’t this work?

Yes, this is absolutely required!

This flow would work, but I’d have to clean all “tags” not referencing “posts” in the DB at the end, which is an operation that could fail on its own and leave us with loose “tags”. Remember that the “tag” in this case, can only exist on two occasions: if a “post” is persisted and if a “post” has errors. It’s almost like a logging mechanism, if I create the log before and there’s nothing worth logging I have to remove it somehow.

BTW, It seems someone else already had this exact same problem: How to not rollback transaction on failed insert because of unique constraint - #8 by benwilson512. I conducted some tests using one of the post’s suggestions and the savepoint feature seems to work without much effort. The only difference is that no additional match on the case was required:

Repo.transaction(fn repo ->   
  case repo.insert(changeset, mode: :savepoint) do
    {:ok, post} ->
      IO.inspect(post, label: "POST")

      tag
      |> Ecto.Changeset.change(%{name: "success"})
      |> repo.update!()

      {:error, changeset} ->
        IO.inspect(changeset, label: "CHANGESET")

        tag
        |> Ecto.Changeset.change(%{name: "failed"})
        |> repo.update!()
    end
end)

This returns the tuple with the changeset containing the constraint errors and commits the transaction as expected:

12:31:46.616 [debug] QUERY OK db=1.6ms queue=0.2ms
INSERT INTO "tags" VALUES (DEFAULT) RETURNING "id" []

12:31:46.620 [debug] QUERY OK db=1.5ms queue=0.1ms
INSERT INTO "posts" ("title") VALUES ($1) RETURNING "id" ["123"]

12:31:46.620 [debug] QUERY OK db=0.1ms
begin []

12:31:46.622 [debug] QUERY ERROR db=1.5ms
INSERT INTO "posts" ("title") VALUES ($1) RETURNING "id" ["123"]
CHANGESET: #Ecto.Changeset<
  action: :insert,
  changes: %{title: "123"},
  errors: [
    title: {"has already been taken",
     [constraint: :unique, constraint_name: "posts_title_index"]}
  ],
  data: #Post<>,
  valid?: false
>

12:31:46.624 [debug] QUERY OK db=0.2ms
UPDATE "tags" SET "name" = $1 WHERE "id" = $2 ["failed", 1]

12:31:46.626 [debug] QUERY OK db=1.8ms
commit []

What happens if an exception is thrown when using mode: :savepoint? For instance could this happen:
1. insert succeeds
2. update fails
3. transaction goes to savepoint directly before the update
4. transaction completes successfully

edit: i didn’t notice the savepoint is per statement and not on the transaction. pls ignore.