I think it needs to be something with transaction given that’s what people will most likely look for.
To me the contenders are:
- transact
- transaction_with
- with_transaction
This has been a rollercoaster of a thread that I’ve been silently but diligently following from the outset. It has also caused me a rollercoaster of emotions.
I do like multis but admit that I do find them clunky. When it comes to boundaries, I’ve never felt good about where they belong. They are side-effect-free so I want to put them in my schemas (or wherever one would normally store queries or changesets), but then you end up with the ol’ “everything is happening somewhere else” problem.
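For context, a minimal sketch of the two styles being weighed here (the changeset helpers are illustrative, not taken from any particular post):
Ecto.Multi.new()
|> Ecto.Multi.insert(:user, user_changeset(params))
|> Ecto.Multi.insert(:log, fn %{user: user} -> log_changeset(user) end)
|> Repo.transaction()

# `with` inside a plain transaction, the pattern under discussion
Repo.transaction(fn ->
  with {:ok, user} <- Repo.insert(user_changeset(params)),
       {:ok, _log} <- Repo.insert(log_changeset(user)) do
    user
  else
    {:error, reason} -> Repo.rollback(reason)
  end
end)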
TL;DR, I think being able to use context functions along with `with` is great and am now excited at the prospect of this being added to Ecto.
As far as the name goes, including `with` in the name feels a little clunky to me since you get `with` twice… though it’s not that big of a deal.
Spitball ideas: begin/1 or commit/1 to line up with other db terminology? Likely not one of those, but something along those lines…
Just throwing this out there, what if transaction/1 could also take a block?
Repo.transaction do
  with {:ok, _} <- # ...
end
do … end is sugar for Repo.transaction(do: …), so the inner code would run in the same context, but we need it to run in a lambda.
Oh right it’s not a macro, d’oh.
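For what it’s worth, a block form could be layered on top with a small macro. This is purely a sketch (MyApp.Repo and the transact name are made up here), not an Ecto API:
defmodule MyApp.Transact do
  # Hypothetical helper: rewrites `transact do ... end` into the lambda form
  # that Repo.transaction/1 actually expects.
  defmacro transact(do: block) do
    quote do
      MyApp.Repo.transaction(fn -> unquote(block) end)
    end
  end
end
With import MyApp.Transact, the do-block then runs inside the lambda, which sidesteps the “it’s not a macro” issue above.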
For what little my opinion matters, I actually like transact. transaction is a noun, and logically transact is the verb form of that. I like using verbs for function names.
I do not have my code on this computer, but to describe it quickly: we have some jobs (a bit similar to an Oban job) that are materialized as rows in the database. They have a status: “pending”, “started”, a bunch of other intermediate statuses, and “done”. When processing the jobs we have several milestones, each corresponding to a status change.
So: if the job is not in the expected status, an expect_status(job, :some_status) call will return an error tuple and we stop there (the status can be updated to “cancelled” from outside the current process). If the status is ok, we store different things in the database. Again, if some changeset function returns an error, we stop there. Finally we update the job status, which uses a changeset too. If everything returned ok, then the transaction is committed. If we returned an error at some point, it is rolled back and we return the error to the upper layer.
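A rough sketch of that flow with the transact pattern (the function names and statuses below are illustrative stand-ins, and Repo.transact/1 is assumed to commit on {:ok, _} and roll back on {:error, _}):
def process(job, params) do
  Repo.transact(fn ->
    # expect_status/2, store_results/2 and set_status/2 are placeholders for
    # the helpers described above; their exact return shapes are assumed.
    with :ok <- expect_status(job, :started),
         {:ok, _results} <- store_results(job, params),
         {:ok, job} <- set_status(job, :done) do
      {:ok, job}
    end
  end)
end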
I’ve just opened a conversation in the mailing list to discuss whether the Ecto team would be in favour of upstreaming Repo.transact into Ecto itself.
I’ve quoted your thoughts on the topic as I think that they are a great summary of the benefits provided by this pattern.
If this is the consensus, I wonder whether adding an option to Repo.transaction that alters its default behavior only when passed could be considered? Something like:
Repo.transaction(fn ->
  with {:ok, user} <- Accounts.create_user(params),
       {:ok, _log} <- Logs.log_action(:user_registered, user),
       {:ok, _job} <- Mailer.enqueue_email_confirmation(user) do
    {:ok, user}
  end
end, normalize_return: true)
I personally already dislike the existing duality of transaction, so I’m not in favor of making it even more polymorphic.
Great pattern and conversation! There are some behemoth Ecto.Multi chains in the code I’m working with at the moment, so this is a very compelling alternative.
Tough naming challenge. Is Repo.unit_of_work/1 overloaded, or just plain too fussy? I almost love Repo.with_transaction/1, but of course the callback doesn’t have to contain a `with` statement, so it feels a little inaccurate. But yeah, it’s pretty good.
I really like the suggestion Repo.atomic/1. Otherwise, if I were to implement this in a project, I’d be tempted to namespace it and call it Repo.Transaction.execute/1 or run/1, something like that.
They weren’t lying, naming things IS hard.
Saša’s article made me wish that every function in Ecto.Repo were overridable. I’m not really a fan of fetch/transact, so I would prefer to override one/get/get_by/transaction instead.
I get that this is largely incompatible with the way things work, but I wish there were some solution that basically allowed me to override parts of Ecto.Repo but keep everything else. The only way I could figure out how to do it was to create a wrapper module.
Wrapping has its problems though: I have to defdelegate all non-overridden functions to the BaseRepo module and reference BaseRepo for configuration.
I wish there were a way to tell Ecto that I’m wrapping a repo, so that whenever it sees WrappedRepo it would actually use BaseRepo for config and the “private” functions (e.g. Ecto.Repo.Schema.insert(...)).
It’s not really that big of a deal, but Ecto letting us define custom wrappers around the repo would really improve code ergonomics. I’m fine with handling the incompatibility of external libraries that expect the repo to work like a normal, non-overridden repo.
If the code police is reading this, I plead the fifth. If you have any ideas about how to do what I’m trying to do, I definitely want to hear them.
defmodule DM.Repo do
  alias DM.BaseRepo

  def one(queryable, opts \\ []) do
    case BaseRepo.one(queryable, opts) do
      nil -> {:error, :not_found}
      record -> {:ok, record}
    end
  end

  def get(queryable, id, opts \\ []) do
    case BaseRepo.get(queryable, id, opts) do
      nil -> {:error, :not_found}
      record -> {:ok, record}
    end
  end

  def get_by(queryable, clauses, opts \\ []) do
    case BaseRepo.get_by(queryable, clauses, opts) do
      nil -> {:error, :not_found}
      record -> {:ok, record}
    end
  end

  def transaction(fun, opts \\ []) when is_function(fun) do
    BaseRepo.transaction(
      fn repo ->
        result =
          case Function.info(fun, :arity) do
            {:arity, 0} -> fun.()
            {:arity, 1} -> fun.(repo)
          end

        case result do
          {:ok, result} -> result
          {:error, reason} -> repo.rollback(reason)
        end
      end,
      opts
    )
  end

  # query API
  defdelegate aggregate(queryable, aggregate, opts \\ []), to: BaseRepo
  defdelegate aggregate(queryable, aggregate, field, opts), to: BaseRepo
  defdelegate all(queryable, opts \\ []), to: BaseRepo
  defdelegate delete_all(queryable, opts \\ []), to: BaseRepo
  defdelegate exists?(queryable, opts \\ []), to: BaseRepo
  defdelegate get!(queryable, id, opts \\ []), to: BaseRepo
  defdelegate get_by!(queryable, clauses, opts \\ []), to: BaseRepo
  defdelegate one!(queryable, opts \\ []), to: BaseRepo
  defdelegate stream(queryable, opts \\ []), to: BaseRepo
  defdelegate update_all(queryable, updates, opts \\ []), to: BaseRepo

  # schema API
  defdelegate delete(struct_or_changeset, opts \\ []), to: BaseRepo
  defdelegate delete!(struct_or_changeset, opts \\ []), to: BaseRepo
  defdelegate insert(struct_or_changeset, opts \\ []), to: BaseRepo
  defdelegate insert!(struct_or_changeset, opts \\ []), to: BaseRepo
  defdelegate insert_all(schema_or_source, entries_or_query, opts \\ []), to: BaseRepo
  defdelegate insert_or_update(changeset, opts \\ []), to: BaseRepo
  defdelegate insert_or_update!(changeset, opts \\ []), to: BaseRepo
  defdelegate load(schema_or_map, data), to: BaseRepo
  defdelegate preload(structs_or_struct_or_nil, preloads, opts \\ []), to: BaseRepo
  defdelegate reload(struct_or_structs, opts \\ []), to: BaseRepo
  defdelegate reload!(struct_or_structs, opts \\ []), to: BaseRepo
  defdelegate update(changeset, opts \\ []), to: BaseRepo
  defdelegate update!(changeset, opts \\ []), to: BaseRepo

  # transaction API
  defdelegate checked_out?(), to: BaseRepo
  defdelegate checkout(function, opts \\ []), to: BaseRepo
  defdelegate in_transaction?(), to: BaseRepo
  defdelegate rollback(value), to: BaseRepo

  # runtime API
  defdelegate __adapter__(), to: BaseRepo
  defdelegate get_dynamic_repo(), to: BaseRepo
  defdelegate put_dynamic_repo(name_or_pid), to: BaseRepo
  defdelegate start_link(opts \\ []), to: BaseRepo
  defdelegate stop(timeout \\ 5000), to: BaseRepo
  defdelegate disconnect_all(interval, opts \\ []), to: BaseRepo
  defdelegate explain(type, query, opts \\ []), to: BaseRepo
  defdelegate query(sql, params, opts \\ []), to: BaseRepo
  defdelegate query!(sql, params, opts \\ []), to: BaseRepo
  defdelegate query_many(sql, params, opts \\ []), to: BaseRepo
  defdelegate query_many!(sql, params, opts \\ []), to: BaseRepo
  defdelegate to_sql(type, query), to: BaseRepo
end

defmodule DM.BaseRepo do
  use Ecto.Repo,
    otp_app: :dm,
    adapter: Ecto.Adapters.Postgres
end
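For illustration, a hypothetical call site for the wrapper above (DM.User, user_id and attrs are made up for the example, not from the actual project):
DM.Repo.transaction(fn ->
  # get/3 and update/2 both return tagged tuples here, so they compose in `with`.
  with {:ok, user} <- DM.Repo.get(DM.User, user_id),
       {:ok, user} <- DM.Repo.update(DM.User.changeset(user, attrs)) do
    {:ok, user}
  end
end)
#=> {:ok, %DM.User{}} | {:error, :not_found} | {:error, %Ecto.Changeset{}}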
You can overwrite or extend repo-generated functions, but doing so will be very confusing to the next developer working on that project.
(I’m kind of going off-topic now, sorry)
The way to overwrite is a little too workaroundy for me -
# after `use Ecto.Repo, ...`
warning: this clause for get/3 cannot match because a previous clause at line 6 always matches
lib/app/repo.ex:46
# before `use Ecto.Repo, ...`
warning: this clause for get/3 cannot match because a previous clause at line 6 always matches
lib/app/repo.ex:20
(and the other warnings you get when functions aren’t grouped together)
If accidentally defined after `use`, the function clause won’t match.
This did remind me that it’s just a behaviour, but still pretty crunchy to have to redefine everything. Still workable though.
I think these changes to the repo wouldn’t be too surprising since it would be ubiquitous in the codebase.
Don’t do that, because it’s changing the accepted behavior of well-known functions. People will find this very confusing. Instead, define new functions, such as Repo.fetch, Repo.fetch_one, Repo.fetch_by…
Btw, the implementation you provided has a subtle issue. If the queryable selects an existing nil (e.g. from foo in Foo, select: foo.bar, where the row exists but the value of bar is nil), these functions would return {:error, :not_found}, which is incorrect.
What you want to do instead is something like:
case Repo.all(queryable, opts) do
  [record] -> {:ok, record}
  [] -> :error
  other -> raise Ecto.MultipleResultsError, queryable: queryable, count: length(other)
end
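For example, wired into the repo module as one of the suggested new functions (a sketch; the module name and otp_app are placeholders):
defmodule App.Repo do
  use Ecto.Repo, otp_app: :app, adapter: Ecto.Adapters.Postgres

  # A new function alongside the generated ones, rather than overriding one/2.
  def fetch_one(queryable, opts \\ []) do
    case all(queryable, opts) do
      [record] -> {:ok, record}
      [] -> :error
      other -> raise Ecto.MultipleResultsError, queryable: queryable, count: length(other)
    end
  end
end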
Yes, the API was designed to mirror databases: if you rollback, you immediately return to the state before the transaction. I am sure it was also designed before `with` was added to the language.
There is still one benefit of Ecto.Multi in that it immediately aborts in case of errors. This is important because if an insert fails, the transaction is aborted, and you can’t do anything else useful in there. This means that if someone were to write this:
Repo.transaction(fn ->
  with {:ok, user} <- insert_user(),
       {:ok, company} <- insert_company() do
    ...
  else
    :error -> # do another query
  end
end)
Whatever you do on the else branch won’t work. But this is already a problem with Repo.transaction, just not with Ecto.Multi.
I can understand wanting to override one or transaction, but overriding get with fetch semantics would be very confusing. get in Repo behaves consistently with all the gets in the language: it returns the value, not ok/error tuples.
I am not sure if I agree with this one; it feels subjective.
For me, creating a struct and using it as a vessel to gather changes before doing something with them feels right, even if this is not exactly what happens in technical terms. I like to think the same way about Plug pipelines in Phoenix.
On top of that, these are the semantics you will see in most if not all transactional RDBMSes: you open a transaction, you do stuff, and either all of it passed or none of it did.
If you play with an RDBMS and transactions, you really need to be aware of this.
I understand the argument that it’s tempting to cover external functionality in a “native” way, with the primitives the language provides out of the box (I actually preach it quite often), but this won’t and can’t replace awareness of RDBMS semantics. In the best case, if you are fairly successful, you will build an abstraction that has a mental cost to it. Is it worth it? What’s the gain? I am not sure.
An anecdote: I witnessed a whole generation of enterprise Java developers taking a shortcut (for multiple reasons) by using ORMs as a replacement for, and not an addition to, the RDBMS, and the results were often tragic.
(As a side note: my default reflex is to trust you more than my own brain, so I will be getting back to what you wrote in this thread whenever I deal with the subject in anger.)
Which article? Was looking through this thread and can’t find it.
More to the point, don’t get, one, and all return nil (or [nil]) because a query that returns 0 results is not an error? It’s a successful query that returned zero rows, represented by nil. I’ve always operated under the idea that :ok/:error tuples describe the result of some side effect not under our control, or of a pure function that receives data we don’t control, like passing user-generated HTML to Floki.parse_document.
It’s linked inside the article I posted in the OP (changed medium to scribe):
https://scribe.rip/very-big-things/towards-maintainable-elixir-the-core-and-the-interface-c267f0da43
Double thank you for introducing me to scribe, holy heck!