Combining server-generated values with changsets

binarypaladin · May 8, 2023, 11:07pm

Okay, short story:

Let’s say I want to a user-defined SQL function or even a subquery to set a value on an insert or an update with Ecto.Repo combined with a changeset.

So, as a simple example, let’s just say I have some timestamp field that I want to set to NOW() in PostgreSQL rather than rely on the application server setting the time. I know I can do this with a fragment but… this is often done in conjunction with a pretty standard changeset cycle. If the changeset is valid, it’s pretty common to add some automatic values not set by the client.

I’m hoping there is some option I can add on Repo.insert or Repo.update that, combined with returning will result in what I am looking for without me having to do a bunch of destructuring and recombining.

al2o3cr · May 9, 2023, 12:12am

Sounds like you want what the read_after_writes option to Ecto.Schema.field does.

binarypaladin · May 9, 2023, 12:32am

This still doesn’t explain how to set the value to a fragment combined with a changeset.

read_after_writes is pretty much the same as always listing the field in the returning option.

Let’s say I have a field called voided and if it’s set there I want to also set voided_at to NOW(). Forget all the ways I could do this and just presume I want the application to say, “hey, during this update, set voided_at to NOW().” read_after_writes or returning will get the value back to me, but how do I set it?

sodapopcan · May 9, 2023, 8:20am

Ecto.Changeset.prepare_changes/2 possibly?

dimitarvp · May 9, 2023, 1:10pm

Before trying to give you a working solution I’ll ask you this:

Why do you need voided at all? It’s basically voided_at IS NOT NULL. It seems superfluous.

binarypaladin · May 9, 2023, 4:07pm

Dang. I thought that was going to be it. Nope.

After looking through a bunch of docs last night, I’m probably going to have to write my own Repo function to handle this.

binarypaladin · May 9, 2023, 4:24pm

You’re focusing too much on a contrived example. Here’s another one (also contrived). I want to be able to do something like this:

%ExampleSchema{id: 1, name: "Jane Smith", changed_name_at: nil}
|> change(name: "Jane J Smith")
|> put_sql_change(:changed_name_at, sql_fragment("NOW()"))
|> Repo.update(returning: [:changed_name_at])

If I wanted the date generated by the server, not the application, is there a way to do this either in a changeset or as an option in update (and similarly, with insert)?

dimitarvp · May 9, 2023, 4:52pm

Ah, so you need something similar to how updated_at behaves but not pertaining to the entire record, more like for individual fields.

LostKobrakai · May 9, 2023, 5:02pm

:autogenerate the functionality used for timestamps does compute the datetimes in elixir as well, not in the db.

dimitarvp · May 9, 2023, 5:07pm

Yep I’m aware, simply trying to understand the problem that OP is trying to solve.

kwando · May 9, 2023, 6:19pm

I do not know how to to this as you wish on the Elixir side… but I would probably go for a trigger and :read_after_write if I absolutely had to do it this way

sodapopcan · May 9, 2023, 6:28pm

I was also going to suggest a trigger. I am curious as to why your are so set on doing it on the db side. None of my business, of course, lol. If it’s not too late to convert your timestamp fields to utc_datetime_usec then it won’t matter if you set it in Elixir or on the db. But again, if you have a very good reason for doing it in the db then so be it! I’m just not sure of the cleanest way to do it. You could always use a Multi but that is two steps.

binarypaladin · May 9, 2023, 6:51pm

Triggers are typically how I would handle this and I don’t even use :read_after_write but manually specify :returning when I know I need it.

With that said, on a Ruby project I worked on a while back, there were times where the execution logic for a function took data outside the scope of the SQL row from the application for a calculation. A trigger won’t work in this case because the trigger is only going to deal with data in SQL.

I think there is still way too much focus on the example. Timestamps are just super simple to explain because NOW() is a function everyone understands and the datatype that comes with it is understood. I could have a function like… super_expensive_calculation_that_has_been_highly_optimized_for_sql() that perhaps I want to give data from the application to the SQL function.

Ruby Sequel allowed me to do this really easily.

As soon as you have to involve the database more explicitly, some of the “cleanliness” goes away but that’s also often the price to be paid when you’re dealing with a legacy schema or something that has to be highly optimized. It’s sounding like destructure the changeset and then generate my own queries if I want to do this is very much in my future. I was just hoping it wasn’t.

Multi wouldn’t work because there’s still no way that I am aware of inside multi to say, “Hey, on this step do some raw SQL.” If that were possible, I could do it on the single changeset.

I can wrap the operation in a transaction where I do one query via a changeset and then another “bare” query but that’s pretty inefficient. I don’t want to do this in two queries where it’s not possible.

sodapopcan · May 9, 2023, 7:15pm

There’s probably a much better way to do this, but a Multi would technically work if you stuck a Multi.run in there to grab the changed fields. After running the Multi.update or Multi.insert you can use a Multi.insert_all to run a “raw” Ecto query to update the changed fields using a fragment or use another Multi.run to perform a totally raw SQL query if you wanted with Repo.query. It’d be pretty dang janky, but is possible.

EDIT: Sorry, didn’t mean to focus on the whole “timestamp” thing, but like, you can run “raw” queries with Multi. Ideally through an update_all but you could do Repo.query if you really wanted to (but shouldn’t be necessary).

kwando · May 9, 2023, 7:35pm

Hmm, sounds like it will be generating your own queries then. Depending on what else you have to do in the same query it doesn’t have to be that bad I think

I haven’t checked but maybe there is a way to “get an ecto query” from a changeset that you then can manipulate before sending to the database.

sodapopcan · May 9, 2023, 7:36pm

I was looking for this same thing! Didn’t find anything, though could probably mash together an Ecto query with an changeset.changes fairly cleanly… maybe?

binarypaladin · May 9, 2023, 7:38pm

Yeah.

The problem with Ecto—at least, something I have observed—is there are some “impure” ways of dealing with database-specific issues. They’re cool because something like this is awesome:

Repo.insert!(
  changeset,
  conflict_target: [:col_a, :col_b],
  on_conflict: [inc: [total: num]],
  returning: [:total]
)

This is a great upsert situation. I think the best place to handle what I want would be similar—not on the changeset, but actually on the Repo functions since the Repo itself has more awareness, as it were, to the actual storage medium.

The problem is that I feel like they become specific rather than more generic. I dunno if I am explaining that well.

This would get me like… 90% of what I want actually. With that said, looking through the actual source code for Repo, it did not look trivial and I don’t know that there’s an obvious function for this. There is a lot of stuff being handled by changesets when doing non-trivial CRUD, that’s what I gathered from that exercise.

And… this is the fallback. When I say destructuring, querying, and putting it all back together… this is it.

LostKobrakai · May 9, 2023, 7:49pm

There isn’t because a changeset doesn’t map to a single query. Association handling means there might be a multitude of queries, which might even affect each other (e.g. create parent before the child record).

kwando · May 9, 2023, 8:00pm

Right, I almost never use associations in changesets so I didn’t think about that