Ecto - how to force a column to update?

DanielW · August 21, 2020, 5:24am

I have a process that loads a batch of records, sets their processed_at column to the current datetime, processes them, and then sets the column back to nil.

The problem is that when the records are first loaded, their processed_at is already nil, so Ecto sees no change and filters that column out of the update. To work around this I currently load the records again. The code looks something like this:

records = MyRecord.process_next(100)
# set processed_at to current datetime so they're not picked up by other queries:
MyRecord.started_processing!(records)
# reload the records otherwise I can't set column to nil:
record_ids = Enum.map(records, & &1.id)
records = MyRecord.get_by_ids(record_ids)
# process each record and set column to nil

How can I skip loading them a second time?
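For anyone landing here from the title, this is a minimal sketch of why the column gets dropped, run as a standalone script without a database. It assumes the update goes through a changeset; `Sketch.MyRecord` is a hypothetical stand-in schema, not the poster's real one. `Ecto.Changeset.change/2` discards a change whose value equals what the in-memory struct already holds, while `Ecto.Changeset.force_change/3` keeps it. The thread does not discuss `force_change/3`, so treat it as one possible option rather than the accepted answer.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule Sketch.MyRecord do
  use Ecto.Schema

  # Hypothetical stand-in for the real schema; only the relevant field is declared.
  schema "my_records" do
    field :processed_at, :utc_datetime
  end
end

# A struct as it looked when first loaded: processed_at is still nil in memory,
# even if the database row has since been updated by update_all.
stale = struct(Sketch.MyRecord, id: 1, processed_at: nil)

# change/2 compares against the in-memory value and drops the no-op change.
noop = Ecto.Changeset.change(stale, processed_at: nil)

# force_change/3 records the change even though the value looks unchanged.
forced =
  stale
  |> Ecto.Changeset.change()
  |> Ecto.Changeset.force_change(:processed_at, nil)

IO.inspect(noop.changes, label: "change/2")
IO.inspect(forced.changes, label: "force_change/3")
```

Passing the forced changeset to `Repo.update/1` should then include processed_at in the UPDATE, which would avoid reloading the records.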

mindok · August 21, 2020, 6:21am

What does your MyRecord.started_processing!(records) function look like?

Also, have you tried using Repo.update_all with a where clause on your list of record_ids to set the values to nil directly?

Finally, you may get some inspiration from tracking job progress here: https://github.com/sorentwo/oban/blob/master/lib/oban/query.ex

DanielW · August 21, 2020, 6:41am

It looks like this:

def started_processing!(records) do
  record_ids = list_ids(records)
  from(r in Schema.MyRecord, where: r.id in ^record_ids)
  |> Repo.update_all(set: [processed_at: DateTime.utc_now()])
end

No, I haven’t tried setting the values to nil with update_all. When I’m processing the records I iterate over them and update each one separately, so I was hoping to do the update during the iteration (if possible).

Thanks, I’ll have a look at that.

mindok · August 21, 2020, 7:18am

Makes sense, but the same approach would work: use Repo.update_all, just with a single id in the where clause.
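A rough sketch of what that could look like, mirroring started_processing!/1 above. The function name `finished_processing!/1` is hypothetical, and `Repo` and `Schema.MyRecord` are assumed to be the same aliases used in the existing module, so this only runs inside that application:

```elixir
defmodule MyRecord do
  import Ecto.Query, only: [from: 2]

  # Hypothetical counterpart to started_processing!/1 for a single record.
  # update_all writes straight to the database and bypasses changeset
  # change tracking, so the stale in-memory nil does not matter here.
  def finished_processing!(record) do
    from(r in Schema.MyRecord, where: r.id == ^record.id)
    |> Repo.update_all(set: [processed_at: nil])
  end
end
```

It could be called once per record at the end of each iteration. Note that `Repo.update_all` returns a `{count, nil}` tuple rather than the updated struct.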

DanielW · August 21, 2020, 10:12am

Thanks, that works.