I have a process that loads a batch of records, sets their processed_at column to the current datetime, processes them, and then sets the column back to nil.
The problem is that when the records are first loaded their processed_at is already nil, so Ecto sees no change and filters that column out of the update. Currently, to overcome this, I load the records again. The code looks something like this:
records = MyRecord.process_next(100)
# set processed_at to current datetime so they're not picked up by other queries:
MyRecord.started_processing!(records)
# reload the records otherwise I can't set column to nil:
record_ids = records |> Enum.map(&(&1.id))
records = MyRecord.get_by_ids(record_ids)
# process each record and set column to nil
How can I skip loading them a second time?
What does your MyRecord.started_processing!(records) function look like?
Also, have you tried using Repo.update_all with a where clause using your list of record_ids to directly set the values to nil?
Finally, you may get some inspiration from tracking job progress here: https://github.com/sorentwo/oban/blob/master/lib/oban/query.ex
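To make the update_all suggestion above concrete, here is a minimal sketch. It assumes `Repo` and `Schema.MyRecord` resolve to the same repo and schema used elsewhere in this thread; the module name `MyRecordBatch` and the function name `finished_processing!/1` are made up for illustration.

```elixir
defmodule MyRecordBatch do
  # Sketch only: assumes an Ecto app where `Repo` and `Schema.MyRecord`
  # are the repo and schema from this thread.
  import Ecto.Query

  # Hypothetical helper: clears processed_at for all given records in one query.
  def finished_processing!(records) do
    record_ids = Enum.map(records, & &1.id)

    from(r in Schema.MyRecord, where: r.id in ^record_ids)
    |> Repo.update_all(set: [processed_at: nil])
  end
end
```

Because update_all writes straight to the database without building a changeset, it never compares the new value against the stale nil in the loaded structs, so there should be no need to reload the records first.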
It looks like this:
def started_processing!(records) do
  record_ids = list_ids(records)

  from(r in Schema.MyRecord, where: r.id in ^record_ids)
  |> Repo.update_all(set: [processed_at: DateTime.utc_now()])
end
No, I haven’t tried setting the values to nil with update_all. When I’m processing the records I iterate over them and update each one separately, so I was hoping to do the update during the iteration (if possible).
Thanks, I’ll have a look at that.
Makes sense, but the same approach would work: using Repo.update_all, just with a single id.
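A rough sketch of the per-record variant, again assuming `Repo` and `Schema.MyRecord` are the names from this thread. `ProcessLoop`, `run/1`, and `process/1` are placeholders for whatever the real iteration looks like.

```elixir
defmodule ProcessLoop do
  # Sketch only: assumes an Ecto app where `Repo` and `Schema.MyRecord`
  # are the repo and schema from this thread.
  import Ecto.Query

  def run(records) do
    Enum.each(records, fn %{id: id} = record ->
      process(record)

      # Clear processed_at for just this record, without a changeset.
      from(r in Schema.MyRecord, where: r.id == ^id)
      |> Repo.update_all(set: [processed_at: nil])
    end)
  end

  # Placeholder for the real per-record work.
  defp process(_record), do: :ok
end
```

If you would rather keep using Repo.update with a changeset inside the loop, Ecto.Changeset.force_change/3 might also be worth a look, since it records a change even when the value matches what is in the struct. I have not tried it against this exact setup.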