Create oban jobs via SQL triggers

lud · March 13, 2024, 8:47am

Hello,

We would like to publish events in a transactional fashion whenever a database entity is inserted or updated. That entity is managed with Ecto but we do not “own” the codebase and cannot be sure that all inserts and update will also transactionnaly insert an Oban job.

So we would like to try handling that at the database level with triggers.

I am not sure this is a good idea but it does not seem to hard. I would like to know what fields should be set with what values. The jobs would be performed by a fixed worker that would then dispatch to actual implementations. And we do not care about uniqueness, as those are events it’s fine to have the same event multiple times if it reflects what happens on the database.

What we want is basically call our worker with limited args: the table name of the entity (like “users”), the primary key of the entity, and the type of event “inserted” | “updated” | “deleted”.

I guess we should insert the following values:

state         "available"
queue         "our_custom_queue"
worker        "Our.Custom.Worker"
args          {"table": "users", "pk": "46cf2e3e-e115-11ee-a159-9be783b5b70b", "action": "updated"}
meta          {}
tags          []
errors        []    
attempt       0
attempted_by  NULL
max_attempts  some integer
priority      some integer
attempted_at  NULL
cancelled_at  NULL
completed_at  NULL
discarded_at  NULL
inserted_at   field-compatible now()
scheduled_at  NULL

We also want to start Oban with a custom name, to not mess with Oban used in that codebase, but I do not see the namespace reflected in the job data. I guess that is not a problem as long as we use our custom queue name.

Would that be ok? Or is it irrealistic somehow?

Thank you.

sorentwo · March 13, 2024, 10:22pm

Coincidentally somebody asked about this in Slack today as well. The answer is it’s possible, and not that difficult because most columns have defaults.

You can see an example of inserting jobs from another language (Ruby) in this official guide: Migrating from Other Languages — Oban v2.17.6

lud · March 14, 2024, 10:12am

Alright, perfect !

Thank you @sorentwo

dimitarvp · March 14, 2024, 12:15pm

While it’s possible as @sorentwo said, I am not sure you need a trigger for this – why not simply import jobs based on a cursor (say, a datetime timestamp denoting “when was the last time I imported jobs?”)?

Furthermore, it seems like you might actually be looking for Postgres CDC; your scenario looks like that and not actually about importing jobs.

lud · March 14, 2024, 1:04pm

Very interesting! We do not want to tell the other team to change their PG configuration, as it is not our codebase, but yeah this is basically what we need.

why not simply import jobs based on a cursor

That was my solution but we want a soft-immediate trigger because the data represent people and thay may decide to turn off their profile. In that case we want to immediately publish the event.

So basically the trigger solution is not optimal but covers what we want.

dimitarvp · March 15, 2024, 12:47am

Huh? So what if they turn it off? If that does not lead to a DB deletion then you can just as easily say “give me all users created since the last time I checked” (say, 1 hour) and just work with that? Though I don’t know the specifics, of course.

lud · March 15, 2024, 1:36am

Well we haven’t chosen a solution yet, I am not sure if we are going with the triggers, and that soft-immediate thing is not a hard requirement, but still… If you want to turn off visibility on your profile for whatever privacy reason, you want that done in a matter of seconds, mot minutes, not hours.

So even if we do not use triggers and rely on a date-based polling mechanism, we would at least send a message to a process to say “hey, poll right now”.

Oban is great as it covers a lot of our needs. And is well integrated in the database. Actually I think that it would be great if @sorentwo made an SQL API to Oban

Anyway, I am convinced that we need decent reactivity here (and I don’t really like things that poll continuously. I am aware that Oban may poll (but maybe not in all setups?) but at list I’d rather feed it with some actual work instead of polling elsewhere in parallel) but I should indeed check if this is an acutal requirement with my team.

dimitarvp · March 15, 2024, 1:40am

I still don’t understand why somebody turning off visibility of their profile should exclude them from this “DB scraping” that you’re doing but again, I don’t know your business needs.

My point was that you should just store a cursor value somewhere and then scan the tables with inserted_at > last_time_we_checked and slurp everything and work with it.

But if that’s not an option for privacy reasons then triggers (or a full-blown SQL API) seems like the best catch-all idea, yep.