Sorry for the long post. While building out a job queue using Oban, I have found myself wanting job messages to include captured functions. Certain variables are known at the time the job message is created, so it is possible to do something like the following:
    job_args = %{
      # term_to_binary/1 serializes the closure; Base.encode64/1 makes it JSON-safe
      captured_fn:
        (fn -> MyModule.something(input, opts) end)
        |> :erlang.term_to_binary()
        |> Base.encode64()
    }
Serialization using :erlang.term_to_binary/1 and Base.encode64/1 is required so that Elixir functions/structs/atoms/etc. can be safely inserted into the database as an Oban job, because the args are JSON encoded.
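To make the round trip concrete, here is a minimal sketch of both directions. `FunCodec` and `MyModule.capture/2` are hypothetical names invented for illustration, `MyModule.something/2` is a stand-in for the real function, and none of this is Oban API.

```elixir
defmodule FunCodec do
  # Hypothetical helper, not part of Oban.

  # Serialize any term (including a closure) to a Base64 string that survives JSON.
  def encode(term) do
    term
    |> :erlang.term_to_binary()
    |> Base.encode64()
  end

  # Reverse the two steps. Only call this on data your own system wrote:
  # decoding a fun and then calling it runs whatever code it refers to.
  def decode(encoded) do
    encoded
    |> Base.decode64!()
    |> :erlang.binary_to_term()
  end
end

defmodule MyModule do
  # Stand-in for the real function from the post.
  def something(input, opts), do: {input, opts}

  # Build the closure inside a compiled module so it refers to this module's code.
  def capture(input, opts), do: fn -> something(input, opts) end
end

encoded = MyModule.capture("data", retries: 3) |> FunCodec.encode()
restored = FunCodec.decode(encoded)
result = restored.()
IO.inspect(result)
```

One caveat, as far as I understand it: a serialized anonymous function refers to the specific compiled version of the module that created it, so a job enqueued before a deploy may fail to run if that module was recompiled in the meantime.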
I understand that if other languages had to create job records in the database, they would have a hard time producing such a message (anyone feel like reverse-engineering :erlang.term_to_binary/1 in, say, Python?). Likewise, any other system that had to read job data encoded this way would face a similar pain.
However, the alternative of sending args that identify the module, function, and function arguments isn't much better:
    job_args = %{
      module: MyModule,
      fun: :something,
      fun_args: [input, opts]
    }
The module name and the function atom convert to strings and can be converted back with a little tweaking, but the opts... those could be nearly anything... so... what to do?
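For what it's worth, here is a rough sketch of how the module/function/args variant could be made to survive JSON. `MfaArgs` is a hypothetical helper and `MyModule.something/2` is a stand-in. It assumes the arguments are plain data, and it still encodes them with term_to_binary, so it remains Elixir-only on both ends.

```elixir
defmodule MyModule do
  # Stand-in for the real function from the post.
  def something(input, opts), do: {input, Keyword.get(opts, :retries, 0)}
end

defmodule MfaArgs do
  # Hypothetical helper: builds a JSON-friendly map of plain strings.
  def build(module, fun, fun_args)
      when is_atom(module) and is_atom(fun) and is_list(fun_args) do
    %{
      "module" => Atom.to_string(module),
      "fun" => Atom.to_string(fun),
      # The arguments are plain data (no funs), but still Elixir-only once encoded.
      "fun_args" => fun_args |> :erlang.term_to_binary() |> Base.encode64()
    }
  end

  # Rebuilds the call. String.to_existing_atom/1 refuses names the VM has never
  # seen, so arbitrary strings from the database cannot create new atoms.
  def run(%{"module" => module, "fun" => fun, "fun_args" => encoded}) do
    args = encoded |> Base.decode64!() |> :erlang.binary_to_term([:safe])
    apply(String.to_existing_atom(module), String.to_existing_atom(fun), args)
  end
end

args = MfaArgs.build(MyModule, :something, ["data", [retries: 3]])
result = MfaArgs.run(args)
IO.inspect(result)
```

Since `apply/3` will call whatever the stored strings name, a real worker would probably also want to check the module against a list of permitted modules before running anything.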
Having a worker that can run any function you throw at it is pretty tempting, and I think that flexibility may outweigh the cons of requiring Elixir to be on both ends of the pipeline.
If this wide-open flexibility of carte-blanche captured functions is really an anti-pattern, then the only other way I can think of to structure the worker is to have it operate on messages like this:
    job_args = %{
      type: "something",
      input: input,
      opts: opts
    }
and then in the worker it could do something like:
    case type do
      "something" -> MyModule.something(input, opts)
    end
That is, the worker would need to know in advance which possibilities to expect. In practice, there might only be a couple dozen.
However, even this approach would still fail the JSON encoding if the input were a struct or when the options were a keyword list.
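A sketch of what that explicit variant might look like, including one possible way around the keyword-list problem: convert the options to a string-keyed map before enqueueing and rebuild the keyword list from a fixed set of known keys in the worker. `DispatchWorker`, its function names, and the allowed option keys are all made up for illustration. In a real Oban worker the `run/1` logic would presumably live in `perform/1`.

```elixir
defmodule MyModule do
  # Stand-in for the real function from the post.
  def something(input, opts), do: {input, opts}
end

defmodule DispatchWorker do
  # Hypothetical module; the option names below are assumptions.

  # Options this worker is willing to accept, as a lookup from JSON key to atom.
  @allowed_opts %{"retries" => :retries, "timeout" => :timeout}

  # Before enqueueing: turn the keyword list into a string-keyed map JSON can hold.
  def build_args(type, input, opts) when is_binary(type) and is_list(opts) do
    %{
      "type" => type,
      "input" => input,
      "opts" => Map.new(opts, fn {key, value} -> {Atom.to_string(key), value} end)
    }
  end

  # After decoding: dispatch on the type string and rebuild the keyword list.
  def run(%{"type" => "something", "input" => input, "opts" => opts}) do
    MyModule.something(input, to_keyword(opts))
  end

  def run(%{"type" => other}), do: {:error, {:unknown_type, other}}

  # Unknown option keys are dropped rather than converted to atoms.
  defp to_keyword(opts) do
    for {key, value} <- opts, Map.has_key?(@allowed_opts, key) do
      {Map.fetch!(@allowed_opts, key), value}
    end
  end
end

job_args = DispatchWorker.build_args("something", "data", retries: 3)
result = DispatchWorker.run(job_args)
unknown = DispatchWorker.run(%{"type" => "nope"})
IO.inspect({result, unknown})
```

A struct input would need the same kind of explicit conversion to a plain map on the way in and back to a struct on the way out, which is exactly the extra bookkeeping that makes the captured-function approach look attractive.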
I'm hoping someone can shed some light on this problem; maybe I'm not thinking about this the correct way. I understand that serializing certain things (pids or refs) is asking for trouble, but structs, modules, atoms, and functions seem pretty safe.
Thank you in advance for your thoughts!