I’m building an abstraction around Ecto in a toy project, and I’m wondering whether there are any reasons not to make executing each query within a Task the default behavior?
Any thoughts here?
Eh, the main reason I can think of is that it is another process spawn, and the work might otherwise finish faster than the overhead the spawn adds. But even that is pretty minor…
`Task.async` or `Task.start(_link)` (returning the result via `cast` or `call` to the spawning process)? My main beef with `Task.await` is that it blocks (so what’s the point?).
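A rough sketch of the non-blocking variant, for illustration only. It uses a plain `send`/`receive` rather than a GenServer `cast` or `call`, and `fake_query` is a hypothetical stand-in for a real query, not part of Ecto:

```elixir
# Hypothetical stand-in for a real query; sleeps to simulate latency.
fake_query = fn name ->
  Process.sleep(50)
  {name, [:row]}
end

parent = self()
ref = make_ref()

# Task.start/1 is fire-and-forget: it returns {:ok, pid} and nothing is awaited.
{:ok, _pid} =
  Task.start(fn ->
    # Deliver the result to the spawning process as an ordinary message.
    send(parent, {:query_result, ref, fake_query.(:users)})
  end)

# The caller can keep working here and pick the message up whenever it
# chooses. In a GenServer it would arrive in handle_info/2 instead.
result =
  receive do
    {:query_result, ^ref, value} -> value
  after
    1_000 -> :timeout
  end
```

The trade-off is that the caller now owns the bookkeeping (refs, timeouts, lost messages) that `Task.await` would otherwise handle.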
The other issue is what you plan or need to do when one of those tasks fails. When a process starts spawning other processes, it is prone to taking on “supervisor-y” responsibilities, which may actually be better taken care of in an actual supervisor.
Just my two cents.
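To make the failure question concrete, here is a minimal sketch that hands the tasks to a `Task.Supervisor` and uses `async_nolink`, so a failing task does not take the caller down with it. The task bodies and the `:db_down` exit reason are made up for the example:

```elixir
# Start a supervisor for the tasks (in an application this would normally
# live in the supervision tree rather than be started inline).
{:ok, sup} = Task.Supervisor.start_link()

# async_nolink: the caller monitors the task but is not linked to it.
ok_task = Task.Supervisor.async_nolink(sup, fn -> {:ok, [:row]} end)
bad_task = Task.Supervisor.async_nolink(sup, fn -> exit(:db_down) end)

# Turn each possible outcome into a plain value the caller can handle.
handle = fn task ->
  case Task.yield(task, 1_000) || Task.shutdown(task) do
    {:ok, value} -> value
    {:exit, reason} -> {:error, reason}
    nil -> {:error, :timeout}
  end
end

ok_result = handle.(ok_task)
bad_result = handle.(bad_task)
```

Here the caller decides what a failed query means (retry, fallback, error page) while the supervisor owns the process lifecycle.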
In fact, Ecto tries to do exactly that on its own. For example, when you preload associations, the preloading is done in parallel wherever possible.
The `Task.await` blocking is actually the main reason for the abstraction. Say I’ve got a page load that triggers 5 queries to render. I’m triggering the 5 queries in the controller but not calling `await` until the first point in the code where I actually need to use the result, which is sometimes much farther down in the view.
It will need to block at that point to render whatever data was coming back, but the goal is to wait until the last possible moment to actually do it.
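A minimal sketch of that "start early, await late" pattern. The queries are simulated with a hypothetical `fake_query` function and the controller/view split is only marked by comments, so none of this is actual Ecto or Phoenix code:

```elixir
# Hypothetical stand-in for a real query; sleeps to simulate latency.
fake_query = fn name, ms ->
  Process.sleep(ms)
  {name, [:row]}
end

started = System.monotonic_time(:millisecond)

# "Controller": kick off every query immediately, await nothing yet.
tasks = %{
  users: Task.async(fn -> fake_query.(:users, 100) end),
  posts: Task.async(fn -> fake_query.(:posts, 100) end),
  tags: Task.async(fn -> fake_query.(:tags, 100) end)
}

# Other work happens here while the queries are in flight.
Process.sleep(50)

# "View": block only at the first point each result is actually needed.
users = Task.await(tasks.users)
posts = Task.await(tasks.posts)
tags = Task.await(tasks.tags)

elapsed = System.monotonic_time(:millisecond) - started
```

Run serially, the simulated work would take roughly 350 ms; with the tasks overlapping it should take closer to 100 ms. One thing to keep in mind is that `Task.await` has to be called from the same process that called `Task.async`.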
I do this in a couple of places too. I wish I could stuff them into an Ecto.Multi and have them run in parallel, but as far as I’ve seen in the logs they run serially…
EDIT: Do note that copying the output between processes may have a sizable overhead, so I try to minimize it as much as possible to prevent too many copies from flying around.
Yes, in Ecto.Multi the queries are executed in the order they were added to the multi. Multi serves a slightly different purpose, though, since it’s really a wrapper for a transaction.
Anyways, Ecto is doing `Task.async`/`await` internally to execute preloading queries, so I think @brightball you are good with this approach too.
Again, not knowing the details of this particular use case: in some scenarios your argument may point to farming all 5 queries out to a single separate process which aggregates the result, hopefully reducing the output to be copied to the originator. The originator would only have to deal with 1 process instead of 5, and the decision whether to run the queries in parallel or not is deferred to that single process.
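One way to sketch the single aggregating process, assuming a hypothetical `fake_query` that returns many rows and an aggregation that only needs the row counts:

```elixir
# Hypothetical stand-in for a real query that returns a large result set.
fake_query = fn name ->
  Process.sleep(20)
  {name, Enum.to_list(1..1_000)}
end

names = [:users, :posts, :tags, :comments, :likes]

# The originator talks to exactly one process. Whether the queries inside
# run in parallel is that process's decision (here: Task.async_stream/3).
aggregate =
  Task.async(fn ->
    names
    |> Task.async_stream(fake_query, max_concurrency: 5)
    |> Map.new(fn {:ok, {name, rows}} -> {name, length(rows)} end)
  end)

# Only the small summary map is copied back to the originator,
# not the 5 x 1_000 rows.
summary = Task.await(aggregate)
```

The full rows are still copied from the inner tasks to the aggregator, so this mainly pays off when the aggregated result is much smaller than the raw output.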
There is another reason for starting a task and doing a blocking wait. As the task is a separate process, anything it does with its process (links, trapping exits, using the process dictionary, …) will not affect the calling process, so it is much safer. For that matter, it will not be affected by the calling process’s local settings either. It is also much easier and freer to crash when necessary without affecting the calling process.
This can be useful even if you just want to do a synchronous operation where you sit and wait. The Erlang compiler, and LFE for that matter, does this: the compilation is run in a separate process.
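A small sketch of that isolation, using the process dictionary and the `trap_exit` flag. The `:locale` key is just an illustrative name, and the sketch assumes the caller is not already trapping exits:

```elixir
# A caller-local setting stored in the caller's process dictionary.
Process.put(:locale, "en")

task =
  Task.async(fn ->
    # The task runs in its own process, so it does not see the caller's key...
    before_put = Process.get(:locale)
    # ...and whatever it changes stays local to the task.
    Process.put(:locale, "de")
    Process.flag(:trap_exit, true)
    {before_put, Process.get(:locale)}
  end)

# Blocking wait: synchronous from the caller's point of view.
inside = Task.await(task)

# The caller's own state is untouched.
outside = Process.get(:locale)
{:trap_exit, trapping?} = Process.info(self(), :trap_exit)
```

So even with an immediate `Task.await`, the work runs in a clean process and leaves nothing behind in the caller.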