Elixir task timeout pitfall

minhajuddin · October 31, 2016, 12:01pm

I have written a blog post about a common misunderstanding about tasks.

Would love your input.

benwilson512 · October 31, 2016, 12:35pm

“it should show you that running in parallel with timeouts is not just a Task.await away.”

That isn’t what this shows at all. This shows that if you want a timeout to apply to a group, you should write it to apply to a group. There’s even a function for it in the standard library: http://elixir-lang.org/docs/master/elixir/Task.html#yield_many/2

benwilson512 · October 31, 2016, 12:37pm

In fact, this actually perfectly demonstrates parallel execution at work. The total amount of sleeping time called is 49.5 seconds!

1..10 |> Enum.map(&(&1 * 900)) |> Enum.sum

Despite this, the total wait time is only about as long as the longest item: 9005.466 ms (not the 10 seconds in the article).
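A scaled-down sketch of that measurement, assuming each task simply sleeps for i * 90 ms (instead of i * 900 ms) so it finishes quickly. The workload and the variable names are illustrative and not taken from the blog post:

```elixir
# Each task sleeps i * 90 ms and returns i (stand-in workload).
{elapsed_us, results} =
  :timer.tc(fn ->
    1..10
    |> Enum.map(fn i ->
      Task.async(fn ->
        Process.sleep(i * 90)
        i
      end)
    end)
    |> Enum.map(fn task -> Task.await(task, 5_000) end)
  end)

# Sum of all the sleeps, as if they had run one after another.
total_sleep_ms = 1..10 |> Enum.map(&(&1 * 90)) |> Enum.sum()
elapsed_ms = div(elapsed_us, 1000)

IO.puts("slept #{total_sleep_ms} ms in total, waited about #{elapsed_ms} ms")
```

The wall-clock time should land near the longest single sleep (900 ms here), well below the 4950 ms sum, because the tasks sleep concurrently.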

minhajuddin · October 31, 2016, 12:44pm

timeout = 5000
urls = 1..10 |> Enum.to_list

urls
|> Enum.map(fn url -> Task.async(fn -> Process.sleep(url * 900) end) end) # stand-in workload: sleep url * 900 ms
|> Enum.map(fn task -> Task.await(task, timeout) end)

If we take the above code as an example, aren’t we setting different timeouts for different tasks? Is that what a reader would expect?

benwilson512 · October 31, 2016, 12:51pm

No, you’re setting the same timeout for each task. However, each call to Task.await is consecutive. Thus, the first task is given 5 seconds to time out. Then the next task is given 5 seconds to time out, and so forth. Every task is given 5 seconds. This is absolutely expected because the Task.await call is happening inside a loop. It will happen for each item independently, because that’s how loops work.

If you want one timeout for the entire group you have to do something different.

minhajuddin · October 31, 2016, 12:53pm

The effective timeout for the first task is 5 seconds, for the second task up to 10 seconds, and so on; the last task could have up to 50 seconds before being timed out, because while the current process is executing the first Task.await, the other processes are still running and happily doing their computation.
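A small sketch of that effect, assuming a stand-in workload where task i sleeps i * 100 ms and every await gets a 250 ms timeout (all numbers are illustrative). Task 6 runs for about 600 ms, well past 250 ms, yet its await should not time out, because its clock only starts once the previous awaits have returned:

```elixir
timeout = 250

# Start all tasks up front; task i sleeps i * 100 ms and returns i.
tasks =
  Enum.map(1..6, fn i ->
    Task.async(fn ->
      Process.sleep(i * 100)
      i
    end)
  end)

started = System.monotonic_time(:millisecond)

results =
  Enum.map(tasks, fn task ->
    # Time at which this particular await begins, relative to the start.
    await_began = System.monotonic_time(:millisecond) - started
    value = Task.await(task, timeout)
    IO.puts("task #{value}: await began at ~#{await_began} ms, deadline ~#{await_began + timeout} ms")
    value
  end)
```

Each printed deadline moves later than the one before it, which is the growing effective timeout described above.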

benwilson512 · October 31, 2016, 12:56pm

Look, suppose you wrote the following code:

task = Task.async(fn -> Process.sleep(:infinity) end)

Process.sleep(5_000)
Task.await(task, 5_000)

How long before it times out? 10 seconds of course. But this is obvious and expected. This is exactly what you’re doing by making the Task.await calls consecutive. It’s just that instead of sleeping in the main process you’re waiting on a different task. Task.await is blocking, this is expected.

benwilson512 · October 31, 2016, 12:58pm

Put another way, Elixir functions aren’t gonna know about anything not passed to them directly. Thus if you want ALL tasks to have the same timeout, you’re gonna need to pass ALL tasks to some function that can handle dealing with them. If you only look at one task at a time, you’re gonna get that behaviour.
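One way this could look with Task.yield_many/2 is sketched below, assuming a stand-in workload where task i sleeps i * 200 ms and the whole group shares a single 700 ms timeout (both numbers are illustrative). Tasks that have not replied by the deadline come back as nil and are shut down explicitly:

```elixir
timeout = 700

# Task i sleeps i * 200 ms and returns i (stand-in workload).
tasks =
  Enum.map(1..6, fn i ->
    Task.async(fn ->
      Process.sleep(i * 200)
      i
    end)
  end)

results =
  tasks
  |> Task.yield_many(timeout)
  |> Enum.map(fn {task, result} ->
    # result is {:ok, value}, {:exit, reason}, or nil if the task
    # did not finish within the shared timeout; kill the stragglers.
    result || Task.shutdown(task, :brutal_kill)
  end)

IO.inspect(results)
```

Here the timeout is measured once for the whole list, so the slower tasks do not get extra time just because they sit later in the list.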

minhajuddin · October 31, 2016, 12:59pm

I understand what you are saying. Maybe, as a more experienced Elixir dev, it doesn’t confuse you.
I got confused by that code in my project.

benwilson512 · October 31, 2016, 1:07pm

Look, I’m 100% behind having a blog post which says “hey, keep in mind that these calls will run sequentially, that’s why there’s Task.yield_many”.

I mostly just feel like the sentence “it should show you that running in parallel with timeouts is not just a Task.await away.” implies that there’s something wrong with how the Elixir Task.await function works, and that no easy solution is present. A more accurate summary would be something like:

it should show you that if you want tasks to share a timeout, you want Task.yield_many instead of consecutive Task.await

The other issue is that the blog post doesn’t actually explain WHY the behaviour is happening, and why it isn’t actually unexpected once you understand what each part does.

I don’t really agree that it’s a “common misunderstanding” or a “pitfall” either, but those are definitely more subjective areas.

minhajuddin · October 31, 2016, 1:34pm

Valid points. I’ll update the blog post.

OvermindDL1 · October 31, 2016, 3:25pm

Yeah I agree, await is blocking just like join is in lower-level languages, and you are joining one task at a time in sequence, so it will take an aggregate of the times, so this is entirely expected as Task is emulating a lower-level fork/join.

prodis · May 1, 2019, 8:33am

An old article but still relevant: https://www.theerlangelist.com/article/beyond_taskasync