Why does sleep (Process.sleep or :timer.sleep) clear a process’s messages?

I have this code:

single_api_call = fn ->
  sleep_time = Enum.random(554..2944)
  Process.sleep(sleep_time) # this line affects results
  sleep_time
end

defmodule Worker do
  def start(fun) do
    pid = self()
    send(pid, {self(), fun.()})
  end
end

Enum.to_list(1..10)
|> Enum.map(
    fn _num ->
      spawn Worker, :start, [single_api_call]
    end)

Process.sleep(3000)
Process.info(self(), :messages) |> IO.inspect()

Please look at the line with Process.sleep.
Currently the code produces this:

[#PID<0.103.0>, #PID<0.104.0>, #PID<0.105.0>, #PID<0.106.0>, #PID<0.107.0>,
 #PID<0.108.0>, #PID<0.109.0>, #PID<0.110.0>, #PID<0.111.0>, #PID<0.112.0>]
{:messages, []}

The messages list is empty! Why?

After deleting the line with Process.sleep, it works fine:

[#PID<0.103.0>, #PID<0.104.0>, #PID<0.105.0>, #PID<0.106.0>, #PID<0.107.0>,
 #PID<0.108.0>, #PID<0.109.0>, #PID<0.110.0>, #PID<0.111.0>, #PID<0.112.0>]
{:messages,
 [
   {#PID<0.103.0>, 2049},
   {#PID<0.104.0>, 619},
   {#PID<0.105.0>, 2854},
   {#PID<0.106.0>, 2584},
   {#PID<0.107.0>, 706},
   {#PID<0.109.0>, 2700},
   {#PID<0.110.0>, 2397},
   {#PID<0.111.0>, 2303},
   {#PID<0.112.0>, 1863}
 ]}

It looks like a race condition. The messages aren’t being sent before the main process finishes sleeping. What happens if you change this:

sleep_time = Enum.random(554..2944)

To use lower numbers, like 1..10?

Actually, my bad. There is a bug in your code. The Worker is sending the message to itself instead of to the main process. You have to do this:

defmodule Worker do
  def start(pid, fun) do
    send(pid, {self(), fun.()})
  end
end

Enum.to_list(1..10)
|> Enum.map(
    fn _num ->
      spawn Worker, :start, [self(), single_api_call]
    end)
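
To make the bug concrete, here is a minimal sketch (the variable names are only for illustration) showing that self() is evaluated by whichever process runs the code, so calling it inside the spawned function yields the worker's PID rather than the main process's:

```elixir
# self() here runs in the main process.
parent = self()

# Inside the spawned function, self() is the child's own PID.
child = spawn(fn -> send(parent, {:child_self, self()}) end)

child_self =
  receive do
    {:child_self, pid} -> pid
  after
    1_000 -> :timeout
  end

IO.inspect(parent, label: "parent")
IO.inspect(child_self, label: "self() inside child")
IO.inspect(child_self == child, label: "same as spawned pid")
```

This is why the parent PID has to be captured outside the worker and passed in as an argument.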

The reason is that in Worker.start the function held in single_api_call is called before the message is sent. The call happens while the message tuple {self(), fun.()} is being built, and send requires all its arguments to be evaluated first, so it waits until the function returns, which is after it has slept.
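
A small sketch of that evaluation order, assuming a hypothetical slow_fun standing in for single_api_call. The function announces that it has started before sleeping, and that announcement always lands in the mailbox ahead of the reply:

```elixir
parent = self()

# Hypothetical stand-in for single_api_call: announces that it ran, then sleeps.
slow_fun = fn ->
  send(parent, :fun_started)
  Process.sleep(50)
  :result
end

# The tuple {:reply, slow_fun.()} must be fully built before send runs,
# so slow_fun is called (and sleeps) first.
spawn(fn -> send(parent, {:reply, slow_fun.()}) end)

next_message = fn ->
  receive do
    msg -> msg
  after
    1_000 -> :timeout
  end
end

first = next_message.()
second = next_message.()
IO.inspect({first, second})
# prints {:fun_started, {:reply, :result}}
```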


Yep, I set Process.sleep(3000) in the main process to wait until all the children have finished.

Checked. We now send it to the correct process, but there are still no messages in the main process at the end of the script:

single_api_call = fn ->
  sleep_time = Enum.random(1..10)
  Process.sleep(sleep_time)
  sleep_time
end

defmodule Worker do
  def start(pid, fun) do
    send(pid, {self(), fun.()})
  end
end

Enum.to_list(1..10)
|> Enum.map(
    fn _num ->
      spawn Worker, :start, [self(),single_api_call]
    end)
|> IO.inspect()

Process.sleep(3000)
Process.info(self(), :messages) |> IO.inspect()

returns:

[#PID<0.103.0>, #PID<0.104.0>, #PID<0.105.0>, #PID<0.106.0>, #PID<0.107.0>,
 #PID<0.108.0>, #PID<0.109.0>, #PID<0.110.0>, #PID<0.111.0>, #PID<0.112.0>]
.............end after 3 seconds.................
{:messages, []}

I think it should work…I get stuff like this consistently

{:messages,
 [
   {#PID<0.218.0>, 578},
   {#PID<0.220.0>, 777},
   {#PID<0.212.0>, 1205},
   {#PID<0.219.0>, 1587},
   {#PID<0.217.0>, 1774},
   {#PID<0.215.0>, 2153},
   {#PID<0.221.0>, 2364},
   {#PID<0.214.0>, 2437},
   {#PID<0.213.0>, 2722},
   {#PID<0.216.0>, 2895}
 ]}

I’m not sure how you’re running it, is it possible the module needs to be recompiled?

I’m running directly in terminal via elixir script.exs

…added: I’m running it inside Docker! It works when run directly on the system… O_o So it’s solved, but I still want to know WHY Docker affects it.

docker-compose.yml:

services:
  backend:
    image: elixir:latest
    volumes:
      - ./:/home/app
    tty: true

Pure guessing, but maybe the number of CPUs assigned to the Docker container is too small, so the spawned processes don’t get scheduled until after the main process is finished.
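
One way to check that guess would be to print what the VM itself sees, both inside the container and on the host. This is only a diagnostic sketch; it does not by itself explain the empty mailbox:

```elixir
# How many schedulers is the VM actually using in this environment?
IO.inspect(System.schedulers_online(), label: "schedulers online")
IO.inspect(System.schedulers(), label: "schedulers total")

# May be :unknown on some platforms.
IO.inspect(:erlang.system_info(:logical_processors_available),
  label: "logical processors available"
)
```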


I’ll be the guy to ask:

What is it that you’re truly aiming at?

There are good job queue / worker pool libraries, we can point you at some of them if you don’t want to roll your own.


As the Hexdocs also recommend, sleep should be avoided.
If you want to use the first sleep to “simulate” some work, leave it there for your testing.
However, the sleep in the parent process can be replaced with a receive, as suggested in the aforementioned Hexdocs, e.g.:

parent = self()
Task.start_link(fn ->
  do_something()
  send(parent, :work_is_done)
  ...
end)

receive do
  :work_is_done -> :ok
after
  # Optional timeout
  30_000 -> :timeout
end

In this example from Hexdocs, replace do_something with single_api_call and the Task with your Process.
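
Applied to the code from this thread, that could look roughly like the sketch below. It assumes plain spawn instead of Task, and the 5-second timeout is an arbitrary choice:

```elixir
single_api_call = fn ->
  sleep_time = Enum.random(1..10)
  Process.sleep(sleep_time) # simulated work
  sleep_time
end

parent = self()

# Start ten workers; each sends {its_pid, result} back to the parent.
pids =
  for _ <- 1..10 do
    spawn(fn -> send(parent, {self(), single_api_call.()}) end)
  end

# Wait for exactly one reply per worker instead of sleeping for a fixed time.
# The pinned ^pid makes each receive match only that worker's message.
results =
  Enum.map(pids, fn pid ->
    receive do
      {^pid, value} -> {pid, value}
    after
      5_000 -> {pid, :timeout}
    end
  end)

IO.inspect(results)
```

The parent now continues as soon as the last worker has replied, and a missing reply shows up as :timeout rather than as a silently empty mailbox.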


That is definitely not a problem

Perhaps you’re not rebuilding your image

It’s only for educational reasons; I’m just interested in how and why.

nope, I checked it =)

How did you check it?

Most code that relies on sleeping is not actually async, and it is only accidentally working. :smiley:

There are better ways as @bdarla showed you.
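
Another sleep-free option, sketched here on the assumption that Elixir 1.11 or later is available (for Task.await_many), is to let Task do the bookkeeping:

```elixir
single_api_call = fn ->
  sleep_time = Enum.random(1..10)
  Process.sleep(sleep_time) # simulated work
  sleep_time
end

# Each Task.async starts a linked process that runs the function.
tasks = for _ <- 1..10, do: Task.async(single_api_call)

# Blocks until every task has replied, or exits after 5 seconds.
results = Task.await_many(tasks, 5_000)

IO.inspect(results)
```

Here results holds the return values in the same order as the tasks were started, so there is no need to inspect the mailbox at all.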


Honest question (I don’t know the answer):

If there is only one scheduler how would their program be executed?

Processes are preemptively scheduled: when one of them reaches a point at which it can yield control, the scheduler lets another run. (Points that can yield include sleeping, waiting for IO, or using up a certain number of reductions.)
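
A rough way to try this out is to restrict the VM to a single online scheduler and see whether the workers still deliver their messages while the main process sleeps. The sleep durations below are arbitrary, and the sketch assumes the mailbox is empty when it starts:

```elixir
# Limit the VM to one online scheduler; the call returns the previous value.
previous = :erlang.system_flag(:schedulers_online, 1)

parent = self()

for n <- 1..3 do
  spawn(fn ->
    Process.sleep(10) # the worker yields here
    send(parent, {:done, n})
  end)
end

# The main process yields here; the single scheduler runs the workers meanwhile.
Process.sleep(500)

{:messages, messages} = Process.info(self(), :messages)
IO.inspect(Enum.sort(messages))

# Restore the original scheduler count.
:erlang.system_flag(:schedulers_online, previous)
```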


In the user’s example, is it possible for this to happen:

  1. The main process spawns all the Workers
  2. It hits Process.sleep and yields to the Workers
  3. The Workers hit Process.sleep and yield back to the main process
  4. The main process finishes before any of the Workers send their message