Reviving the thread because a) there have been more recent advancements that can help anyone landing here and b) I still have trouble making tests involving Phoenix.Presence deterministic.
In my particular case, my fetcher doesn’t use the DB but calls an external API, which I’m trying to mock with TestServer (context: TestServer - No fuzz mocking of third-party services - #6 by rhcarvalho), but I believe the underlying trouble is the same.
First, summarizing the knowledge from the thread, in 2017 @luizpvasc and @sorentwo suggested something like:
    ref = leave(socket)
    assert_reply ref, :ok
    :timer.sleep(10)
From late 2019 into 2020, this GitHub issue shed more light on the problem:
opened 02:37PM - 23 Nov 19 UTC
closed 05:27PM - 10 Jun 20 UTC
### Environment
* Elixir version (elixir -v):
```
Erlang/OTP 22 [erts-10.4.… 4] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe] [dtrace]
Elixir 1.9.0 (compiled with Erlang/OTP 22)
```
* Phoenix version (mix deps): `phoenix 1.4.11 (Hex package) (mix)`
* NodeJS version (node -v): `0.16.2`
* NPM version (npm -v): `6.9.0`
* Operating system: macOS Mojave
### Actual behavior
Running `mix test` on the Rumbl application from Programming Phoenix 1.4 ends up with an error:
```
15:18:54.721 [error] Postgrex.Protocol (#PID<0.291.0>) disconnected: ** (DBConnection.ConnectionError) owner #PID<0.578.0> exited
Client #PID<0.581.0> is still using a connection from owner at location:
:prim_inet.recv0/3
(postgrex) lib/postgrex/protocol.ex:2837: Postgrex.Protocol.msg_recv/4
(postgrex) lib/postgrex/protocol.ex:2553: Postgrex.Protocol.recv_transaction/4
(postgrex) lib/postgrex/protocol.ex:1858: Postgrex.Protocol.rebind_execute/4
(ecto_sql) lib/ecto/adapters/sql/sandbox.ex:370: Ecto.Adapters.SQL.Sandbox.Connection.proxy/3
(db_connection) lib/db_connection/holder.ex:293: DBConnection.Holder.holder_apply/4
(db_connection) lib/db_connection.ex:1255: DBConnection.run_execute/5
(db_connection) lib/db_connection.ex:1342: DBConnection.run/6
(db_connection) lib/db_connection.ex:596: DBConnection.execute/4
(ecto_sql) lib/ecto/adapters/postgres/connection.ex:80: Ecto.Adapters.Postgres.Connection.execute/4
(ecto_sql) lib/ecto/adapters/sql.ex:580: Ecto.Adapters.SQL.execute!/4
(ecto_sql) lib/ecto/adapters/sql.ex:562: Ecto.Adapters.SQL.execute/5
(ecto) lib/ecto/repo/queryable.ex:177: Ecto.Repo.Queryable.execute/4
(ecto) lib/ecto/repo/queryable.ex:17: Ecto.Repo.Queryable.all/3
(rumbl_web) lib/rumbl_web/channels/presence.ex:78: RumblWeb.Presence.fetch/2
(phoenix) lib/phoenix/presence.ex:318: anonymous fn/5 in Phoenix.Presence.handle_diff/5
(stdlib) maps.erl:232: :maps.fold_1/3
(phoenix) lib/phoenix/presence.ex:316: anonymous fn/4 in Phoenix.Presence.handle_diff/5
(elixir) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
The connection itself was checked out by #PID<0.581.0> at location:
(ecto_sql) lib/ecto/adapters/postgres/connection.ex:80: Ecto.Adapters.Postgres.Connection.execute/4
(ecto_sql) lib/ecto/adapters/sql.ex:580: Ecto.Adapters.SQL.execute!/4
(ecto_sql) lib/ecto/adapters/sql.ex:562: Ecto.Adapters.SQL.execute/5
(ecto) lib/ecto/repo/queryable.ex:177: Ecto.Repo.Queryable.execute/4
(ecto) lib/ecto/repo/queryable.ex:17: Ecto.Repo.Queryable.all/3
(rumbl_web) lib/rumbl_web/channels/presence.ex:78: RumblWeb.Presence.fetch/2
(phoenix) lib/phoenix/presence.ex:318: anonymous fn/5 in Phoenix.Presence.handle_diff/5
(stdlib) maps.erl:232: :maps.fold_1/3
(phoenix) lib/phoenix/presence.ex:316: anonymous fn/4 in Phoenix.Presence.handle_diff/5
(elixir) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
```
Not sure if this was reported, but I still hit this bug with Phoenix 1.4.11. In my setup it happens every time. So I'm opening this as noted by @josevalim:
> Good catch! Basically the Presence wants to query the DB and send updates but the test has shut down and there are no database connections. To fix this consistently, we would need to query the presence supervisor and ask all of its children to shutdown at the end of each test. But to do so, we will need to add a new API to Phoenix. If you can open up an issue in Phoenix issues tracker, it would be extra helpful. Thank you!
### Reproduction
Download and extract https://pragprog.com/titles/phoenix14/source_code. The relevant app is in `code/testing_otp/rumbl_umbrella`. Strangely enough I can't seem to even run the tests (the error report above is from my follow-along version):
```
** (Mix) Could not start application rumbl_web: RumblWeb.Application.start(:normal, []) returned an error: shutdown: failed to start child: RumblWeb.Presence
** (EXIT) shutdown: failed to start child: Phoenix.Tracker
** (EXIT) shutdown: failed to start child: RumblWeb.Presence_shard0
** (EXIT) an exception was raised:
** (ArgumentError) argument error
(stdlib) :ets.lookup(RumblWeb.PubSub, :node_name)
(phoenix_pubsub) lib/phoenix/pubsub.ex:288: Phoenix.PubSub.call/3
(rumbl_web) lib/rumbl_web/channels/presence.ex:10: RumblWeb.Presence.init/1
(phoenix_pubsub) lib/phoenix/tracker/shard.ex:120: Phoenix.Tracker.Shard.init/1
(stdlib) gen_server.erl:374: :gen_server.init_it/2
(stdlib) gen_server.erl:342: :gen_server.init_it/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
```
### Expected behavior
The tests should pass without error.
Quoting José Valim:
> The issue is that once the test terminates, the channel process will terminate, the presence process will notice the channel termination, and then invoke the callbacks without the database.
>
> All of this happens async, so it is hard to make it sync. I am not sure at the moment how to fix those.
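To make that chain of events concrete, here is a minimal sketch using plain processes only, no Phoenix involved. The `channel` and `watcher` processes are hypothetical stand-ins for the channel and the presence process: the watcher monitors the channel and runs its "callback" only after it receives the `:DOWN` message, at some point after the channel has exited.

```elixir
test_pid = self()

# Stand-in for the channel process: it lives until it is told to stop.
channel =
  spawn(fn ->
    receive do
      :stop -> :ok
    end
  end)

# Stand-in for the presence process: it monitors the channel and only
# runs its "callback" once it has been told the channel went down.
_watcher =
  spawn(fn ->
    ref = Process.monitor(channel)
    send(test_pid, :watching)

    receive do
      {:DOWN, ^ref, :process, _pid, _reason} ->
        send(test_pid, :callback_ran)
    end
  end)

# Make sure the monitor is in place before stopping the channel.
receive do
  :watching -> :ok
end

send(channel, :stop)

# Nothing guarantees the callback has already run at this point: the
# test has to wait for it explicitly, otherwise it may finish first.
callback_ran =
  receive do
    :callback_ran -> true
  after
    1_000 -> false
  end

IO.inspect(callback_ran, label: "callback ran")
```

In a real test the "callback" is the fetcher hitting the database or an external API, and the test process has usually exited by the time it runs.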
The conversation went on, and a new API was added along with some docs:
In 2021, @ruslandoga noticed something I’ve also experienced in practice: the new `Presence.fetchers_pids()` might return an empty list when called too early, so he adds some sleep time before calling it:
    on_exit(fn ->
      :timer.sleep(10) #### WAIT FOR FETCHER PROCESSES TO BE STARTED ####
      for pid <- RumblWeb.Presence.fetchers_pids() do
        ref = Process.monitor(pid)
        assert_receive {:DOWN, ^ref, _, _, _}, 1000
      end
    end)
I used GitHub Search to see what people are doing in the open: Code search results · GitHub
The first hit for me is LiveBeats, which does use a 100 ms sleep:
    defp wait_for_children(children_lookup) when is_function(children_lookup) do
      Process.sleep(100)
      for pid <- children_lookup.() do
        ref = Process.monitor(pid)
        assert_receive {:DOWN, ^ref, _, _, _}, 1000
      end
    end

    setup tags do
      pid = Ecto.Adapters.SQL.Sandbox.start_owner!(LiveBeats.Repo, shared: not tags[:async])
      on_exit(fn -> Ecto.Adapters.SQL.Sandbox.stop_owner(pid) end)
      on_exit(fn ->
        wait_for_children(fn -> LiveBeatsWeb.Presence.fetchers_pids() end)
      end)
The second hit is NervesHub, which just follows the documentation and has no sleep:
    on_exit(fn ->
      for pid <- NervesHubWeb.Presence.fetchers_pids() do
        ref = Process.monitor(pid)
        assert_receive {:DOWN, ^ref, _, _, _}, 1000
      end
    end)
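To illustrate why the version without sleep can see an empty list, here is a small simulation with plain OTP pieces. It assumes the fetchers behave like tasks started under a `Task.Supervisor` in reaction to an asynchronous message; the `shard` process, the `:diff` message, and the `:fetcher_started` notification are made up for this sketch and are not Phoenix APIs.

```elixir
{:ok, sup} = Task.Supervisor.start_link()
test_pid = self()

# Hypothetical stand-in for a presence shard: it starts a fetcher task
# only when it gets around to handling the :diff message.
shard =
  spawn(fn ->
    receive do
      :diff ->
        {:ok, _task} =
          Task.Supervisor.start_child(sup, fn ->
            send(test_pid, :fetcher_started)
            # Simulated slow fetch (DB query or external API call).
            Process.sleep(200)
          end)
    end
  end)

send(shard, :diff)

# Racy: the shard may not have started the task yet, so this can be [].
early = Task.Supervisor.children(sup)

# Synchronize on a message instead of sleeping for an arbitrary time.
receive do
  :fetcher_started -> :ok
after
  1_000 -> raise "fetcher never started"
end

# Now the task is known to exist, so monitoring it is meaningful.
late = Task.Supervisor.children(sup)

all_down =
  Enum.all?(late, fn pid ->
    ref = Process.monitor(pid)

    receive do
      {:DOWN, ^ref, :process, _pid, _reason} -> true
    after
      1_000 -> false
    end
  end)

IO.inspect({length(early), length(late), all_down})
```

The sleep in the snippets above plays the role of the `receive` on `:fetcher_started`: it only makes it likely, not certain, that the task exists by the time the list is read.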
Going down the list, I find cases both with and without sleep, and with different amounts of sleep. I also found this commit message from @gpreston, which reinforces that people don’t know what to do. I don’t know what to do… the sleep time still feels non-deterministic, racy.
committed 11:03PM - 20 Dec 21 UTC
TODO:
* I’m tempted to suggest updating the docs with the sleep before calling `fetchers_pids`.
* Discuss what else we can do. Could tests reasonably synchronize with Presence fetchers? Can we write tests knowing that a certain fetcher will be called exactly N times? Is that a bad idea to begin with?
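As a starting point for that discussion, here is a rough sketch of what "synchronizing on exactly N fetcher calls" could look like. `NotifyingFetcher` and `FetchSync` are hypothetical modules written for this post. The real Presence callback is `fetch/2`, so the pid to notify would have to come from somewhere else (application config, for example); it is passed explicitly here only to keep the example runnable on its own.

```elixir
defmodule NotifyingFetcher do
  # Hypothetical stand-in for a Presence fetch callback. It tells
  # `notify_pid` that it ran and returns the presences unchanged.
  def fetch(topic, presences, notify_pid) do
    send(notify_pid, {:fetch_called, topic})
    presences
  end
end

defmodule FetchSync do
  # Waits for exactly `n` fetch notifications. Returns :ok, or
  # {:error, {:missing, k}} if `k` notifications had not arrived
  # when a single wait timed out.
  def await_fetches(0, _timeout), do: :ok

  def await_fetches(n, timeout) when n > 0 do
    receive do
      {:fetch_called, _topic} -> await_fetches(n - 1, timeout)
    after
      timeout -> {:error, {:missing, n}}
    end
  end
end

test_pid = self()

# Simulate three asynchronous fetcher invocations.
for _ <- 1..3 do
  Task.start(fn -> NotifyingFetcher.fetch("room:lobby", %{}, test_pid) end)
end

result = FetchSync.await_fetches(3, 1_000)
IO.inspect(result, label: "await_fetches")
```

Whether a test can actually know N up front is exactly the open question: this only works if the number of fetcher invocations per test is predictable.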
I would love to learn more about this corner of Elixir. There are so many pieces involved in making Presence work that understanding how everything fits together (and where the points of synchronization are) is no easy feat.