Tests randomly failing when run in a VM

Here is the test case: https://github.com/semlabs/ecto_sandbox
and here you can see the failing test after a couple of tries: https://semaphoreci.com/semlabs/ecto_sandbox/branches/master/builds/25

It’s a very simple app:

  • Create supervised workers which run as GenServers (Module Worker)
  • Make a request with a specific worker name to a POST route sending some data (Module CollectorController)
  • The worker processes and stores the data in the database (Module Server)
  • Make sure the current test waits until the data was stored (Module Notification)
  • Enable ownership logging for db_connection
  • The test uses ConnCase which sets the ownership mode to shared

As you can see the test randomly fails with:

** (DBConnection.OwnershipError) cannot find ownership process for #PID<0.438.0>.

When using ownership, you must manage connections in one
of the four ways:

* By explicitly checking out a connection
* By explicitly allowing a spawned process
* By running the pool in shared mode
* By using :caller option with allowed process

The first two options require every new process to explicitly
check a connection out or be allowed by calling checkout or
allow respectively.

The third option requires a {:shared, pid} mode to be set.
If using shared mode in tests, make sure your tests are not
async.

The fourth option requires [caller: pid] to be used when
checking out a connection from the pool. The caller process
should already be allowed on a connection.

If you are reading this error, it means you have not done one
of the steps above or that the owner process has crashed.

See Ecto.Adapters.SQL.Sandbox docs for more information.
    (db_connection) lib/db_connection.ex:934: DBConnection.checkout/2
    (db_connection) lib/db_connection.ex:750: DBConnection.run/3
    (db_connection) lib/db_connection.ex:592: DBConnection.prepare_execute/4
    (ecto) lib/ecto/adapters/postgres/connection.ex:86: Ecto.Adapters.Postgres.Connection.execute/4
    (ecto) lib/ecto/adapters/sql.ex:256: Ecto.Adapters.SQL.sql_call/6
    (ecto) lib/ecto/adapters/sql.ex:542: Ecto.Adapters.SQL.struct/8
    (ecto) lib/ecto/repo/schema.ex:547: Ecto.Repo.Schema.apply/4
    (ecto) lib/ecto/repo/schema.ex:213: anonymous fn/14 in Ecto.Repo.Schema.do_insert/4
    (ecto_sandbox) lib/ecto_sandbox/workers/server.ex:20: EctoSandbox.Workers.Worker.Server.create/1
    (ecto_sandbox) lib/ecto_sandbox/workers/server.ex:11: EctoSandbox.Workers.Worker.Server.store/2
    (ecto_sandbox) lib/ecto_sandbox/workers/server.ex:31: EctoSandbox.Workers.Worker.Server.handle_call/3
    (stdlib) gen_server.erl:636: :gen_server.try_handle_call/4
    (stdlib) gen_server.erl:665: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message (from #PID<0.436.0>): {:process, %{"data" => "test"}}

Notice that the test runs with #PID<0.436.0> as seen in the last line.
And a little bit before we can see the ownership changes. So the test process loses its proxy before the worker gets it:

.13:01:06.667 [error] #PID<0.436.0> owns proxy #PID<0.437.0>
13:01:06.671 [error] #PID<0.428.0>, #PID<0.430.0> lose proxy #PID<0.429.0>
13:01:06.731 [error] #PID<0.436.0> lose proxy #PID<0.437.0>

And that’s even before the test ends.

2 Likes