Possible deadlock in tests

I have been banging my head against the wall with this for a while. I have a factory to create database entries for my tests in a Phoenix app, it is pretty much lifted from What’s New In Ecto 2.1:

defmodule Deadlock.Factory do
  def build(:post) do
    %Deadlock.Blog.Post{title: "Hello"}
  end

  def build(name, attributes \\ []) do
    name |> build() |> struct(attributes)
  end

  def insert!(name, attributes \\ []) do
    Deadlock.Repo.insert!(build(name, attributes))
  end
end

Then I have the following two tests:

defmodule DeadlockWeb.PageControllerTest do
  use DeadlockWeb.ConnCase
  import Deadlock.Factory

  test "This works..." do
    assert %Deadlock.Blog.Post{} = insert!(:post)
  end

  test "This seems to deadlock..." do
    assert %Deadlock.Blog.Post{} = insert!(:nothing_matching_in_factory)
  end
end

The first one passes as expected. The second fails with the following error:

21:30:02.946 [error] Postgrex.Protocol (#PID<0.296.0>) disconnected: ** (DBConnection.ConnectionError) owner #PID<0.328.0> timed out because it owned the connection for longer than 15000ms


  1) test This seems to deadlock... (DeadlockWeb.PageControllerTest)
     test/deadlock_web/controllers/page_controller_test.exs:9
     ** (ExUnit.TimeoutError) test timed out after 60000ms. You can change the timeout:

       1. per test by setting "@tag timeout: x"
       2. per case by setting "@moduletag timeout: x"
       3. globally via "ExUnit.start(timeout: x)" configuration
       4. or set it to infinity per run by calling "mix test --trace"
          (useful when using IEx.pry)

     Timeouts are given as integers in milliseconds.

I discovered this by accident since I mistyped something in my test, and it took me a while to work out what was going on. I would expect the second test to fail quickly since it can’t find a matching build function in the factory module, but it’s obviously trying to do something with the database.

A minimal repo to reproduce is here: https://github.com/amarraja/deadlock

I’m sure it’s something simple, but I just can’t see it!

Thanks for reading

t’s obviously trying to do something with the database.

This isn’t actually the case. The timeout is just for the test case itself, there’s nothing database related in the error.

You have an infinite loop. insert!(:nothing_matching_in_factory) calls:

  def insert!(name, attributes \\ []) do
    Deadlock.Repo.insert!(build(name, attributes))
  end

which calls build(name, attributes) with build(: nothing_matching_in_factory, [])

This is where we enter the infinite loop. Since the first clause doesn’t match it goes here:

  def build(name, attributes \\ []) do
    name |> build() |> struct(attributes)
  end

If we break that apart a little the first function call is build(:nothing_matching_in_factory). This does match a function clause! It matches the function we’re in right now: def build(name, attributes \\ []) do. Voila, infinite loop. It just runs until the test case times out.

7 Likes

Aah, so obvious!! I’ve been looking at this too long. Thanks for the great explanation :smile:

2 Likes