TDD in Elixir with tests that hit the database

gophertroll · October 22, 2019, 2:57am

I’m new to Elixir, and I’m wondering if I might be doing something wrong in my tests. Right now, I have only 105 tests, but they take ~30 seconds to run. For TDD, this is much too slow. The tests that seem to be the slowest are those where I need one or more database records to exist in order to test that I can then add other related records. For example, I might need a user so that I can create a bank account for that user, and I need both the user and the bank account in order to add or query for transactions. In one such test set, 9 unit tests are taking 2+ seconds, and in another, 15 tests are taking ~3.5 seconds.

I’ve seen other posts that suggest that unit tests in Elixir should be very fast, even when they interact with the database. I’ve also seen some posts about using the async flag, but, based on those posts, I’m confused as to whether that is good or bad when the tests use the database. In Java, I would likely isolate the data access layer so that it could be mocked for testing or replaced with an in-memory database to improve speed, but it seems those techniques are not generally used with Elixir.

Are there some general tips, tricks, or common pitfalls that might help me speed up these tests?

ityonemo · October 22, 2019, 5:33am

You can use async exunit with tests against a database:
https://hexdocs.pm/ecto/testing-with-ecto.html#content

Basically, the way to think about it is that the database sandboxes itself, so each test lives in its own parallel universe that is created when you perform a checkout operation. As long as your queries run in the same BEAM process as the test, the sandbox knows which instance belongs to that universe. I also think that if you shoot off a Task, that information is passed along to the Task. If your database query goes through a GenServer, you might have a hard time (though I have done this, and deeper parallel universes are doable, I would say it is not really for beginners).
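A minimal sketch of the sandbox setup described above, following the Ecto testing guide linked earlier. The app name :my_app, the repo MyApp.Repo, and the test module are hypothetical, the three parts live in separate files of a Mix project, and the query assumes a Postgres-backed repo:

```elixir
# config/test.exs (fragment): use the sandbox pool in the test environment
config :my_app, MyApp.Repo, pool: Ecto.Adapters.SQL.Sandbox

# test/test_helper.exs
ExUnit.start()
# :manual mode means every test must check out its own connection
Ecto.Adapters.SQL.Sandbox.mode(MyApp.Repo, :manual)

# test/my_app/sandbox_test.exs (hypothetical test module)
defmodule MyApp.SandboxTest do
  use ExUnit.Case, async: true

  setup do
    # The connection is owned by this test process and wrapped in a
    # transaction that is rolled back when the test finishes.
    :ok = Ecto.Adapters.SQL.Sandbox.checkout(MyApp.Repo)
  end

  test "queries run inside this test's own sandboxed connection" do
    assert %{rows: [[1]]} = Ecto.Adapters.SQL.query!(MyApp.Repo, "SELECT 1")
  end
end
```

Because each test owns its connection, modules set up this way can run with async: true without seeing each other's rows.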

You can also mock your database module (or any module, really) using Mox: https://hexdocs.pm/mox/Mox.html. There is a similar concept of tying the test’s process to the mock system’s internal notion of parallel universes, which tracks which test is using which mock.

You can even do full-on acceptance tests. I currently run acceptance tests (10s total) alongside my unit tests: they check out a headless Chromium through Hound and pass the parallel-universe information along, and a plug then associates the webserver’s BEAM process with the same universe as the test that fired off the request. I would say that this is kind of mid/advanced-level sophistication and not something I would necessarily pursue as a beginner, but I’m happy to answer questions if you are interested in doing this.

A few general points:

1) Turn on async. Even if your tests are stateful, refactoring them to be able to exist in ‘parallel universes’ forces you to think about your architecture and really know what pieces of state bottleneck your system. In general you want to minimize these, and pushing on the async will expose the troublesome parts. Crazy sh*t will happen. You will HAVE to think about race conditions and you’ll step through things and red text will appear all over the place. You can tear your hair out or enjoy the ride, it’s a matter of perspective.

2) Keep in mind that tests within a module do not run asynchronously (although they are scrambled), so if you have a module with a ton of long-running tests you might be served by breaking them up.

3) Keep in mind that a test is just a process! You can do things like send it messages and trap them with receive blocks if you are, say, spawning a sidecar process to do a quick thing.
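The last point can be sketched as a standalone script (run with `elixir sidecar_test.exs`). The file name, the module name, and the "sidecar" work are hypothetical; assert_receive is ExUnit's built-in wrapper around a receive block:

```elixir
ExUnit.start()

defmodule SidecarTest do
  use ExUnit.Case, async: true

  test "the test process receives a message from a sidecar process" do
    # The test itself is a process, so it has a pid others can message.
    test_pid = self()

    # Hypothetical sidecar: does a quick bit of work, then reports back.
    spawn(fn -> send(test_pid, {:done, 1 + 1}) end)

    # Waits for a matching message in this process's mailbox
    # (100 ms by default) and fails the test if none arrives.
    assert_receive {:done, 2}
  end
end
```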

I guess my unbiased opinion is that testing on the BEAM, especially with Elixir, is a whole new mind-blowingly awesome experience that you just won’t get anywhere else, so take it in!

nthock · October 22, 2019, 8:40am

I had this experience before and found that the culprit was bcrypt. Bcrypt is slow. So you may want to check your test setup, especially user creation, to see whether you hash the password with Bcrypt every time you create a user in the test environment.

LostKobrakai · October 22, 2019, 9:48am

Just to add more context: Bcrypt is slow by design. Password hashing needs to be slow to provide the security it gives. Usually in tests one would dial down all the knobs that allow it to be faster, or even replace hashing with a dummy.

v0idpwn · October 22, 2019, 1:49pm

How would you replace it with a dummy?

idi527 · October 22, 2019, 3:31pm

One way would be to wrap it in a behaviour module

defmodule MyApp.Accounts.PasswordHash do
  @callback hash(String.t()) :: String.t()
  @callback verify(String.t(), String.t()) :: boolean()

  # The implementation module is read from config when this module is compiled
  @impl_mod Application.get_env(MyApp.Accounts.PasswordHash, :impl)

  # Forward both calls to whichever implementation is configured
  def hash(value), do: @impl_mod.hash(value)
  def verify(value, hash), do: @impl_mod.verify(value, hash)
end

and have different implementations for test and dev/prod, where in the former one it would return a string

# this would probably be defined in test/support/
defmodule MyApp.Accounts.PasswordHashDummy do
  @behaviour MyApp.Accounts.PasswordHash
  def hash(value), do: "hashed_" <> value
  def verify(value, hash) do
    "hashed_" <> value == hash
  end
end

and in the latter – delegate to bcrypt

# defined in lib/
defmodule MyApp.Accounts.PasswordHashDefault do
  @behaviour MyApp.Accounts.PasswordHash
  def hash(value), do: Bcrypt.hash_pwd_salt(value)
  def verify(value, hash), do: Bcrypt.verify_pass(value, hash)
end

and the appropriate implementation module would be set in env’s config

# config/config.exs
config MyApp.Accounts.PasswordHash, impl: MyApp.Accounts.PasswordHashDefault

# config/test.exs
config MyApp.Accounts.PasswordHash, impl: MyApp.Accounts.PasswordHashDummy

and the rest of the app would use MyApp.Accounts.PasswordHash.hash/1 and MyApp.Accounts.PasswordHash.verify/2.

gophertroll · October 23, 2019, 1:59am

Many thanks to all who replied with suggestions. The slowdown was caused by Bcrypt, and with this help, the 100+ tests are now running in < 1 second!

My implementation looks like this:

# The behaviour definition
defmodule Checkbook.UserAccounts.PasswordHash do
    @callback hash(String.t) :: String.t
    @callback check_pass(%Checkbook.UserAccounts.User{}, String.t) :: {:ok, %Checkbook.UserAccounts.User{}} | {:error, String.t}
end

(I’m not entirely sure that the syntax for my user type is correct… However, no compiler errors.)

# the mock password hash implementation (in test/support)
defmodule Checkbook.UserAccounts.MockPasswordHash do
    alias Checkbook.UserAccounts.{PasswordHash, MockPasswordHash}
    
    @behaviour PasswordHash

    @impl PasswordHash
    def hash(password), do: "hashed_" <> password

    @impl PasswordHash
    def check_pass(user, password) do
        case user do
            nil -> {:error, "User was nil"}
            user -> case MockPasswordHash.hash(password) == user.encrypted_password do
                        true -> {:ok, user}
                        _ -> {:error, "Invalid password"}
                    end
        end
    end
end

# the real implementation
defmodule Checkbook.UserAccounts.BcryptPasswordHash do
    @behaviour Checkbook.UserAccounts.PasswordHash

    @impl Checkbook.UserAccounts.PasswordHash
    def hash(password) do
        Bcrypt.hash_pwd_salt(password)
    end

    @impl Checkbook.UserAccounts.PasswordHash
    def check_pass(user, password) do
        Bcrypt.check_pass(user, password)
    end
end

# in config.exs
config :checkbook,
  ecto_repos: [Checkbook.Repo],
  generators: [binary_id: true],
  password_hash: Checkbook.UserAccounts.BcryptPasswordHash

# in test.exs
config :checkbook,
  password_hash: Checkbook.UserAccounts.MockPasswordHash

I checked initially by letting the tests run with BcryptPasswordHash (removing the override config from test.exs), and the tests passed in ~12 seconds. This was just to be sure I had the behaviour implemented properly. Then I swapped in the mock for the tests, and the total time dropped to 0.9 seconds. (If I include ~17 tests that go through the UI using Hound and Chrome Driver, the run time using Bcrypt is 30 seconds, and with the mock password hash it is about 13. I can live with that since I can isolate those tests and not run them as part of each Red-Green-Refactor TDD cycle.)

ityonemo · October 23, 2019, 3:36am

Looks great. Are you using compile-time @values when you call the hashing module?

Also you can be lazy and declare @impl true

And those nested cases could probably be refactored into something more idiomatic using function guards on the outer case and an if statement (gasp!) On the inner case.

sasajuric · October 23, 2019, 8:12am

Did you try reducing log_rounds in the test env, as advised in step 3 of the installation guide? This is what I typically do in all of my projects, and the tests run smoothly then. I never had to do this complex mocking of bcrypt.
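For reference, that setting is a one-line config entry. A minimal sketch, assuming the bcrypt_elixir package; 4 is the lowest cost bcrypt accepts, so hashes produced this way are only suitable for tests:

```elixir
# config/test.exs (fragment)
# Lower the bcrypt work factor so hashing is fast in the test environment only.
config :bcrypt_elixir, log_rounds: 4
```

Dev and prod keep the library default, so no application code has to change.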

gophertroll · October 24, 2019, 1:05am

I’m not sure exactly what you mean by “compile-time @values”, but I think you are asking how I am indicating which implementation of the password hash I want to use.

In the module that defines the %User{}, I’ve used

@password_hash Application.get_env(:checkbook, :password_hash)

to retrieve the implementation from the appropriate config. Then I use @password_hash.hash/1 and @password_hash.check_pass/2.

Thanks, too, for the tip about cleaning up the check_pass/2 function in MockPasswordHash. I refactored to this:

@impl PasswordHash
def check_pass(user, password) do
    cond do
        is_nil(user) -> {:error, "User was nil"}
        MockPasswordHash.hash(password) != user.encrypted_password -> {:error, "Invalid password"}
        true -> {:ok, user}
    end
end

I don’t think that is quite what you suggested, but it does remove the nested case.
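One reading of the earlier suggestion (a separate function clause for the nil user, and an if for the comparison) might look like the sketch below. The module name MockPasswordHashSketch is hypothetical, and the @behaviour and @impl lines are left out so that the snippet compiles on its own:

```elixir
defmodule MockPasswordHashSketch do
  # Same fake "hash" as the mock above: just prefix the password.
  def hash(password), do: "hashed_" <> password

  # Pattern matching on nil in the function head replaces the outer case.
  def check_pass(nil, _password), do: {:error, "User was nil"}

  # The inner case on true/false becomes a plain if/else.
  def check_pass(user, password) do
    if hash(password) == user.encrypted_password do
      {:ok, user}
    else
      {:error, "Invalid password"}
    end
  end
end
```

Any map or struct with an encrypted_password field works as the user here, for example check_pass(%{encrypted_password: "hashed_secret"}, "secret").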

gophertroll · October 24, 2019, 1:11am

No, I didn’t see that. However, that solution seems to produce test run times very similar to those I achieved with the behaviour approach, and, as you pointed out, it has the advantage of not requiring any additional code.

florish · October 27, 2019, 10:03am

For anyone stumbling upon this topic: for a project, I’ve been using argon2_elixir as a bcrypt alternative. For that library, there are also test-specific settings that really speed up the test suite. Documentation here:

https://hexdocs.pm/argon2_elixir/Argon2.Stats.html#module-test-values
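As far as I recall, the test values described on that page amount to a config entry along these lines. Treat the exact numbers as an assumption and check them against the linked documentation:

```elixir
# config/test.exs (fragment)
# Minimal time and memory cost so Argon2 hashing is fast in tests only.
config :argon2_elixir, t_cost: 1, m_cost: 8
```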

shakram02 · March 3, 2020, 1:15pm

@LostKobrakai thank you so much for the hint about bcrypt!! I got 14x better test time just because of that.

If you’re using a mock user, for example, put the password hash in a module attribute.
In my case, I use ExMachina, and every time a user was generated, this line was called (in the factory):

Bcrypt.hash_pwd_salt("password123$")

What I did was store this value in a module attribute (since it’s constant anyway), so it is computed once at compile time:

defmodule MyApp.Factory do
# ...
@password_hash Bcrypt.hash_pwd_salt("password123$")
# ...

and then just use the @password_hash whenever the factory makes a user.