Rationale for not using same data

mattfara50 · June 6, 2022, 7:43pm

I’m reading “Testing Elixir: Effective and Robust Testing for Elixir and its Ecosystem”.

In chapter 1 on unit tests, it explains why you would want to use module attributes in a test file:

Our tests will have to have that same level of knowledge so that they can test the code thoroughly. This means adding a copy of all the IDs to our test file. It may seem like a good idea to put them somewhere where both the test and the code under test can access them, but that’s discouraged. Any accidental modification to that list could cause our test to miss a needed case, allowing our code under test to make it to production with a bug. You should avoid using the Don’t Repeat Yourself principle when the instances are split between your tests and your code under test.

I don’t understand this. What if an ID were added to the list in the code under test, but accidentally not added to the list in the test? I’d think that using one list that both files pull from would actually do the exact opposite of what the book explains. Can you explain the reasoning some more?

benwilson512 · June 6, 2022, 8:10pm

Honestly without more context from the book, it’s hard to say. For example, the central part of this bit here

I don’t know what this means.

Qqwy · June 7, 2022, 9:29am

It is very difficult to give an answer without more context.

What I guess (but again, this is a guess!) the book is trying to convey, is that if your test retrieves its information about ‘what is correct’ from the same place as the main code, this can lead to brittle tests.

As an example:

defmodule Admin do
  def admin_user_ids do
    [1, 4, 10]
  end

  def admin?(user) do
    user.id in Admin.admin_user_ids()
  end
end

defmodule AdminController do
  def login(conn, params) do
    # ...
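    with {:ok, user} <- try_auth_user(params),
         true <- Admin.admin?(user) do
      setup_session(conn, user)
      # ...
    else
      {:error, problem} ->
        return_auth_error(problem)

      # Authenticated but not an admin. Without this clause the `with`
      # would raise a WithClauseError on `false`.
      # (`:not_admin` is just a placeholder reason.)
      false ->
        return_auth_error(:not_admin)
    end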
    # ...
  end
end

Now if we were to write a test for this controller, we’d want to test:

Users who are admins are able to sign in.
Users who are not admins are rejected with an auth error.

You could write these tests as the following (a little bit pseudocode-y, but I hope the idea is clear):

# `conn` is assumed to come from the test context (e.g. a setup block).
test "admins are able to sign in", %{conn: conn} do
  for admin_id <- Admin.admin_user_ids() do # <- This is the important bit!
    admin = User.find(admin_id)

    result =
      AdminController.login(conn, %{"username" => admin.username, "password" => admin.password})

    assert_successful_session(result)
  end
end

This will test whether the AdminController works correctly.
However, an accidental change to the Admin module (say, an ID added to or removed from admin_user_ids) is not caught by this test, because the test simply loops over whatever list the module returns at that moment.
So for something as important as e.g. “who is allowed admin rights”, you’ll want to make sure that you have a separate test which has separate knowledge about what is and is not allowed.
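To make that blind spot concrete, here is a small sketch that can be run as a standalone script. It assumes that an extra ID (7, a made-up value) slipped into the list by accident, and it tests Admin.admin?/1 directly instead of going through the controller. The test takes its IDs from the module itself, so it still passes:

```elixir
ExUnit.start()

defmodule Admin do
  # Hypothetical accident: 7 was never supposed to be in this list.
  def admin_user_ids, do: [1, 4, 10, 7]

  def admin?(user), do: user.id in admin_user_ids()
end

defmodule SharedListTest do
  use ExUnit.Case

  test "every listed admin is treated as an admin" do
    # The expected IDs come from the code under test, so whatever the
    # list contains is "correct" by definition, including the stray 7.
    for id <- Admin.admin_user_ids() do
      assert Admin.admin?(%{id: id})
    end
  end
end
```

The suite stays green even though user 7 now has admin rights.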

It might be tempting to deduplicate this kind of code, but in this case, the duplication is there for a reason:
With the IDs written out in both places, the only way to introduce a problem is to deliberately make the same mistake in both the Admin module and the test(s).
The chance that this happens accidentally is very low.
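As a rough sketch of what such a test with its own knowledge could look like (runnable as a standalone script): the @admin_ids attribute is a deliberate copy of the list, and the non-admin IDs are made-up sample values, not something from the book.

```elixir
ExUnit.start()

defmodule Admin do
  def admin_user_ids, do: [1, 4, 10]

  def admin?(user), do: user.id in admin_user_ids()
end

defmodule AdminTest do
  use ExUnit.Case

  # Deliberate duplicate of the list in Admin: the test's own idea of
  # who is allowed to be an admin.
  @admin_ids [1, 4, 10]
  # A few arbitrary IDs that must never be admins (sample values).
  @non_admin_ids [2, 3, 11]

  test "the admin list is exactly the expected one" do
    assert Enum.sort(Admin.admin_user_ids()) == Enum.sort(@admin_ids)
  end

  test "known admins are admins" do
    for id <- @admin_ids, do: assert(Admin.admin?(%{id: id}))
  end

  test "other users are not admins" do
    for id <- @non_admin_ids, do: refute(Admin.admin?(%{id: id}))
  end
end
```

If an ID is added to or removed from Admin without touching the test, the first test fails. That also covers the case from the original question: the failure forces someone to look at the change and update the test on purpose.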

Obviously there is a trade-off here: If you copy too much information into your tests, it will become incredibly hard to make any (including any correct!) changes because you’ll have to also change the tests all the time.
But if you have too little, then changes (including any incorrect changes!) might not be caught by the test suite at all.

Which parts of your codebase you want to make ‘rigid’, and which parts you want to test only from the ‘outside’ rather than directly, is a very hard, situation-dependent choice.
“It depends”