Automating Tests in Elixir Projects

This post explores different ways to do test automation in Elixir, focusing on how to handle dependency injection — covering patterns, libraries, challenges, and trade-offs along the way. It walks through a series of real attempts using tools like Mock, Mox, ProcessTree, Hammox, and Double, all in the pursuit of fast, simple, and frictionless testing. Part of an ongoing series to explore the “perfect” Elixir setup.

What happened to using behaviours and implementing a test module?

I also find that all these libraries make it way more complex than it should be, and they skip an important lesson about architecture and simplicity.

I think the Elixir documentation deserves a simple, dependency-free architecture guide rather than a million libraries in search of a problem.

Another point: if you want to mock an HTTP response, why not set up a complete web server for it, so you have as much control as possible? TestServer and Bypass come to mind. TestServer really shines with its support for WebSockets and its flexible request patterns, which let you model highly complex cases with ease.

using behaviours and implement a test module

Is that what I called Manual with Behaviours? Where you define a behaviour, and provide a real and a test implementation?

I like that too - it’s explicit and easy to reason about, with reasonable tradeoffs. Although it still raises the question of how you swap in the test module? I think the application environment is an oft-used mechanism, but I don’t find it especially simple or elegant… Curious if you’ve got another approach?
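
For anyone following along, here is a minimal sketch of the behaviour-plus-test-module setup, with the application environment doing the swap. Every name in it (Weather, Weather.HTTP, Weather.Fake, Forecast, and the :my_app / :weather keys) is made up for illustration and does not come from the blog post.

```elixir
defmodule Weather do
  # The contract that both implementations must satisfy.
  @callback temperature(city :: String.t()) :: {:ok, float()} | {:error, term()}
end

defmodule Weather.HTTP do
  @behaviour Weather

  # Stand-in for the real implementation; a real one would call a remote API.
  @impl true
  def temperature(_city), do: {:error, :not_implemented_in_this_sketch}
end

defmodule Weather.Fake do
  @behaviour Weather

  # Hand-written test implementation with a canned answer.
  @impl true
  def temperature(_city), do: {:ok, 21.5}
end

defmodule Forecast do
  # Looks up the implementation at call time and falls back to the real one.
  defp weather, do: Application.get_env(:my_app, :weather, Weather.HTTP)

  def describe(city) do
    case weather().temperature(city) do
      {:ok, temp} when temp >= 20.0 -> "warm"
      {:ok, _temp} -> "cold"
      {:error, _reason} -> "unknown"
    end
  end
end

# Typically done in config/test.exs or in a test setup block.
Application.put_env(:my_app, :weather, Weather.Fake)
Forecast.describe("Oslo")
```

Since the application environment is global, every test running at the same moment sees the same implementation, which is the tradeoff being weighed here.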

I share the instinct to keep things simple and avoid unnecessary dependencies. It’s a solid baseline. At the same time, I find that carefully chosen, narrow libraries are a big time-saver - they do one thing well and are easy to read and reason about. It’s always a tradeoff though.

On HTTP mocks vs real servers: I tend to see external test harnesses as more fitting for E2E tests. I haven’t covered that space yet in the blog, but definitely plan to. The focus so far is on unit tests, since they form the foundation of a testing strategy. That’s not to say they’re the only part. A test server like you describe sounds like a great fit when I get to that layer.

Testing is one of those areas with strong opinions and multiple valid approaches. My blog is more of a “random walk” than a “thorough search” through testing techniques - just things I’ve seen and tried in the wild. Always happy to hear how others are doing it so I can keep learning, if you feel like sharing?

It’s simple to reason about it, as it is global state. As for elegant, IMO simple things almost always lead to elegant solutions, and I think injection by config is elegant enough for simple cases.

I used to mock HTTP requests too. However, after using Bypass once, it was crystal clear that it is a much better approach. Not only can you check HTTP requests in detail with exactly zero modifications to your source code, you can also go into advanced topics such as HTTPS. Doing something like that with mocks requires quite a substantial time investment to get right, not to mention that mocking tends to be on the dangerous side if you are not doing it right.

I think libraries of this kind (the Ecto sandbox is another great example) are amazing, and this is the right way to approach testing. Obviously this cannot be applied universally, as such implementations are involved, but they give the best testing experience and results.

One problem with using the application environment for injection is that tests don’t run concurrently, right? I shouldn’t argue it’s not simple though - you’re right, it’s a global key-value store. But I find it surprising when runtime code reads from it not for user-configurable settings, but purely as a mechanism for injection. It feels like an awkward fit to me (as elucidated in the blog).

Personally, I find it easier to reason about when a process gets its dependencies from its own process hierarchy — and as a bonus, it enables async tests. But: Just opinions. It’s a wide tent, and many ways to success. What matters most must be that the team is effective and having fun within whatever is the chosen setup.
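
To make the process-hierarchy idea concrete, here is a rough, dependency-free sketch. It is not the ProcessTree API; Deps, FakeMailer and RealMailer are hypothetical names, and the lookup only follows the :"$callers" list that Task records, so it is far less thorough than a real library. It also cannot store nil or false as values.

```elixir
defmodule Deps do
  # Hypothetical helper: stores a dependency in the current process dictionary.
  def put(key, value), do: Process.put({__MODULE__, key}, value)

  # Looks in the current process first, then in the processes recorded under
  # :"$callers" (which Task sets for the processes it starts).
  def get(key, default) do
    [self() | Process.get(:"$callers", [])]
    |> Enum.find_value(default, &lookup(&1, key))
  end

  # Reads another process's dictionary; returns nil if the process is gone
  # or the key is not set there.
  defp lookup(pid, key) do
    with {:dictionary, dict} <- Process.info(pid, :dictionary),
         {_key, value} <- List.keyfind(dict, {__MODULE__, key}, 0) do
      value
    else
      _ -> nil
    end
  end
end

# The test process registers its double...
Deps.put(:mailer, FakeMailer)

# ...and a task started from it resolves the same double via its caller.
Task.async(fn -> Deps.get(:mailer, RealMailer) end) |> Task.await()
```

Each test process carries its own doubles, so two tests can register different ones without stepping on each other.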

I’m smiling at the HTTP discussion - it’s revealing more nuance than I’d considered: I also don’t want to mock HTTP, but I’ve typically placed those kinds of tests higher up, e.g. E2E. But Bypass sounds great - I’ve added it to my to-investigate list so I can make sure I understand it properly.

Ultimately I’m open to test doubling internal modules when they grow too large or too entangled to fit in our heads, because I’ve seen devs really appreciate the clarity that comes from such decoupling. So I’ve got a net-positive view on this kind of approach - as long as we use the simplest (not always easiest) ways to do it. And I appreciate the engagement here, it’s a nuanced topic.

If you plan on using separate modules, yeah, but generally you have a 2 module setup: the one that calls the external api and the one that does mocks.

Give it a try for sure. This is not a new concept, and it’s much easier to reason about than the complex setup that mocks require.

Mocking internal logic is a hard no for me; it is one of the big reasons I try to avoid mocks whenever I can. The point of tests is to ensure the correctness of your system at different levels. If you are mocking actual business logic, you are introducing tests that can produce false positives, which may cause more problems than they solve, because code is a dynamic medium that is always changing.

but generally you have a 2 module setup

I appreciate your point here, and to me what you describe is complex. It’s cognitive load. A delayed gotcha of sorts. I don’t say that to convince you of course, but to illustrate the spectrum of the testing topic.

On internal mocks, it never fails to elicit a strong response - and no disrespect at all, I actually find it fascinating how wide our “testing tent” is. What I describe and what you’re advocating (perhaps loosely: mockist vs. classicist) both have passionate, experienced camps behind them.

I’ve seen systems where a few well-placed mocks helped a team move forward, just as I’m sure you’ve seen systems suffer from mock abuse. I like to think there are reasonable compromises in the middle - but that only works if we can talk openly about how to mock well, so we can better decide when it makes sense to do it.

I sometimes wonder if each camp is just wired differently - like trying to work left-handed when you’re right-handed - and I suspect the best solution is typically Conway/architectural: put us on two different teams :joy:

Anyway, I know even that framing won’t sit comfortably with everyone, and I respect that. It’s part of what makes testing such a rewarding and sometimes difficult topic to explore.

I am not entirely sure what you mean by camp, I’ve never read even a single book on testing and I certainly don’t have any time to do that now :smile:. I am entirely practice based on this topic.

I’ve learned to write tests mostly through mistakes. As a beginner I started with projects that I didn’t write tests for, then moved on to writing brittle tests, abusing mocks, etc. These days I always adapt tests to the current codebase and team I work with, but things such as mocking business logic need very thorough motivation before they are used. As I’ve said previously, unexpected modifications to code can lead to false positives and a distorted reality between the testing environment and the code that runs in production.

Mocking internal code is a huge red flag: it tells everyone that there is a fundamental architectural problem, or a lack of experience.

In the end, programming is all about data transformations. To put it simply: input → output

So when we test, then we want to assert that we got the expected output from the input.

With that in mind we can design our system in such a way that we only care about changing the input, that input can come from a lot of places.

I saw you mention unit testing and E2E. People tend to conflate all these terms. 99% of the time programmers write integration tests, and E2E tests are a subset of integration tests. All your Ecto tests are integration tests. If a function has side effects, then the only way to test it is with an integration test.

A unit test requires a function without side effects.
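
As a tiny illustration of that last sentence (Cart and its numbers are invented for this sketch), a side-effect-free function can be checked by varying the input and comparing the output. The pattern matches below stand in for ExUnit assertions.

```elixir
defmodule Cart do
  # Pure: the result depends only on the arguments.
  # Prices are integers (e.g. cents); the discount is a whole percentage.
  def total(prices, discount_percent) do
    subtotal = Enum.sum(prices)
    subtotal - div(subtotal * discount_percent, 100)
  end
end

# A failed match raises, so each line acts as an assertion.
900 = Cart.total([400, 600], 10)
0 = Cart.total([], 10)
```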

Regardless of this pedantry, let’s just agree that tests should be simple to reason about, and if you feel that mocking is the right answer, then you probably have a bigger problem at hand.

I think quite the contrary: test doubles are invaluable in testing, including internally. They help isolate systems and keep tests focused. I used the word “mock” earlier, but didn’t mean to imply a specific kind of test double — any chance that caused some misunderstanding?

Internal test doubles let us test specific behavior without pulling in all underlying dependencies. Dismissing them overlooks how they can simplify complex systems and prevent growing dependency chains. Not every project needs them — but when used well, they help many devs better understand system boundaries.

This isn’t to my knowledge a controversial position in the wider testing world. Mockist vs. classicist doesn’t capture all the nuance we’re touching on here, but it hints at the broad and diverse approaches under the test automation umbrella. For better or worse, we’ve got to learn to see eye to eye :joy:

I think we’re talking past each other; again terms are being conflated, if not made more complex.

When I talk about mocking internal code, what typically happens is that the mock changes your code. If you need to modify your business logic in a perverse pursuit of extreme test isolation, then you fundamentally have a bigger problem at hand, which is basically experience. It takes a long time for any new developer to learn how to cut through the pedantic BS, especially when it comes to testing, which is one of the most opinionated discussions you can have.

Regardless, my point is more fundamental, software engineering 101: everything we do is data transformation. input → output.

We don’t need to make it more complex. When we understand this, then we can build beautiful simple systems, which can grow into complex systems that are easier to understand.

This is an interesting topic and something I’ve tried to figure out myself lately as I’m working on an Elixir project at work.
I really dislike mocking too much, and try to keep that for 3rd party APIs that are almost always mocked.

What I’ve seen, though, is long call chains with side effects at the end. That becomes annoying, as it makes all the functions “impure”.

Say you have this (never mind any error handling):

defmodule MessageProcessor do
  def process(message) do
    message
    |> Parser.parse()
    |> MessageHandler.handle()
  end
end

defmodule MessageHandler do
  def handle(message) do
    case verify_message(message) do
      :ok -> Router.route(message)
      {:error, reason} -> {:error, reason}
    end
  end
end

defmodule Router do
  def route(%FooMessage{} = message) do
    ExternalAPI.call(message)
  end

  def route(%BarMessage{} = message) do
    AnotherExternalAPI.call(message)
  end
end

If you test the Router module, it’s quite straightforward. You can update it to inject the external API module, or set a mock globally using Mox.
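
A rough sketch of the “inject the external API module” variant, assuming an extra `apis` argument that the Router above does not have. The struct fields and FakeAPI are hypothetical; ExternalAPI and AnotherExternalAPI are only referenced as defaults and never called here.

```elixir
defmodule FooMessage do
  defstruct [:body]
end

defmodule BarMessage do
  defstruct [:body]
end

defmodule Router do
  # `apis` maps a message kind to the module that receives it. The defaults
  # are the modules named in the example above; tests pass their own.
  @default_apis %{foo: ExternalAPI, bar: AnotherExternalAPI}

  def route(message, apis \\ @default_apis)
  def route(%FooMessage{} = message, apis), do: apis.foo.call(message)
  def route(%BarMessage{} = message, apis), do: apis.bar.call(message)
end

defmodule FakeAPI do
  # Hypothetical test double: echoes the call instead of hitting the network.
  def call(message), do: {:fake_api_called, message}
end

Router.route(%FooMessage{body: "hi"}, %{foo: FakeAPI, bar: FakeAPI})
```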

But, then when you want to test the MessageHandler and MessageProcessor - they also end up calling the 3rd party.
So for MessageProcessor for example, your alternatives are:

  1. Write a behaviour for the MessageHandler module and inject a mock. Don’t like mocking internal modules…
  2. If the 3rd party API mocks are set globally, you need to add expect(...) calls in the MessageProcessor tests. Not sure if that makes sense. You’d do the same for MessageHandler tests…
  3. If the 3rd party API mocks are not set globally, you need to inject them all the way from MessageProcessor → MessageHandler → Router. Horrible :slight_smile:
  4. Rewrite the code.

Lately I’ve been leaning more to 4 when I see code like this, trying to isolate the side effects.
Something like this:

defmodule MessageProcessor do
  # Orchestrates the pure steps, then performs the single side effect at the end.
  def process(message) do
    with {:ok, parsed_message} <- Parser.parse(message),
         {:ok, verified_message} <- MessageVerifyer.verify(parsed_message),
         {:ok, {module, fun, args}} <- Router.route(verified_message) do
      apply(module, fun, args)
    end
  end
end

defmodule MessageVerifyer do
  # Pure: returns the message or an error without touching the outside world.
  def verify(message) do
    if good?(message) do
      {:ok, message}
    else
      {:error, :bad_message}
    end
  end
end

defmodule Router do
  # Pure: describes the call as {module, function, args} instead of making it.
  def route(%FooMessage{} = message) do
    {:ok, {ExternalApi, :call, [message]}}
  end

  def route(%BarMessage{} = message) do
    {:ok, {AnotherExternalApi, :call, [message]}}
  end
end

Now the MessageVerifyer and Router have no side-effects and can easily be tested. The MessageProcessor would not be “unit tested” but only integration tested (or whatever you want to call it), and I would mock the 3rd party API calls with either a global Mox mock or using Bypass or something. Since the MessageProcessor basically only “integrates” different parts of your codebase, a “unit test” doesn’t make much sense.
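
A sketch of what testing the side-effect-free Router could look like. It assumes route/1 returns {:ok, {module, function, args}}, which is the shape the `with` in MessageProcessor matches on; the struct fields are made up. ExternalApi is never called, so it does not even need to exist for these tests.

```elixir
ExUnit.start()

defmodule FooMessage do
  defstruct [:body]
end

defmodule BarMessage do
  defstruct [:body]
end

defmodule Router do
  # Returns a description of the call instead of performing it.
  def route(%FooMessage{} = message), do: {:ok, {ExternalApi, :call, [message]}}
  def route(%BarMessage{} = message), do: {:ok, {AnotherExternalApi, :call, [message]}}
end

defmodule RouterTest do
  use ExUnit.Case, async: true

  test "routes a FooMessage to ExternalApi without calling it" do
    message = %FooMessage{body: "hello"}
    assert Router.route(message) == {:ok, {ExternalApi, :call, [message]}}
  end

  test "routes a BarMessage to AnotherExternalApi without calling it" do
    message = %BarMessage{body: "hello"}
    assert Router.route(message) == {:ok, {AnotherExternalApi, :call, [message]}}
  end
end
```

No mocks, behaviours or config are involved, and the tests can run with `async: true` because nothing global is touched.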

I feel like approaches like this help a lot with dealing with (not) mocking etc. But I don’t have that much experience with bigger Elixir projects yet, so we’ll see.

What do you think about that?

That approach is also what is presented in this great talk: Boundaries

I personally like this approach quite a lot.

Right, yeah that is a great talk :+1:

Which is much better achieved by reducing the size of monster functions or modules.

Mocking should be used only for super complex external systems your app depends on. Mocking those helps you ascertain with some degree of certainty (far from 100%) that your code acts adequately to what the external system is doing.

If you have to mock code that is under your control then you’re taking the easy way out now that will introduce more work in the (likely very near) future.

Also known as tech debt.

There are no “camps” here. I strongly object to them existing. It’s simply choosing your poison / tradeoffs. That’s not a “camp”.

I like a lot of the thinking here - and @Linuus, your focus on isolating side effects to avoid excessive mocking is spot on. I’d just add that the top-level orchestrator in your solution is itself part of the broader conversation around dependency injection and test doubles, which is what my blog tries to unpack.

Here’s a variant of your setup to illustrate where I’m coming from:

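defmodule Warehouse do
  # The struct definition is assumed here so that %Warehouse{} matches below.
  defstruct inventories: []

  def empty?(%Warehouse{inventories: inventories}) do
    inventories
    |> Enum.map(&Inventory.all_products/1)
    |> List.flatten()
    |> Enum.empty?()
  end
end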

In this version, both Warehouse and Inventory are complex internal domains (backed by DBs and external systems), and under our control. The idea is that querying a warehouse necessarily means querying its inventories. This mirrors your example, but focuses on two internal systems.

We could decouple this with an orchestrator (e.g. WarehouseQuerier, like your MessageProcessor), but as long as the systems depend on each other, the conversation around dependency injection becomes relevant. And if we stick to only mocking external systems, testing this orchestrator would mean mocking the DB and APIs behind both Warehouse and Inventory.

That can bring real cost in the form of:

  • A lot of setup, most of which is already handled in each domain’s own tests, so it would duplicate those efforts.
  • Domain knowledge of both systems, just to test their integration.

At some point along that spectrum, I’m happy to discuss the tradeoff of using simple, high-fidelity test doubles for internal systems, if it makes the tests clearer and easier to work with.
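
To show what I mean by a simple, high-fidelity test double for an internal system, here is one possible sketch. The `inventory_mod` argument, the struct fields, and the two double modules are assumptions made for this example; the Inventory module below is only a stand-in for the real, DB-backed one.

```elixir
defmodule Inventory do
  defstruct products: []

  # Stand-in for the real, DB-backed lookup.
  def all_products(%Inventory{products: products}), do: products
end

defmodule Warehouse do
  defstruct inventories: []

  # `inventory_mod` defaults to the real module, so production callers are
  # unchanged; a test can pass a double instead.
  def empty?(%Warehouse{inventories: inventories}, inventory_mod \\ Inventory) do
    inventories
    |> Enum.flat_map(fn inventory -> inventory_mod.all_products(inventory) end)
    |> Enum.empty?()
  end
end

defmodule EmptyInventory do
  # Hand-written double: every inventory reports no products.
  def all_products(_inventory), do: []
end

defmodule StockedInventory do
  # Hand-written double: every inventory reports one product.
  def all_products(_inventory), do: [:widget]
end

# With doubles, the inventories can be plain placeholders.
warehouse = %Warehouse{inventories: [:north, :south]}
Warehouse.empty?(warehouse, EmptyInventory)
Warehouse.empty?(warehouse, StockedInventory)
```

The test for Warehouse.empty?/2 then needs no database rows or API stubs for Inventory, at the cost of trusting that the doubles behave like the real module.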

The idea of strictly banning test doubles except at fully external boundaries can just as well introduce new complexities. Two points in particular:

  • The “only ever mock 3rd-party code” rule lacks nuance. Over-mocking is bad - but under-mocking is also a problem.
  • It assumes developers can always manage deep cross-domain integrations in tests, and I’ve met developers who are so smart they can’t see that. They don’t see the signs of other developers struggling. It is a great kindness to accept a little mocking if it goes a long way to reduce the cognitive burden.

I like this point, especially in how it frames the issue in terms of time. Test doubles can be a temporary measure - used while waiting for a better design to emerge. Or one might deliberately refactor toward test doubles, deciding they’re simpler than wrestling with domain-heavy test setups.

I think I’m putting words in your mouth though - we likely disagree on when test doubles are appropriate - but I appreciate the idea that test strategy is part of the evolving design, not separate from it.

That’s a helpful clarification, thank you. I don’t associate “camp” at all with your reaction, but I see your point. “Tradeoffs” works for me :+1:

That’s not what I said though :slight_smile: I said I try to keep mocks for calls to 3rd party API’s. What I meant here is 3rd party systems that I have no control over or are annoying to test against (slow, async conflicts, can’t run locally…). I’ve seen code that tries to juggle globally configured Mox mocks for all kinds of dependencies, both purely functional internal code as well as external. And if you need to write behaviours etc for all that just for tests, it gets messy real fast and you have no idea what code is actually running during your tests.

Also, this is in no way a strict rule. Sometimes I inject a simple function if it makes sense (usually at some boundary) or a mock, but first I want to try and isolate side-effects as I showed above. Functional dependencies are not that bad when testing.
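
Roughly what I mean by injecting a simple function at a boundary. Notifier and its default sender are invented for this sketch, so treat it as one possible shape rather than a recipe.

```elixir
defmodule Notifier do
  # `send_fun` is the injected boundary. The default is only a placeholder
  # for whatever really delivers the message.
  def notify(user, text, send_fun \\ &default_send/2) do
    message = "Hi #{user}, #{text}"
    send_fun.(user, message)
  end

  defp default_send(_user, _message), do: {:error, :no_transport_configured}
end

# In a test, pass a function that reports back to the test process.
test_pid = self()

Notifier.notify("ada", "your order shipped", fn user, message ->
  send(test_pid, {:sent, user, message})
  :ok
end)
```

No behaviour or global config is needed, and the formatting logic in notify/3 is still the real code.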

This screenshot is from the Elixir testing book (I hope it’s ok that I share it)

And, of course there are exceptions and tradeoffs as always. :slight_smile: But my main point is that I think the default reaction should not be to reach for and inject mocks all over the place.

Oh absolutely, I generalized a bit there sorry :slight_smile:

And great posts, thanks.

I don’t know anyone who would consider setting up fixtures in your database to be a mock. The Ecto sandbox does not mock the database; the transactions are as real as it gets. This is what I mean about conflating terms.

Traditionally, when we talk about mocks, we’re talking about changing code. That is why there is, and always will be, so much contention around them, since they literally obfuscate what really goes on. It is far better to set up a database with fixtures or a web server with the API that you are testing against. Don’t change your implementation; focus on changing the input.
