Use factories and not fixtures
Gods, I cannot even express how much I disagree with it.
Please try; otherwise your comment is nothing but negative energy.
Factories hide a lot of complexity and make it really easy for you to test toward positive cases (they also typically use the same flows that you should be testing). With fixtures, you can pretty much ensure that you have data that looks a lot more like what you’re going to see in real data. Factories are far more likely to result in subtle logic bugs than fixtures, because they’re typically not just applied data.
I haven’t found a good way to do fixtures in Elixir and have been tempted more than once to try to port Rails fixtures over for Ecto (but don’t have the time), but I enforced the use of fixtures over factories on my last two Rails projects and we had more tests that ran faster than the last time I worked with a Rails project that used factories. The factories themselves introduced a 30%+ slowdown in test running.
I’m using simplified factories in my current Elixir projects, but I hate them. They have been the source of a half-dozen test bugs across the projects, and result in more churn than I would like.
In my previous Rails projects, when I needed to test a particular set of “empty” table behaviours, I would just truncate the table in question and build from there. But most of the time, an empty table is something your code only sees the first few times it runs in the real world.
Interesting. Could you maybe give a code example of what the difference is exactly between a fixture and a factory and why fixtures are better?
I’ve been doing some quick googling, but the results all point in the direction that factories are better because fixtures hide the data you are using in the tests.
Am I misunderstanding fixtures? Usually I have something like this:
test/support/my_data_fixture.ex
defmodule MyAppTest.MyDataFixture do
  @default_fields [foo: "bar", baz: nil]

  def new(supplied_fields) do
    # Defaults go first so that caller-supplied fields override them.
    fields = Keyword.merge(@default_fields, supplied_fields)

    MyData
    |> struct(fields)
    |> apply_to_db_or_other_state()
  end
end
Interesting. Could you maybe give a code example of what the difference is exactly between a fixture and a factory and why fixtures are better?
It’s hard, because part of the point is that fixtures aren’t code. Fixtures are a baseline set of data that covers a large portion of your tests for the unit under test. If a particular unit could benefit from similar but not quite the same data, you can manipulate the fixture data (after it’s loaded) in the test (or test set, e.g., a describe block).
With Rails fixtures, you define the data as YAML files, which the FixtureSet code turns into SQL inserts into the database without using your model’s new/save functions. (It uses your model to help determine relations, but it sidesteps all of the validation and other business logic.) Because it’s a YAML file, you might have something like:
# users.yml
luke:
  first_name: Luke
  last_name: Skywalker
  title: Jedi Knight
You can refer to this fixture as users(:luke), which is pretty much the same as create(:user, :luke) would be in one of the factory providers…without any hidden behaviour that might be present in the code behind either the user or user_luke factories.
There’s more to it than that: fixture YAML files are parsed through ERB prior to processing, which means you can use loops to generate large amounts of fixture data if you need to.
Essentially, though, the point is that it’s named (mostly) static data that is loaded through direct interfaces rather than through your application code in the first place.
I’ve been doing some quick googling, but the results all point in the direction that factories are better because fixtures hide the data you are using in the tests.
Yeah. That’s nonsense promulgated by people who don’t understand fixture data and the value of having a small but reasonable set of data loaded quickly into your database. Let’s also be clear, betterspecs is also lying about what a fixture is in the example provided. Look at the linked issue, and you’ll see a ton of discussion about it in the Rails context, and proper use of fixtures looks nothing like the example provided on the betterspecs page.
I’m talking about database fixtures here, but fixtures are often used by the very people who deride them in contexts other than the database: VCR files or JSON response files that represent a payload…that’s fixtures.
With fixtures, you understand your data much better because you have to think about it in terms of the data, not in terms of your objects (don’t get me started on a rant about OO modelling vs data modelling and why you can’t do the former if you don’t understand the latter).
That’s basically a factory. A simple factory (the one that I am using in my current Elixir codebase is more complex, but not that much more complex), but it’s not a fixture. In database terms, you’d create some SQL statements to seed your test data (usually before your test transactions start, but in Ecto you’d need to run that in the same process, so…) and then manipulate the records that are present to get them into the shape you need for a particular test—but the default set of data loaded would cover > 80% of your tests (no, really, and it would only take 2–3 records per table to do that most of the time).
Edit: deleted, since I didn’t see @halostatue’s response above the response to mine.
I did some reading on fixtures and I think it wouldn’t be so hard to write something.
test/support/fixture.ex
defmodule MyAppTest.MyFixture do
  require EEx

  # Compile the EEx-templated YAML file into a private to_rows/1 function.
  file = Path.join(__DIR__, "fixtures/my_fixture.yaml")
  EEx.function_from_file(:defp, :to_rows, file, [:assigns])

  # The generator receives a zero-based index, hence non_neg_integer.
  @spec load(pos_integer, (non_neg_integer -> %{assigns: map})) :: :ok
  def load(count, generator) do
    0..(count - 1)
    |> Enum.flat_map(fn idx ->
      idx
      |> generator.()
      |> to_rows()
      |> Yaml.from_string() # this is not the correct function name, but I don't usually use yaml.
    end)
    |> Enum.each(&add_to_database/1)
  end

  defp add_to_database(map) do
    ...
  end
end
Another argument in favour of fixtures over factories is that they are much faster: you don’t need to insert a bunch of data into the database at the start of each test, because it is inserted once at the start of the suite.
edit: Oh! I now see @halostatue already covered that!
I’m not sure if trading the need to handle cleanup / setup each time is worth the time saved by not preparing data for each test individually.
There’s no cleanup to do; Ecto handles that automatically, the same way as with factories.
You don’t need to write any setup or teardown code with fixtures, while with factories you write the setup.
Something like that might work, yes. There’s a bit more work to it than just that, because Rails fixtures also give you a way to refer to the records from your tests. I’ve got some ideas, but no time to actually build this out, but I think it would entirely be possible to get something like Rails fixtures working with Ecto.
I’d probably avoid using YAML (to avoid an unusual dependency), but don’t have a better format offhand (JSON5 would be useful, but I don’t think that Jason can parse JSON5) except maybe .exs files that are supposed to produce an array of maps.
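For what it's worth, the .exs idea needs nothing beyond the standard library. Here is a minimal sketch, assuming the fixture file's last expression is a list of maps. The file name, the temp-dir location, and the second record are made up for illustration; in a real project the file would presumably live somewhere like test/fixtures.

```elixir
# Hypothetical fixture file contents: plain Elixir whose last expression
# is a list of maps.
fixture_source = """
[
  %{name: :luke, first_name: "Luke", last_name: "Skywalker", title: "Jedi Knight"},
  %{name: :leia, first_name: "Leia", last_name: "Organa", title: "Senator"}
]
"""

# Written to a temp file only so that this sketch is self-contained.
path = Path.join(System.tmp_dir!(), "users_fixture_sketch.exs")
File.write!(path, fixture_source)

# Code.eval_file/1 returns {value_of_last_expression, bindings}.
{rows, _bindings} = Code.eval_file(path)

File.rm!(path)
```

The rows are ordinary maps, so they could be handed to whatever does the actual insert. The obvious trade-off is that an .exs file is code again, so keeping it to literal data takes some discipline.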
That does setup for each test though. Not once for the whole test suite run.
Depends on where you call it. It’s in test/support, indicating that it’s compiled before test_helper.exs, so you could call it there. The fun is just there for a wee bit of flexibility. You’d be expected to have more than one object in there (as indicated by the use of flat_map).
The thing is, it’s ultimately not an either/or.
In some cases, it’s better to use direct configuration—usually when you’re testing what happens when your table is empty or you want it empty to make it easier to reason about what changed. When you’re using fixtures, you’d truncate the table(s) in question and create new database records as you are currently talking about. It’s an extra step, but a small one.
I’ve got code that requires that data in ~6 tables are set up with the correct relationships, and I run ~10 tests on that code. With fixtures that data is configured once and is then available for all ten tests. With per-test configuration, most of your test is configuration, not assertion. (Yes, you can do that in a setup block in describe; there are cases when you want that data available for tests that don’t fit into that describe without repeating yourself.)
You can do the same with factories, but factories hide the complexity behind typically one function call (e.g., if you need to set up a user, credentials, and profile record for each user, your create(:user) factory function might actually create three records behind the scenes and you don’t really know/remember).
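To make the hidden-records point concrete, here is a toy sketch. FactorySketch, its create/2 signature, and the insert callback are all hypothetical and do not come from any real factory library; the callback stands in for a repo so the writes can be counted.

```elixir
defmodule FactorySketch do
  # Hypothetical factory: one call, three writes.
  def create(:user, insert) when is_function(insert, 2) do
    user = insert.("users", %{id: 1, first_name: "Luke", last_name: "Skywalker"})
    insert.("credentials", %{user_id: user.id, password_hash: "placeholder"})
    insert.("profiles", %{user_id: user.id, title: "Jedi Knight"})
    user
  end
end

# Stand-in for a repo: record each "insert" as a message to the current
# process so the writes can be inspected afterwards.
insert = fn table, row ->
  send(self(), {:inserted, table})
  row
end

user = FactorySketch.create(:user, insert)

# Collect the table names that were written to, in order.
{:messages, messages} = Process.info(self(), :messages)
tables = for {:inserted, table} <- messages, do: table
```

The caller only ever sees the user map; the credentials and profile writes are visible only because the stand-in insert function records them.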
Done properly, fixtures let you set up a minimal amount of meaningful data with no ceremony and explicit configuration. They usually reflect far better the state your application’s database will be in most of the time (because your database isn’t going to be empty for long).
Pure functions can and should be tested without touching the database. Using fixtures when you have any complexity to your data at all is going to be far easier to reason about than factories or per-test (or per-group) setup.
That’s right, all tests start with the same dataset, which is inserted at the start of the suite. Ecto rolls back changes after each test; it does not truncate the database.
If I’m reading everything correctly it seems that the only real difference between factories and fixtures is that fixtures are just the “real data” and factories go through some domain logic.
The other differences mentioned above seem to be things you can do with either kind of data creation.
For me this means that you probably want to use both factories and fixtures depending on what kind of test you’re writing. A factory adds more complexity because you go through the domain logic, but this prevents you from creating invalid data. A fixture is more standalone, so easier to reason about, but you have the risk that you have incorrect fixtures.
I will add some cents to the discussion that I haven’t seen mentioned yet. Below, I will be talking about database-backed fixtures (and not necessarily fixtures as a whole):
One of the downsides of fixtures is that it is shared data across all of your tests. So sometimes you will change your fixtures, because you need new data to be used in some new tests, and other tests may now fail. This gets worse if projects define a large amount of fixture data. The correct approach, as mentioned earlier, is to define fixtures for a basic feature set that will be shared across all tests.
The particular fixtures implementation in Rails caused some issues because referential integrity and data validations are disabled or not really used in Rails, so you could easily end up with invalid data or data that would never exist in the database through the regular application workload.
It is actually super straightforward to have fixtures in Ecto: just write to the database in your test_helper.exs before you start the SQL sandbox.
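A rough sketch of what that could look like, assuming a repo module named MyApp.Repo and a users table with these three columns (both are placeholders, not from the thread). It only runs inside a project with Ecto configured to use the sandbox pool in the test environment, and tables with required timestamp columns would need those values supplied too.

```elixir
# test/test_helper.exs (sketch; MyApp.Repo and the "users" table are assumed names)

# Rows written at this point are not rolled back by a test, so clear anything
# left over from a previous run first.
MyApp.Repo.delete_all("users")

# Schemaless insert: rows go straight into the table, bypassing changesets
# and validations.
MyApp.Repo.insert_all("users", [
  %{first_name: "Luke", last_name: "Skywalker", title: "Jedi Knight"}
])

ExUnit.start()

# From here on, each test checks out its own sandboxed connection, and its
# changes are rolled back when the test ends.
Ecto.Adapters.SQL.Sandbox.mode(MyApp.Repo, :manual)
```

Because insert_all skips changesets, the database constraints are the only thing guarding the fixture data, which ties back to the referential integrity point above.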
Finally, a summary that was given to me a long time ago that explains when to use fixtures vs factories well is:
In my opinion, factories are actually easier to get started as they require less discipline (with bigger costs in the long term), while fixtures are really easy to mess up at the beginning (but pay off if well-structured).
Spot on, Jose. I really came to love the Rails fixtures for their speed, and the referential integrity stuff was only slightly annoying for me (I always use FKs at the database level, so the only time we ran into any sort of problem with this was with polymorphic relations, which we used factory-like functions for).
If I ever make an Ecto.Fixtures-type library, it’ll be set up to work against the database tables directly as opposed to Ecto schema modules. That would force you to think in terms of your underlying data model. The hard part is going to be, as you say, the referential integrity part, and then figuring out a way to make sure that the inserted data is readily available using similar naming conventions as the Rails ones. Maybe an Agent that stores {table, name} => pk values so that you could say something like Repo.get(User, Ecto.Fixtures.id("users", "luke")).
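The Agent part of that idea is small enough to sketch. FixtureRegistry and its functions are hypothetical names (no such library exists as far as this thread says), and a real version would presumably be filled in by whatever loads the fixture rows rather than by hand.

```elixir
defmodule FixtureRegistry do
  # Hypothetical module: maps {table, name} to the primary key of a loaded fixture.
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Record the primary key under which a named fixture row was inserted.
  def put(table, name, pk) do
    Agent.update(__MODULE__, &Map.put(&1, {table, name}, pk))
  end

  # Look up the primary key; raise in the caller (not in the Agent process)
  # when the fixture name is unknown.
  def id(table, name) do
    case Agent.get(__MODULE__, &Map.fetch(&1, {table, name})) do
      {:ok, pk} -> pk
      :error -> raise ArgumentError, "unknown fixture #{inspect({table, name})}"
    end
  end
end

{:ok, _pid} = FixtureRegistry.start_link()
FixtureRegistry.put("users", "luke", 1)

# In a test this would feed something like Repo.get(User, luke_id).
luke_id = FixtureRegistry.id("users", "luke")
```

The lookup fails loudly on a typo in the fixture name, which is roughly how users(:luke) behaves in Rails when the fixture does not exist.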
But that’s for a time when I, well, actually have time.