How to speed up the start of `mix test`?

I work on a very large umbrella project. I did this config in my web app:

config :logger,
  level: String.to_atom(System.get_env("LOGGER_LEVEL") || "info"),
  handle_otp_reports: true,
  handle_sasl_reports: true
config :my_web, MyWeb.Gettext, one_module_per_locale: true, allowed_locales: ["en"]

But I found that when I ran the tests for a single file, it took almost 5 seconds before my gettext app started:

These are the top logs when I run `mix test filename.exs` at 11:10:30:

11:10:35.496 [info]  Child Gettext.ExtractorAgent of Supervisor #PID<0.4876.0> (Supervisor.Default) started
Pid: #PID<0.4877.0>
Start Call: Gettext.ExtractorAgent.start_link([])
Restart: :permanent
Shutdown: 5000
Type: :worker

Then I removed the gettext dependency, but it didn't solve the problem.

These are the top logs when I run `mix test filename.exs` at 12:37:00, after removing the gettext dependency:

2021-10-31 12:37:04.800 [info] module=application_controller function=info_started/2 line=2089  Application logger started at :nonode@nohost
2021-10-31 12:37:04.844 [info] module=supervisor function=report_progress/2 line=1546  Child :ttb_autostart of Supervisor :runtime_tools_sup started
Pid: #PID<0.4893.0>

References

You can run MIX_DEBUG=1 mix test and it will show you the timing to run tasks in recent Elixir versions. If you are on Elixir v1.11 or later and you have a Phoenix application running on latest v1.5 or v1.6, you can remove the :phoenix compiler in the :compilers option of your mix.exs. That should speed it up by 2-3s in large apps.
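For reference, that change is a one-line edit in `mix.exs` (a sketch; the app name, versions, and the exact original `compilers` line are placeholders and depend on what your generator produced):

```elixir
# mix.exs (sketch): dropping :phoenix from :compilers on Phoenix v1.5/v1.6
def project do
  [
    app: :my_web,
    version: "0.1.0",
    elixir: "~> 1.12",
    # before: compilers: [:phoenix, :gettext] ++ Mix.compilers(),
    compilers: Mix.compilers(),
    deps: deps()
  ]
end
```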

9 Likes

When I set MIX_DEBUG=1, I finally found that the time was mainly spent on loadpaths:

==> my_web
<- Ran mix deps.loadpaths in 2828ms
<- Ran mix loadpaths in 2830ms

What is your Elixir version? We have recently sped it up, make sure to try the latest v1.12.

Yes, I'm already on the latest version:

$ asdf list elixir
  1.12.3-otp-24
  ref:v1.12.3
$ asdf list erlang
  24.0.6

and I found this answer: `mix` even more sluggish with Elixir 1.12.1

Maybe this problem is not easy to solve; I'm looking forward to Elixir v1.13.

The fixes in that thread are already in place. Can you please post the full output of your mix command with the debug flag here?

1 Like

I'm finding the same thing in an Absinthe app I have (with no :phoenix in the :compilers option).

It's an umbrella app that consistently takes ~5 seconds to actually run the tests, even when no compilation happens. In the output, what I see is:

...
<- Ran mix compile.app in 1ms
<- Ran mix compile.all in 15ms
<- Ran mix compile in 597ms
-> Running mix app.start /Users/Adz/Projects/hungry_monsters/hungry_monsters/apps/graphql/test/children_with_sensitivities/update_test.exs (inside Graphql.MixProject)
-> Running mix app.config /Users/Adz/Projects/hungry_monsters/hungry_monsters/apps/graphql/test/children_with_sensitivities/update_test.exs (inside Graphql.MixProject)
<- Ran mix app.config in 0ms
<- Ran mix app.start in 843ms
............

Finished in 2.2 seconds (2.2s async, 0.00s sync)
12 tests, 0 failures

Randomized with seed 628530
<- Ran mix test in 7238ms

But everything stops at this line for about 5 seconds:

<- Ran mix app.start in 843ms
^^^ Line prints then long pause before test runs ^^^

Interestingly:

āÆ time mix run -e ""
mix run -e ""  1.31s user 1.08s system 138% cpu 1.726 total

so that is quicker

What is the cost of running a single small test file?

If I run just one single test in that file:

time MIX_DEBUG=1 mt /path/to/my/test/file/update_test.exs:8 --trace

I get

<- Ran mix app.config in 0ms
<- Ran mix app.start in 759ms
Excluding tags: [:test]
Including tags: [line: "8"]

  ....
  * test Updating a child We can update a childs name and remove a sensitivity (615.7ms) [L#8]

Finished in 0.8 seconds (0.8s async, 0.00s sync)
12 tests, 0 failures, 11 excluded

Randomized with seed 663565
<- Ran mix test in 7193ms
MIX_DEBUG=1 mix test  --trace  2.01s user 1.47s system 45% cpu 7.688 total

So the test runs in 615.7ms

So my best guess would be loading and executing test_helper.exs. You can wrap it in a :timer.tc call and let us know how it goes. If it is not the test_helper, then it is something internal to Mix/ExUnit, and it would require further debugging.
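A minimal way to do that timing (a sketch, assuming a typical `test/test_helper.exs`):

```elixir
# test/test_helper.exs (sketch): time everything the helper does
{micros, :ok} =
  :timer.tc(fn ->
    ExUnit.start()
    # ...any other setup the helper performs...
    :ok
  end)

IO.puts("test_helper.exs took #{div(micros, 1000)}ms")
```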

1 Like

I have a curious question: even for a brand-new Phoenix project, the start-up time for testing is long (compared to other languages). What is the reason for this?

I created the project with the following command

$ mix phx.new demo
$ cd demo
$ time mix test 

result:

~/Code/demo āÆ time mix test                                                                       īž± 25.1.2 ī˜­ 1.14.0-rc.1-otp-25
test helper: {4088, :ok}
...
Finished in 0.07 seconds (0.03s async, 0.04s sync)
3 tests, 0 failures

Randomized with seed 621279
mix test  0.79s user 0.67s system 89% cpu 1.648 total

~/Code/demo āÆ time MIX_DEBUG=1 mix test                                                           īž± 25.1.2 ī˜­ 1.14.0-rc.1-otp-25
-> Running mix loadconfig (inside Demo.MixProject)
<- Ran mix loadconfig in 6ms
-> Running mix deps.loadpaths (inside Demo.MixProject)
<- Ran mix deps.loadpaths in 163ms
-> Running mix ecto.create --quiet (inside Demo.MixProject)
-> Running mix app.config --quiet (inside Demo.MixProject)
-> Running mix compile --quiet (inside Demo.MixProject)
-> Running mix loadpaths --quiet (inside Demo.MixProject)
-> Running mix archive.check --quiet (inside Demo.MixProject)
<- Ran mix archive.check in 0ms
<- Ran mix loadpaths in 1ms
-> Running mix compile.all --quiet (inside Demo.MixProject)
-> Running mix compile.yecc --quiet (inside Demo.MixProject)
<- Ran mix compile.yecc in 3ms
-> Running mix compile.leex --quiet (inside Demo.MixProject)
<- Ran mix compile.leex in 0ms
-> Running mix compile.erlang --quiet (inside Demo.MixProject)
<- Ran mix compile.erlang in 0ms
-> Running mix compile.elixir --quiet (inside Demo.MixProject)
<- Ran mix compile.elixir in 27ms
-> Running mix compile.app --quiet (inside Demo.MixProject)
<- Ran mix compile.app in 1ms
<- Ran mix compile.all in 73ms
<- Ran mix compile in 81ms
<- Ran mix app.config in 88ms
<- Ran mix ecto.create in 170ms
-> Running mix ecto.migrate --quiet (inside Demo.MixProject)
<- Ran mix ecto.migrate in 40ms
-> Running mix test (inside Demo.MixProject)
-> Running mix app.start (inside Demo.MixProject)
<- Ran mix app.start in 113ms
test helper: {3933, :ok}
...
Finished in 0.08 seconds (0.03s async, 0.05s sync)
3 tests, 0 failures

Randomized with seed 397072
<- Ran mix test in 251ms
MIX_DEBUG=1 mix test  0.79s user 0.66s system 69% cpu 2.100 total

You will see that the total time taken is 1.648 seconds. Inside the test helper it looks like this:

:timer.tc(fn ->
  ExUnit.start()
  Ecto.Adapters.SQL.Sandbox.mode(Demo.Repo, :manual)
end) |> IO.inspect(label: "test helper")

System information:

  • M1 Max, macOS 12.6.1
  • erlang 25.1.2
  • elixir 1.14.1-otp25

Looking at the debug report, most of the time is spent on:

  1. Loading dependencies
  2. Creating the database
  3. Migrating the database

Do you have a similar stack in another language, with a dozen dependencies plus database creation and migration, that runs considerably faster?

Sorry, Jose, my expression may have been a bit ambiguous: this startup speed is acceptable given this bunch of dependency frameworks. Thanks to your work and the community, I haven't used a web framework in another language for many years.

But I was actually comparing against the speed of unit testing in other languages, because our business domain has a lot of unit tests, and I like them myself; unit tests should have very few dependencies and should not depend on the DB.

I wonder if there is any way to make the startup of my unit tests (`use ExUnit.Case, async: true`) faster as well?

2 Likes

You could move a few of the tasks in your list (currently done every time) to only happen on demand: e.g. pass --no-start, remove the test alias doing DB stuff, … plus add the related code to trigger them when you actually need them. That will make your test setup more complex, though.
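For instance (a sketch; `test.setup` is a hypothetical alias name, and the "before" line mirrors what Phoenix generators typically produce):

```elixir
# mix.exs (sketch): keep `mix test` lean; run DB setup only on demand
defp aliases do
  [
    # before: "test": ["ecto.create --quiet", "ecto.migrate --quiet", "test"]
    # now there is no "test" alias, so plain `mix test` skips DB setup;
    # run `mix test.setup` manually whenever the schema changes.
    "test.setup": ["ecto.create --quiet", "ecto.migrate --quiet"]
  ]
end
```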

Depending on how isolated this "few dependencies, no DB" code is, you can also consider pulling it out into a separate mix project, which doesn't even have the overhead of "everything" in the first place.

1 Like

Another possible option is to run the tests in `MIX_ENV=test iex -S mix`, which I saw Bruce asked about as well.

As you can see in the image below, in a project generated with `mix new elixir_demo`, running tests in iex is about 100 times faster than running `mix test` from the command line, because the app is not started again.
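In practice, running a test file from inside `MIX_ENV=test iex -S mix` looks roughly like this (a sketch; the file path is a placeholder):

```elixir
# Inside MIX_ENV=test iex -S mix (sketch)
ExUnit.start(auto_run: false)             # don't wait for VM exit to run
Code.compile_file("test/my_app_test.exs") # load (or reload) the test file
ExUnit.run()                              # run the loaded test modules
```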

If you think this is a possibility, can you tell me some caveats? Daniel Olshansky discusses some options in this article, but I'm concerned that the recompile approach to the problem of caching test results may be flawed.

Currently, I mainly use neotest to run tests in Neovim (it currently calls mix test). This gif shows Lua testing a block on the left and Elixir testing a block on the right. In the future, if we can also run tests in iex, the right side will be as fast as the left.

1 Like

Ah, that makes sense. You can create different test environments if you want. ecto_sql is such an example: GitHub - elixir-ecto/ecto_sql: SQL-based adapters for Ecto and database migrations

Two test directories, with different types of tests and different dependencies. :slight_smile:
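One way to wire up separate suites (a sketch; the `TEST_SUITE` variable and directory names are illustrative, though `test_paths` is a real Mix project option and ecto_sql uses a similar layout for its integration tests):

```elixir
# mix.exs (sketch): pick the test suite via an env var
def project do
  [
    app: :my_app,
    version: "0.1.0",
    test_paths: test_paths(System.get_env("TEST_SUITE")),
    deps: deps()
  ]
end

# `mix test` runs the fast unit suite; `TEST_SUITE=integration mix test`
# runs the slower DB-backed suite from its own directory.
defp test_paths("integration"), do: ["integration_test"]
defp test_paths(_), do: ["test"]
```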

1 Like

Thanks, Jose, I'll explore this option further. Also, how about testing in `MIX_ENV=test iex -S mix`, which I mentioned above? How to speed up the start of `mix test`? - #15 by scottming

If it works (after solving the problem of caching test results), then it means I can solve these startup problems at the tool and plugin level (neotest for nvim, or VS Code).

FYI, I have found a solution to this problem by working at the editor plugin level. My tests for complex projects that used to take more than 10 seconds now take less than 1 second to run in iex, and those for simple projects usually take only 0.2 seconds. This speed makes for an excellent experience.

I have written my own plugin, Elixir Test in IEx - Visual Studio Marketplace, which is very handy for VS Code users.

Jhon Pedroza is working on the nvim neotest side of the integration: IEx strategy by jfpedroza · Pull Request #14 · jfpedroza/neotest-elixir · GitHub; and if you want to quickly experience the vscode-like effect, you can also use these simple command definitions:

-- assumes toggleterm.nvim; `toggleterm` and `ttt` are its modules
local toggleterm = require("toggleterm")
local ttt = require("toggleterm.terminal")

vim.api.nvim_create_user_command("TestIexStart", function()
	local code = 'Code.eval_file("~/.test_iex/lib/test_iex.ex");TestIex.start()'
	toggleterm.exec(string.format("MIX_ENV=test iex --no-pry -S mix run -e %q", code), 1)
	ttt.get_or_create_term(1):close()
end, {})

vim.api.nvim_create_user_command("TestFileAtCursorInIex", function()
	local line_col = vim.api.nvim_win_get_cursor(0)[1]
	local path = vim.fn.expand("%")
	local test_command = string.format("TestIex.test(%q, %q)", path, line_col)
	if vim.endswith(path, ".exs") then
		toggleterm.exec(test_command, 1)
		vim.g.last_test_in_iex_command = test_command
	else
		toggleterm.exec(vim.g.last_test_in_iex_command, 1)
	end
end, {})

vim.api.nvim_create_user_command("TestFileInIex", function()
	local path = vim.fn.expand("%")
	local test_command = string.format("TestIex.test(%q)", path)
	if vim.endswith(path, ".exs") then
		toggleterm.exec(test_command, 1)
		vim.g.last_test_in_iex_command = test_command
	else
		toggleterm.exec(vim.g.last_test_in_iex_command, 1)
	end
end, {})

4 Likes

Hey, years late to the party, but I've been seeing a strange lag between the mix task starting and the tests starting, and I can't determine what is causing it! It is especially prominent in our GitHub Actions CI.

The test suite takes 7 minutes (420s): 190s running tests and 120s unaccounted for! (The remainder is running test seeds and migrations.)

Running with MIX_DEBUG=1 and wrapping many things in :timer.tc, I've narrowed it down, but I can't tell what the cause is.

The big problem is that once the mix test task runs, we don't see any logging for 1m40s! Then we get 4 DB connection errors, then 20 more seconds, and the tests start returning.

Context: all of our tests are concurrent (:speak_no_evil:), there are 6K of them, and we have 3 repos running SQL Sandbox. I tried setting unique timeouts on all of our explicit checkout/2 calls, but it still reports the 120000ms default.
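For context, the ownership timeout can normally be set either per checkout or globally in the repo config (a sketch; the repo name and value are illustrative):

```elixir
# Per-test checkout (sketch)
Ecto.Adapters.SQL.Sandbox.checkout(MyApp.Repo, ownership_timeout: 300_000)

# Or globally, in config/test.exs
config :my_app, MyApp.Repo,
  pool: Ecto.Adapters.SQL.Sandbox,
  ownership_timeout: 300_000
```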

I have tons more data, if anything jumps out, please point and I will research!

Fri, 29 Mar 2024 14:28:56 GMT <- Ran mix ecto.migrate in 33ms
Fri, 29 Mar 2024 14:28:56 GMT -> Running mix run priv/repo/test_seeds.exs (inside Crowbar.Mixfile)
Fri, 29 Mar 2024 14:28:56 GMT -> Running mix app.start priv/repo/test_seeds.exs (inside Crowbar.Mixfile)
Fri, 29 Mar 2024 14:28:57 GMT warning: CrowbarWeb.Endpoint.init/2 is deprecated, …
Fri, 29 Mar 2024 14:28:57 GMT Warning:   (phoenix 1.7.11) … a couple more of these
Fri, 29 Mar 2024 14:28:57 GMT <- Ran mix app.start in 1436ms
Fri, 29 Mar 2024 14:28:58 GMT <- Ran mix run in 1784ms
Fri, 29 Mar 2024 14:28:58 GMT -> Running mix run priv/repo2/seeds.exs (inside Crowbar.Mixfile)
Fri, 29 Mar 2024 14:29:17 GMT <- Ran mix run in 19371ms # This is bad, but not our biggest problem :face-palm:
Fri, 29 Mar 2024 14:29:17 GMT -> Running mix test --cover --partitions 2 --max-cases 4 (inside Crowbar.Mixfile)
Fri, 29 Mar 2024 14:30:57 GMT Error: -29T14:30:57.712 [error] Postgrex.Protocol (#PID<0.628.0>) disconnected: ** (DBConnection.ConnectionError) owner #PID<0.726.0> timed out because it owned the connection for longer than 120000ms (set via the :ownership_timeout option)
Fri, 29 Mar 2024 14:30:57 GMT … Exactly 3 more of this error ^ - Very suspicious, I think our code is causing this
Fri, 29 Mar 2024 14:31:19 GMT Excluding tags: [:skip]
Fri, 29 Mar 2024 14:31:32 GMT ................................ more tests complete
    ā€¦
Finished in 195.1 seconds (190.2s async, 4.9s sync)
    66 doctests, 12 properties, 2684 tests, 0 failures, 13 excluded

    Randomized with seed 710739
    ----------------
    Coverage data exported.
  <- Ran mix test in 340632ms
<- Ran mix coveralls.lcov in 361175ms
1 Like

I've seen this kind of "silent hang" in situations where multiple sandboxed tests tried to insert the same value into a column with a unique constraint.

None of the INSERTs can report success or failure until the other one's transaction commits or aborts, so everything comes to a halt.

That usually produced one error per connection in the pool, though. In your case, matching the number of repos suggests something different might be happening. :thinking:
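The scenario described above can be sketched like this (hypothetical schema, module, and case-template names):

```elixir
# Two async tests, each in its own sandboxed transaction (sketch).
# Assume users.email has a unique index.
defmodule MyApp.FirstTest do
  use MyApp.DataCase, async: true

  test "inserts a user" do
    # Holds a pending INSERT of "dup@example.com" until this test's
    # sandbox transaction is rolled back at the end of the test.
    Repo.insert!(%User{email: "dup@example.com"})
  end
end

defmodule MyApp.SecondTest do
  use MyApp.DataCase, async: true

  test "inserts the same user" do
    # Blocks inside Postgres waiting on the other test's row lock:
    # the unique check cannot resolve until that transaction
    # commits or aborts, so neither test makes progress.
    Repo.insert!(%User{email: "dup@example.com"})
  end
end
```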

3 Likes