Does test partitioning speed up tests that use Ecto sandbox shared mode?

I am working on a project that has a lot of E2E/integration tests, which hit the API through the router/Absinthe and access a live database via Ecto sandbox shared mode. Shared mode has the restriction that tests cannot be run in async mode.

So I am wondering whether it’s possible to make the tests finish faster by using test partitioning, spawning a new OS process for each test partition. However, after hacking on it for some time, I can’t reduce the actual test time significantly.

Is test partitioning on a single machine a feasible way to speed up tests that use Ecto sandbox shared mode? I am wondering if anyone else has had a similar experience.

I have tried the following:

  • varying the number of partitions (2/4/8/16/20): low numbers (2/4) result in a similar time to running without partitions, higher numbers result in a worse time
  • changing the number of BEAM schedulers (1/2/4/8): doesn’t seem to affect the time much. I can confirm that without partitioning, dropping to 1 scheduler makes the time worse, but with partitioning the total time stays about the same as without partitioning
  • having a separate logical database per partition within the same database instance: doesn’t seem to improve the test time much, if at all, and adds a bit of setup time for each database

On a simple test (for example, 30 times: insert a row into a table and sleep 1 second), test partitioning works nicely and reduces the total time. But for the actual test suite it doesn’t improve much.
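For readers who want to reproduce the "separate logical database per partition" setup, here is a minimal sketch of the usual approach: `mix test --partitions N` reads the `MIX_TEST_PARTITION` environment variable, and the test config appends it to the database name. The app name `:my_app`, the repo `MyApp.Repo`, and the database name are placeholders, not taken from the project discussed here.

```elixir
# config/test.exs (sketch; :my_app and MyApp.Repo are hypothetical names)
import Config

config :my_app, MyApp.Repo,
  # MIX_TEST_PARTITION is "1", "2", ... when set, so each partition
  # gets its own logical database: my_app_test1, my_app_test2, ...
  database: "my_app_test#{System.get_env("MIX_TEST_PARTITION")}",
  pool: Ecto.Adapters.SQL.Sandbox
```

Each partition is then started as its own OS process, for example `MIX_TEST_PARTITION=1 mix test --partitions 2` and `MIX_TEST_PARTITION=2 mix test --partitions 2`.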

Just for reference:

  • original time of mix test: 13 minutes
  • best test partition configuration so far (2 partitions, 4 schedulers, separate logical databases): 10 minutes
  • system: Erlang/OTP 24, Elixir 1.13.3, i7 8th gen, 8 cores

The only thing I am reasonably confident would help you is using the sandbox’s manual mode, plus the allow function if the tests spawn extra processes. Not very pretty, but it gives you control and allows async tests. I’ve done it only once and didn’t like it very much, but you can use imported test helpers (functions or macros) to make your own mini DSL for checking out a connection and allowing a sub-process to use it. It’s doable and adds minimal noise to the test code.
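To make that concrete, here is a rough sketch of manual mode with an explicit allowance, using `Ecto.Adapters.SQL.Sandbox.mode/2`, `checkout/1` and `allow/3`. The repo `MyApp.Repo`, the test module name, and the Agent standing in for "an extra process" are made up for illustration, and the `rows` match assumes the Postgres adapter.

```elixir
# test/test_helper.exs would contain (sketch):
#   ExUnit.start()
#   Ecto.Adapters.SQL.Sandbox.mode(MyApp.Repo, :manual)

defmodule MyApp.ManualSandboxTest do
  # async: true is possible because every test owns its own connection
  use ExUnit.Case, async: true

  alias Ecto.Adapters.SQL.Sandbox

  setup do
    # Check out a connection owned by this test process
    :ok = Sandbox.checkout(MyApp.Repo)
  end

  test "a separately started process can use the test's connection" do
    # Hypothetical extra process; in a real suite this might be a GenServer
    worker = start_supervised!({Agent, fn -> :idle end})

    # Let the worker use the connection owned by the test process
    Sandbox.allow(MyApp.Repo, self(), worker)

    result = Agent.get(worker, fn _state -> MyApp.Repo.query!("SELECT 1") end)
    assert %{rows: [[1]]} = result
  end
end
```

The `setup` block and the `allow` call are the parts that would typically be moved into imported helpers.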

Unfortunately I didn’t make a lot of measurements back then. I just wanted to reduce the test time somewhat, not by a lot. If memory serves, I managed to reduce the total test time from ~78 seconds to ~53 seconds or so. Not bad, but it could have been much better. There were ~2700 tests in total across all files. Keep in mind that I didn’t change all tests to work like that, since I didn’t have enough approved time for it. I just aimed at the top X slowest ones (what X was I absolutely can’t remember, sadly).

Another recommendation would be to break your test files apart into smaller ones. That’s… not easy, and very often it’s not obvious how to do it either. But if you have, say, 2500 lines in a single test file (not a rarity), then maybe it’s time to entertain the idea of having more test helpers (macros included, I’ve done that successfully a good number of times) and more sub-topical test files. But this is a very “out there” recommendation; it’s extremely context- and project-sensitive. Don’t take it as universal advice, because it absolutely is not.

Finally, you can just increase the number of parallel connections that the DB server accepts (your dev box would be the most suitable place for this). On my workstation and laptop I’ve outright told Postgres to allow up to 200 connections, and I often allow 100 connections in my Elixir test env.

This is fairly generic advice, and sorry about the lack of detail, but then again we don’t have access to your project, so something more concrete is harder to advise, at least for me.
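On the Elixir side this is the repo’s `pool_size` option; on the Postgres side it is the `max_connections` setting in `postgresql.conf`. A minimal sketch, with `:my_app` and `MyApp.Repo` as placeholder names:

```elixir
# config/test.exs (sketch; :my_app and MyApp.Repo are hypothetical names)
import Config

config :my_app, MyApp.Repo,
  pool: Ecto.Adapters.SQL.Sandbox,
  # Must stay below the server's max_connections (e.g. 200),
  # counted across everything connecting at the same time
  pool_size: 100
```

If several test partitions run at once, each OS process opens its own pool, so the pool sizes add up against the same `max_connections` limit.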


Thanks @dimitarvp for such a detailed answer! Both of your suggestions, switching to the sandbox’s manual mode and breaking the integration tests down into function-level tests, do indeed make the tests faster and let them run asynchronously, but because of the huge amount of effort needed to refactor the test suite I haven’t gone through with them yet.

That being said, I recently revisited this slow test issue, and I was able to work out why partitioning didn’t make the tests faster before. By running each test partition in its own project folder and build, plus a separate logical database, I was able to reduce the test time from ~13 minutes to ~8 minutes by splitting the tests into two partitions. I suspect this has something to do with the mocking library I use, Mimic: as Mimic requires recompilation of the BEAM file when mocking a module, this might result in contention between the test partition processes if they still run under a single project/library. One interesting thing I found is that increasing the number of partitions (from 2 to 4) in this project doesn’t decrease the time by much. I suspect the number of modules mocked with Mimic and the BEAM recompilation effort (which is repeated for each test partition process) become the bottleneck that prevents further speedup.
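For anyone who wants to try something similar without copying the project folder, here is a rough sketch that gives each partition its own build directory through Mix’s `MIX_BUILD_PATH` environment variable. This is an approximation of the setup described above, not the exact one: the module name `PartitionRunner` and the `_build/test_partition_N` paths are made up, and whether a separate build path alone is enough to avoid the suspected contention is untested.

```elixir
defmodule PartitionRunner do
  # Environment for one partition: its own partition number and build dir.
  # The _build/test_partition_N layout is an arbitrary choice for this sketch.
  def env_for(partition) do
    [
      {"MIX_ENV", "test"},
      {"MIX_TEST_PARTITION", Integer.to_string(partition)},
      {"MIX_BUILD_PATH", "_build/test_partition_#{partition}"}
    ]
  end

  # Runs `mix test --partitions total` once per partition, all in parallel,
  # and returns a list of {partition, exit_status, output} tuples.
  def run(total) do
    1..total
    |> Task.async_stream(
      fn partition ->
        {output, status} =
          System.cmd("mix", ["test", "--partitions", Integer.to_string(total)],
            env: env_for(partition),
            stderr_to_stdout: true
          )

        {partition, status, output}
      end,
      max_concurrency: total,
      timeout: :infinity
    )
    |> Enum.map(fn {:ok, result} -> result end)
  end
end
```

Calling `PartitionRunner.run(2)` from the project root starts both partitions; the first run is slower because each build directory has to be compiled from scratch.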