Intermittent DBConnection.ConnectionError in tests

Some of those errors still happen for me, but test runs on CI and locally behave quite differently, so it’s difficult to pinpoint the issue.

Before my change, none of the test runs would pass. Now they do pass but they randomly fail.

Locally I mostly see errors like these:

[error] Postgrex.Protocol (#PID<0.1316.0>) disconnected: ** (DBConnection.ConnectionError) client #PID<0.19767.0> exited

or

[error] Child #Reference<0.3366742383.3747086337.132479> of Supervisor #PID<0.14032.0> (Supervisor.Default) shut down abnormally
** (exit) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
Pid: #PID<0.14034.0>
Start Call: Phoenix.LiveView.Channel.start_link/?

Whereas on the CI, I have errors related to the assert_value package:

[error] GenServer AssertValue.Server terminating
** (stop) exited in: GenServer.call(#PID<0.1074.0>, :flush, 5000)
    ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
    (elixir 1.15.6) lib/gen_server.ex:1074: GenServer.call/3
    (assert_value 0.10.1) lib/assert_value/server.ex:68: AssertValue.Server.handle_cast/2
    (stdlib 5.1) gen_server.erl:1103: :gen_server.try_handle_cast/3
    (stdlib 5.1) gen_server.erl:1165: :gen_server.handle_msg/6
    (stdlib 5.1) proc_lib.erl:241: :proc_lib.init_p_do_apply/3
Last message: {:"$gen_cast", {:flush_ex_unit_io}}

[error] Process AssertValue.Server (#PID<0.903.0>) terminating
** (exit) exited in: GenServer.call(#PID<0.1074.0>, :flush, 5000)
    ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
    (elixir 1.15.6) lib/gen_server.ex:1074: GenServer.call/3
    (assert_value 0.10.1) lib/assert_value/server.ex:68: AssertValue.Server.handle_cast/2
    (stdlib 5.1) gen_server.erl:1103: :gen_server.try_handle_cast/3
    (stdlib 5.1) gen_server.erl:1165: :gen_server.handle_msg/6
    (stdlib 5.1) proc_lib.erl:241: :proc_lib.init_p_do_apply/3
Initial Call: AssertValue.Server.init/1
Ancestors: [#PID<0.902.0>, #PID<0.901.0>]

I’m still trying to find why I get some of these random errors.

They mostly appeared after switching to Elixir 1.15.6 (with Erlang/OTP 26.1) and upgrading some packages, such as phoenix_live_view to 0.18.18.


I’ve found that this common pattern causes ConnectionError in otherwise passing tests:

# flaky
assert view
       |> element("#some-button-id")
       |> render_click()
       |> follow_redirect(conn, "/destination/path")

# better
view
|> element("#some-button-id")
|> render_click()

assert_redirected(view, "/destination/path")

I’ve tried updating my tests to follow this pattern, but I’m still getting those errors.

I’ve created an issue here: “Intermittent DBConnection.ConnectionError in tests”, issue #3545 in phoenixframework/phoenix_live_view on GitHub.

I think you’re misunderstanding the comment you replied to. They’re saying that they’re doing something common which shouldn’t cause an error but does. I don’t think they’re saying that the pattern will fix the error.

I had to insert some extra render calls at the end of my functions to fix the issue.
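A rough sketch of what such an extra render call might look like. As far as I understand, render/1 from Phoenix.LiveViewTest is a synchronous round-trip to the LiveView process, so the test does not exit while the view is still handling earlier messages. The MyAppWeb.ConnCase module, the "/items" route and the "#save-button" id are made up for illustration.

```elixir
defmodule MyAppWeb.ItemsLiveTest do
  # MyAppWeb.ConnCase, the route and the element id are hypothetical.
  use MyAppWeb.ConnCase, async: true

  import Phoenix.LiveViewTest

  test "saving an item", %{conn: conn} do
    {:ok, view, _html} = live(conn, "/items")

    view
    |> element("#save-button")
    |> render_click()

    # Extra synchronous call at the end of the test: it returns only after
    # the LiveView has handled the messages that were already in its mailbox.
    _html = render(view)
  end
end
```

This only covers work done inside the LiveView process itself. Work handed off to other processes needs a different kind of wait.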

Hey all, we’re having the same problem. As suggested in the referenced issue (#3545 in phoenixframework/phoenix_live_view on GitHub), our LiveViews were firing off async tasks that accessed the database, and the tests didn’t wait for them. Calling render_async(view) is what fixed it.
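For anyone landing here, a minimal sketch of that fix. It assumes a LiveView that loads data through assign_async or start_async; MyAppWeb.ConnCase, the "/dashboard" route and the "Loaded" text are hypothetical. render_async/2 is part of Phoenix.LiveViewTest in recent phoenix_live_view releases.

```elixir
defmodule MyAppWeb.DashboardLiveTest do
  # MyAppWeb.ConnCase, the route and the asserted text are hypothetical.
  use MyAppWeb.ConnCase, async: true

  import Phoenix.LiveViewTest

  test "waits for async work before the test exits", %{conn: conn} do
    {:ok, view, _html} = live(conn, "/dashboard")

    # Blocks until the async tasks started by the LiveView have finished
    # (or the timeout is hit), then returns the rendered HTML.
    html = render_async(view)

    assert html =~ "Loaded"
  end
end
```

Without the wait, the test process can exit while a task still holds a sandboxed connection, which would match the "client ... exited" log line quoted at the top of the thread.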

Except we’re still seeing those errors for some of the tests. They don’t always appear (yay concurrency!) but when they do we have no inkling which tests logged those errors.

  • We liberally applied render_async(view) everywhere, but the errors are still there, so maybe it isn’t LiveView tests triggering this?
  • We also tried running each test file on its own, but the errors don’t show up then.
  • We also tried mix test --trace, but that changes the test behaviour completely (it runs everything serially, as far as I can tell), so the error output never shows up there.

So I was wondering: has anyone ever figured out a good solution for dealing with this? I would even be open to a hack, e.g. patching ExUnit locally to print test filenames, but I couldn’t figure out how to do that only when there’s some error output in the console.
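One possible hack that avoids patching ExUnit is a custom formatter, sketched below. ExUnit formatters are GenServers that receive events such as {:test_started, test} and {:test_finished, test} as casts, so a small one can print file and line for each test while the suite keeps running concurrently. TestNameFormatter is a made-up name, and the output is unconditional rather than limited to runs with error output, so this only narrows down which tests were in flight when an error was logged.

```elixir
defmodule TestNameFormatter do
  # Hypothetical helper module; place it e.g. in test/support.
  use GenServer

  @impl true
  def init(_opts), do: {:ok, nil}

  @impl true
  def handle_cast({:test_started, %ExUnit.Test{} = test}, state) do
    IO.puts("[started]  " <> describe(test))
    {:noreply, state}
  end

  def handle_cast({:test_finished, %ExUnit.Test{} = test}, state) do
    IO.puts("[finished] " <> describe(test))
    {:noreply, state}
  end

  # Ignore every other formatter event (suite and module events, etc.).
  def handle_cast(_event, state), do: {:noreply, state}

  # Builds "path/to/file_test.exs:LINE test name" from the test's tags.
  defp describe(%ExUnit.Test{name: name, tags: tags}) do
    "#{Path.relative_to_cwd(tags.file)}:#{tags.line} #{name}"
  end
end
```

To use it, register it next to the default formatter in test/test_helper.exs with ExUnit.start(formatters: [ExUnit.CLIFormatter, TestNameFormatter]). Log output from other processes is asynchronous, so an error line points at the tests started shortly before it rather than at one exact test.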

Yeah, the lack of visibility is one of the more annoying things about working with db_connection. If you browse the GitHub repo, there is even an issue about connections being created and possibly dropped. It’s even harder to debug by looking at the code.

Do you have a lot of tests that talk to the database with expensive queries?

There may be some slow queries, I guess, but the test db holds no data (except whatever gets set up per test, which is of course rolled back), so I wouldn’t think that’s the cause here.

Yeah, I’m not sure myself. Hopefully I can eliminate some of these issues in my own library, since I found out I can’t use db_connection, and the pool I’m building can be more specialized and better optimized than db_connection ever could be. I’m looking forward to seeing whether any root causes for these issues turn up, other than “write your code differently”.