SQL sandboxing issue with Phoenix Playwright

aidalgol · June 1, 2025, 10:22pm

I am trying to add end-to-end browser tests to my Phoenix application with PhoenixTest.Playwright, and I can only get it to work if I run the cases synchronously (use PhoenixTest.Playwright.Case instead of use PhoenixTest.Playwright.Case, async: true). I have a test that creates a user through the AshAuthentication register LiveView, and if I slow down the test, I can see that my LiveView (after completing user registration) has current_user set in the assigns, as expected (and not nil), and then shortly afterwards it updates to have current_user: nil.

From the application log, the following happens:

The user-registration form submission triggers a 302 redirect to the expected route
This route’s LiveView is mounted
current_user is set (not nil)
With the same request_id as the 302 just before, a 200 request is sent
The LiveView socket is reconnected, and the same LiveView is mounted again
This time the logged SQL query (from Ash) on the users table returns no records
current_user is now nil

In my on_mount hook to allow the SQL sandbox for LiveViews, described here, I have added a line to log the metadata, and I can see that it is always the same value throughout the sequence outlined above. But I also notice that it is the exact same value across all runs, so I am unsure whether this is working correctly.
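For reference, the hook in question follows the pattern from the Phoenix.Ecto "Acceptance tests with LiveViews" documentation. This is a rough sketch rather than the exact code from the project; the module name MyAppWeb.LiveAcceptance is a placeholder, and the logging line is the addition mentioned above:

```elixir
defmodule MyAppWeb.LiveAcceptance do
  # Placeholder module name; the hook body follows the Phoenix.Ecto docs pattern.
  import Phoenix.LiveView
  import Phoenix.Component
  require Logger

  def on_mount(:default, _params, _session, socket) do
    socket =
      assign_new(socket, :phoenix_ecto_sandbox, fn ->
        # The sandbox metadata travels in the browser's user-agent header.
        if connected?(socket), do: get_connect_info(socket, :user_agent)
      end)

    metadata = socket.assigns.phoenix_ecto_sandbox
    Logger.debug(sandbox_metadata: inspect(metadata))

    # Allow this LiveView process to use the test's sandboxed connection.
    Phoenix.Ecto.SQL.Sandbox.allow(metadata, Ecto.Adapters.SQL.Sandbox)
    {:cont, socket}
  end
end
```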

I have at this point gone over the setup instructions for PhoenixTest and PhoenixTest.Playwright several times, and I am now quite sure I have not missed anything from there, so I am wondering whether some additional steps are required because I am using Ash, and this combination (PhoenixTest.Playwright with Ash instead of bare Ecto) has just not been documented anywhere yet.

zachdaniel · June 1, 2025, 10:40pm

AshAuthentication generally doesn’t do anything non-phoenix-y. It sounds more related to an Ecto sandbox issue, as you’ve alluded to. Do you have this set up in config/test.exs?

config :ash, :disable_async?, true

aidalgol · June 1, 2025, 10:49pm

I didn’t, and I just tried setting it, and got the same symptom.

I forgot to mention that I have no issue with Ecto sandboxing in my unit tests, or my tests using Phoenix.LiveViewTest.

zachdaniel · June 1, 2025, 10:55pm

The issue doesn’t immediately jump out at me. It makes sense that you get a reconnect, as that is how LV works, but it implies perhaps something going wrong with the “handoff”, where we get the user from the session and fetch it to sign in. We use assign_new for this.

Here is that code:

  @doc """
  Inspects the incoming session for any subject_name -> subject values and loads
  them into the socket's assigns.

  For example a session containing `{"user",
  "user?id=aa6c179c-ee75-4d49-8796-528c2981b396"}` becomes an assign called
  `current_user` with the loaded user as the value.
  """
  @spec on_mount(
          atom | {:set_otp_app, atom},
          %{required(String.t()) => any},
          %{required(String.t()) => any},
          Socket.t()
        ) ::
          {:cont | :halt, Socket.t()}
  def on_mount({:set_otp_app, otp_app}, _params, _, socket) do
    {:cont, assign(socket, :otp_app, otp_app)}
  end

  def on_mount(:default, _params, session, socket) do
    tenant = socket.assigns[:current_tenant] || session["tenant"]

    socket =
      if tenant do
        assign_new(socket, :current_tenant, fn -> tenant end)
      else
        socket
      end

    context = session["context"] || %{}

    socket =
      socket
      |> otp_app_from_socket()
      |> AshAuthentication.authenticated_resources()
      |> Stream.map(&{to_string(Info.authentication_subject_name!(&1)), &1})
      |> Enum.reduce(socket, fn {subject_name, resource}, socket ->
        current_subject_name = String.to_existing_atom("current_#{subject_name}")

        if Map.has_key?(socket.assigns, current_subject_name) do
          raise "Cannot set assign `#{current_subject_name}` before default `AshAuthentication.Phoenix.LiveSession.on_mount/4` has run."
        end

        assign_new(socket, current_subject_name, fn ->
          if value = session[subject_name] do
            # credo:disable-for-next-line Credo.Check.Refactor.Nesting
            case AshAuthentication.subject_to_user(value, resource,
                   tenant: tenant,
                   context: context
                 ) do
              {:ok, user} -> user
              _ -> nil
            end
          end
        end)
      end)

    {:cont, socket}
  end

  def on_mount(_, _params, _session, socket), do: {:cont, socket}

FWIW you can factor Ash pretty much entirely out of the picture in terms of debugging by adding an on_mount hook of your own that does something like:

Repo.all(MyApp.Accounts.User)

If that returns the user you’re expecting to see then the issue may in fact be one with AshAuthentication (but I still currently doubt it).

(Ash resources are also Ecto schemas)

aidalgol · June 2, 2025, 12:16am

Moving forward with my debugging, should I have Ash’s disable_async? config option set, or remove it again?

config :ash, :disable_async?, true

aidalgol · June 2, 2025, 12:42am

zachdaniel:

FWIW you can factor Ash pretty much entirely out of the picture in terms of debugging by adding an on_mount hook of your own that does something like:
Repo.all(MyApp.Accounts.User)
If that returns the user you’re expecting to see then the issue may in fact be one with AshAuthentication (but I still currently doubt it).

Well this is interesting: that appears to disagree with AshAuthentication. This is what I put in this new debugging hook:

  def on_mount(:default, _params, _session, socket) do
    Logger.debug(users: MyApp.Repo.all(MyApp.Accounts.User), socket: socket)
    {:cont, socket}
  end

And this is the log message from right around where the symptom occurs (manually wrapped for readability):

[debug] [
	users: [
		%MyApp.Accounts.User{
			posts: #Ash.NotLoaded<:relationship, field: :posts>,
			__meta__: #Ecto.Schema.Metadata<:loaded, "users">,
			id: "6ad5176f-3341-4610-b629-e529e03ace3b",
			email: #Ash.CiString<"esmeralda1973@brekke.com">,
			role: :user,
			display_name: nil}
	],
	socket: #Phoenix.LiveView.Socket<
		id: "phx-GEUSnTSarktd1gCU",
		endpoint: MyAppWeb.Endpoint,
		view: MyAppWeb.PostsLive.Index,
		parent_pid: nil,
		root_pid: #PID<0.878.0>,
		router: MyAppWeb.Router,
		assigns: %{
			__changed__: %{
				current_user: true,
				phoenix_ecto_sandbox: true
			},
			current_user: nil,
			flash: %{},
			live_action: nil,
			phoenix_ecto_sandbox: "BeamMetadata (g2gCdwJ2MXQAAAADdwVvd25lclh3DW5vbm9kZUBub2hvc3QAAANkAAAAAAAAAAB3CXRyYXBfZXhpdHcEdHJ1ZXcEcmVwb2wAAAABdxFFbGl4aXIuUGFkZHkuUmVwb2o=)"
		},
		transport_pid: #PID<0.877.0>,
		sticky?: false, ...>
	]
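The phoenix_ecto_sandbox value in that log is opaque as printed. A minimal sketch, assuming a debugging hook like the one above (the module name is hypothetical), of decoding it with Phoenix.Ecto.SQL.Sandbox.decode_metadata/1 so that the owner process and repo can be compared between the first and second mount:

```elixir
defmodule MyAppWeb.DebugSandboxHook do
  # Hypothetical debugging hook; not part of the original project.
  require Logger

  def on_mount(:default, _params, _session, socket) do
    metadata =
      case socket.assigns[:phoenix_ecto_sandbox] do
        # Decode the "BeamMetadata (...)" string into a map of sandbox details.
        value when is_binary(value) -> Phoenix.Ecto.SQL.Sandbox.decode_metadata(value)
        # The assign is missing or nil, e.g. on the initial disconnected render.
        _ -> nil
      end

    Logger.debug(sandbox_metadata: inspect(metadata), liveview_pid: inspect(self()))
    {:cont, socket}
  end
end
```

This hook only reads the assign, so it would need to run after the hook that sets phoenix_ecto_sandbox.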

zachdaniel · June 2, 2025, 2:00am

And everything works outside of tests?

zachdaniel · June 2, 2025, 2:01am

This (config :ash, :disable_async?, true) should always be configured in your config/test.exs

aidalgol · June 2, 2025, 2:04am

Correct, and every other type of test works (even with the config changes required for PhoenixTest.Playwright). This PhoenixTest.Playwright test also works if I change

  use PhoenixTest.Playwright.Case, async: true

to

  use PhoenixTest.Playwright.Case

zachdaniel · June 2, 2025, 11:06am

So this is almost 100% related to the sandbox then. Are you 100% sure your disable async configuration is taking effect in your test environment? Maybe in the same place you ran that Ecto query you could fetch that config and see what its value is?

aidalgol · June 2, 2025, 8:08pm

Logging Application.get_env(:ash, :disable_async?) shows that it is true.

zachdaniel · June 2, 2025, 8:19pm

Lemme tag @jimsynz to see if he has any idea. The only thing I can think of is some kind of process related issue?

Can you add some logging to your AuthController failure callback to see how it’s going wrong?

zachdaniel · June 2, 2025, 8:21pm

Actually, a more foolproof way to see what’s going on is to modify your read action for signing in like so:

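prepare fn query, _ ->
  # Ash.Query.before_action/2 takes a one-argument callback
  Ash.Query.before_action(query, fn query ->
    # Print the PID of the process that runs the action
    IO.inspect(self())
    query
  end)
end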

This is to find out if it’s somehow running in a different process (which is how you’d end up with conflicting results from queries using the sandbox).
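To illustrate why the process matters, here is a minimal sketch, assuming a test process that has already checked out a sandbox connection for MyApp.Repo (for example in test setup). A plain spawned process only sees the test's data once it has been explicitly allowed to share the owner's connection:

```elixir
# Assumes the current process already owns a sandbox connection, e.g. after
# Ecto.Adapters.SQL.Sandbox.checkout(MyApp.Repo) in test setup.
owner = self()

spawn(fn ->
  # Share the owner's sandboxed connection with this process. Without this
  # call, a process outside the owner's call chain has no access to the
  # rows written inside the test's transaction.
  Ecto.Adapters.SQL.Sandbox.allow(MyApp.Repo, owner, self())
  send(owner, {:users, MyApp.Repo.all(MyApp.Accounts.User)})
end)

receive do
  {:users, users} -> IO.inspect(users, label: "users seen by spawned process")
after
  1_000 -> IO.puts("no reply from spawned process")
end
```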

aidalgol · June 2, 2025, 8:57pm

That goes on the user resource, not the token resource, right?

Since I’m just using a default read action, how should I add that preparation without unintentionally changing the behaviour of the read action?

  actions do
    defaults [:read]
    # ...

zachdaniel · June 2, 2025, 9:27pm

The sign in is not using a primary read action.

Out of curiosity, did you set up AshAuthentication before or after we added the igniter installers?

The installers add this action now, whereas it was implicit before. If you add the action yourself, you will have a much easier time debugging sign-in.

      read :sign_in_with_password do
        description "Attempt to sign in using a email and password."
        get? true

        argument :email, :ci_string do
          description "The email to use for retrieving the user."
          allow_nil? false
        end

        argument :password, :string do
          description "The password to check for the matching user."
          allow_nil? false
          sensitive? true
        end

        # validates the provided email and password and generates a token
        prepare AshAuthentication.Strategy.Password.SignInPreparation

        metadata :token, :string do
          description "A JWT that can be used to authenticate the user."
          allow_nil? false
        end
      end

jimsynz · June 2, 2025, 11:02pm

There is nothing that AA does specifically async except for the token expunger, which is unlikely to be the cause but could get in the way (I suggest removing AshAuthentication.Supervisor from your application start callback in test).
Is it possible that this is just a confirmation issue?
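The suggestion about AshAuthentication.Supervisor could be sketched roughly as follows. This is only an illustration: the :start_auth_supervisor? config key is made up for this example, and the other children stand in for whatever your application already starts.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children =
      [
        MyApp.Repo,
        MyAppWeb.Endpoint
      ] ++ auth_children()

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  # Hypothetical switch: set `config :my_app, start_auth_supervisor?: false`
  # in config/test.exs to leave the supervisor (and the token expunger) out.
  defp auth_children do
    if Application.get_env(:my_app, :start_auth_supervisor?, true) do
      [{AshAuthentication.Supervisor, otp_app: :my_app}]
    else
      []
    end
  end
end
```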

zachdaniel · June 2, 2025, 11:23pm

That sounds possible. @aidalgol do you have require_confirmation_for_authentication? enabled?

aidalgol · June 3, 2025, 6:04am

Most of the Ash code in this project was written before most of the Igniter installers/generators were added to the various Ash packages.

I added the above :sign_in_with_password read action, and this preparation,

prepare before_action(fn query, _context ->
  Logger.info(action: :sign_in_with_password, pid: self())
  query
end)

And it is not hit for user registration, only for signing in an existing user. (I confirmed this by running my application interactively via iex -S mix phx.server.) So I manually added a :register_with_password action to my user resource, based on the Igniter task in ash_authentication/lib/mix/tasks/ash_authentication.add_strategy.ex, with a similar validation for logging (since preparations can’t be added to create actions).

validate fn changeset, _context ->
  Logger.info(action: :register_with_password, pid: self())
  :ok
end

It looks like everything for a request is being run in the same process.

Something else occurred to me while doing this: is AAP’s on_mount hook called before every hook passed to ash_authentication_live_session? If so, that might be the problem, because the hook that sets up the Phoenix.Ecto sandboxing needs to be called first. [1]

[1] Acceptance tests with LiveViews, Phoenix/Ecto Hexdocs

aidalgol · June 3, 2025, 6:06am

No, I don’t have that anywhere in my code. I can’t find it anywhere in the AA or AAP documentation or source code. Is that the right option?

If you mean email confirmation, I haven’t set that up yet in my application.

zachdaniel · June 3, 2025, 11:31am

Sorry, I gave you instructions for sign-in, not registration.

I think we must be missing something simple here and/or there must be something wrong with the initial assumptions.

I assume you’ve set up all of the sandbox stuff for both LiveView and non-LiveView mentioned here? PhoenixTest.Playwright — PhoenixTestPlaywright v0.6.3
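For comparison, the non-LiveView part of that setup usually looks something like the fragment below. It is a sketch based on the Phoenix.Ecto sandbox documentation, not the exact text of the PhoenixTest.Playwright guide; the :sql_sandbox key and the MyApp names are placeholders.

```elixir
# config/test.exs
config :my_app, sql_sandbox: true

# lib/my_app_web/endpoint.ex (inside the endpoint module)

# Expose the user agent to LiveView so the on_mount hook can read the
# sandbox metadata from it.
socket "/live", Phoenix.LiveView.Socket,
  websocket: [connect_info: [:user_agent, session: @session_options]]

# Must come before any plug that touches the database.
if Application.compile_env(:my_app, :sql_sandbox) do
  plug Phoenix.Ecto.SQL.Sandbox
end
```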