How to test GenServer that is started via DynamicSupervisor w/ Registry?

sudostack · July 3, 2020, 5:51pm

I’m attempting to provide unit test coverage for a GenServer in an “application” that I’ve created, but I’m running into an issue with an already started process, namely the registry. I won’t share the code for the supervisor, but it declares a child spec with a named registry, __MODULE__.Registry. I’m trying to follow the example for the KV.Registry from elixir-lang.org here: https://elixir-lang.org/getting-started/mix-otp/dynamic-supervisor.html and here: https://elixir-lang.org/getting-started/mix-otp/genserver.html, but I’m having a little trouble decomposing it and mapping it back to what I’m trying to accomplish.

0) Test: failure on setup_all callback, all tests have been invalidated
     ** (EXIT from #PID<0.205.0>) shutdown: failed to start child: .FooRegistry
         ** (EXIT) already started: #PID<0.180.0>

I have a specific name that I want to use for both my registry and dynamic supervisor: Foo.Registry and Foo.DynamicSupervisor that I declare with child specs inside my Foo.Supervisor.

defmodule Foo do
  use GenServer

  alias __MODULE__.FooReducer

  @supervisor __MODULE__.DynamicSupervisor # the named DynamicSupervisor declared in Foo.Supervisor
  @registry __MODULE__.Registry

  # client

  def start(id, strategy, opts \\ []) do
    DynamicSupervisor.start_child(
      @supervisor,
      {__MODULE__, id: id, strategy: strategy, opts: opts}
    )
  end

  def fun1(id), do: GenServer.cast(registry_key(id), :increment)

  defp registry_key(id), do: {:via, Registry, {@registry, id}}

  # server (callbacks)

  def start_link(args) do
    case NimbleOptions.validate(args[:opts], args[:strategy].options_schema()) do
      {:ok, opts} ->
        state = struct(args[:strategy], opts)
        reg_key = registry_key(args[:id])
        GenServer.start_link(__MODULE__, state, name: reg_key)

      {:error, %NimbleOptions.ValidationError{} = err} ->
        {:error, Exception.message(err)}
    end
  end

  # other callbacks
end

test file

defmodule FooTest do
  use ExUnit.Case, async: true

  alias Foo.Strategy

  @id 0

  # BEFORE
  setup_all do
    {:ok, _} = Foo.Supervisor.start_link() # starts both my named registry and named dynamic-sup
    :ok
  end

  # FOLLOWING suggestion on elixir-lang site's KV.Registry example
  # uncertainty about how to do this properly
  # attempting to start a single supervisor for all tests in this file as the IDs are unique
  setup_all do
    _supervisor = start_supervised!(Foo.Supervisor) 
    :ok
  end

  # the default ID of 0 should be restarted for all tests
  setup do
    opts = [increment: 100, decrement: 100, step: 3]
    {:ok, _} = Foo.start(@id, Strategy, opts)
    :ok
  end

  test "invalid options" do
    expected_message = "required option :decrement not found, received options: [:bar]"
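    opts = [bar: "asdf"] # invalid: matches "received options: [:bar]" above
    assert {:error, expected_message} == Foo.start(@id + 1, Strategy, opts)
  end

  test "multiple IDs" do
    opts = [increment: 100, decrement: 100, step: 3] # valid options, same as in setup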
    ids = 1..1_000
    for id <- ids, do: {:ok, _} = Foo.start(id, Strategy, opts)
    for id <- ids, do: for(_ <- 1..150, do: Foo.fun1(id))
    for id <- ids, do: Foo.stop(id) # unsure if this is needed
  end
end

I’m not sure how to go about setting up my test correctly when my GenServer depends on my registry being started. I couple the GenServer implementation to the names of my registry and dynamic supervisor. Should I decouple this and just let Elixir start a registry like the example on elixir-lang.org? I’ve been wrangling with this for some time and resources online are limited. My GenServer behaves as expected otherwise.

Thanks in advance for any support!

ityonemo · July 3, 2020, 11:37pm

As Registries are “vm-global” entities, generally speaking you should have the registry started in your application supervision tree, and your naming scheme is fine when your system is not under test. Since you have async: true, I presume the globalness of the registry could be a problem if other tests hit it at the same time.

option #0: turn async off and your test should have exclusive access to the registry.

option #1: add a second parameter to registry_key so that you can give it an arbitrarily named registry.
Then, in your test, you can start a fresh registry each time:

setup do
  # a setup callback must return :ok, a keyword list, or a map, not {:ok, pid}
  {:ok, _pid} = Registry.start_link(name: :registry_for_this_test, keys: :unique)
  :ok
end

Pass the registry name in to your GenServer initialization as an optional setup parameter.

option #2: do something more complicated. (don’t do this unless you are advanced). I wrote some libraries which will shard Registry access based on which test you’re in (it registers the process as whatever you chose as the normal name, tupled with the test pid), library is called ‘Multiverses’ but fair warning, it overuses macros and I hope someday to not need to use it.
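A minimal sketch of option #1, assuming a hypothetical Sketch.Counter GenServer rather than the Foo module from the question. The registry name travels in a keyword options list with a default, so production callers never have to mention it, while a test can point the same code at its own registry. All module, function, and registry names below are illustrative only.

```elixir
defmodule Sketch.Counter do
  use GenServer

  # Hypothetical default; in a real app this registry would be started
  # by the application's supervision tree.
  @default_registry Sketch.Counter.Registry

  # client

  def start_link(opts) do
    id = Keyword.fetch!(opts, :id)
    GenServer.start_link(__MODULE__, 0, name: via(id, opts))
  end

  def increment(id, opts \\ []), do: GenServer.cast(via(id, opts), :increment)

  def value(id, opts \\ []), do: GenServer.call(via(id, opts), :value)

  # Falls back to the default registry unless a :registry option is given.
  defp via(id, opts) do
    {:via, Registry, {Keyword.get(opts, :registry, @default_registry), id}}
  end

  # server (callbacks)

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}

  @impl true
  def handle_call(:value, _from, count), do: {:reply, count, count}
end

# Usage with a throwaway registry, as a test might do:
{:ok, _} = Registry.start_link(keys: :unique, name: :registry_for_this_test)
{:ok, _} = Sketch.Counter.start_link(id: 0, registry: :registry_for_this_test)
:ok = Sketch.Counter.increment(0, registry: :registry_for_this_test)
IO.inspect(Sketch.Counter.value(0, registry: :registry_for_this_test))
```

Hiding the registry behind a default keeps the public interface unchanged for normal callers, which seems to be the least intrusive form of the injection described above.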

sudostack · July 4, 2020, 3:00am

Super appreciate your insights a bunch!

I’m definitely leaning on option 1 that you’ve suggested, and your suggestions have helped clarify a couple of things for me, but also raise a few more questions (along with some notes):

1) I want to be able to run the tests asynchronously. That’s why I purposefully chose IDs that shouldn’t result in “already started” errors. Alternatively, I’ve seen some libraries just match on “already started” and return the PID, which I could also do, of course. I’m really trying to have the test match the running application: a single registry for these particular types of GenServers.
2) I’m not a huge fan of expanding parameters to an interface just to support testing, although this may be orthogonal to my thoughts around loose coupling and dependency injection.
3) I guess I’m sort of wondering whether I should be exposing an interface to the GenServer to run independently of a registry, even though that’s not the original intent. It does allow me to test it in isolation, but deviates from mirroring its actual use, because I’m testing core functionality rather than the integration.
4) I’m trying to maintain automatic clean-up using start_supervised!, but I’m not sure whether this is applicable in my scenario, where I’m declaring my own named Registry and DynamicSupervisor.
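The “match on already started and return the PID” idea from the first point could look roughly like this. It is only a sketch: Sketch.EnsureStarted, Demo.Registry, and Demo.DynamicSupervisor are hypothetical names, and an Agent stands in for the real GenServer.

```elixir
defmodule Sketch.EnsureStarted do
  # Starts a child registered under `id`, or returns the pid of the one
  # that is already registered under that id.
  def start(supervisor, registry, id) do
    name = {:via, Registry, {registry, id}}
    spec = %{id: id, start: {Agent, :start_link, [fn -> 0 end, [name: name]]}}

    case DynamicSupervisor.start_child(supervisor, spec) do
      {:ok, pid} -> {:ok, pid}
      {:error, {:already_started, pid}} -> {:ok, pid}
      other -> other
    end
  end
end

{:ok, _} = Registry.start_link(keys: :unique, name: Demo.Registry)
{:ok, _} = DynamicSupervisor.start_link(strategy: :one_for_one, name: Demo.DynamicSupervisor)

{:ok, first} = Sketch.EnsureStarted.start(Demo.DynamicSupervisor, Demo.Registry, 0)
{:ok, second} = Sketch.EnsureStarted.start(Demo.DynamicSupervisor, Demo.Registry, 0)
IO.inspect(first == second, label: "same process returned")
```

The trade-off is that a test can then silently share state with whichever test started the process first, so this is probably a better fit for idempotent production start-up than for test isolation.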

Also, thank you for sharing. I was looking at your tests, particularly at the one in relation to how you’re testing your GenServer and it looks like it doesn’t bring any of the supervision or registry baggage with it… your basic gen-server implementation has no coupling or mention of a registry or supervisor. It knows essentially nothing about the outside world that may be running/supervising/orchestrating it. This gives a bit of a fresh new perspective on how I could re-factor it to be this way, too, and removes the tight-coupling

ityonemo · July 4, 2020, 3:21am

1) Yeah, the “vm-global-ness” of registries is what’s in conflict with async tests. It’s just a fact of life =(.
2) I’m not really a huge fan of expanding parameters for dependency injection myself. But this is the original way that you were supposed to do tests in Elixir. If you are careful and hide it behind default parameters, I think it’s less “ugly”. One way to reduce the ugliness is to stash it in a keyword options list.
3) I think that’s a great idea, unless you are specifically doing an integration test involving the registry. But you can also move those integration tests outside of async.
4) If you do tests right, you probably don’t have to clean up anything. If you build your tests with linked processes, then everything will be torn down when the test process dies.

X) Yeah, I’d say be careful with trying to refactor like Multiverses. The secret is this thing called :"$callers", which uses the “very dangerous” Erlang process dictionary. Granted, this sort of thing is exactly what the process dictionary was designed for (and it’s the magic behind how Mox and Ecto tests are really effective). I’m (probably) applying to give a talk about it at ElixirConf, to get community feedback and see if this is truly a good idea or something horrifying, but hopefully the talk will also be educational about how callers work.

sudostack · July 4, 2020, 3:33am

1) Yeah, a small nuisance, but I’m happy to know that someone who has wrangled with more of this than I have says that there’s really no other way.
2) Yeah, default params it is! And you’re right. It’s probably me just fighting with other examples that I’ve seen that don’t include the tests or don’t even test at all.
3) This is a good point. It’s fairly important to me that at least a simple integration test exists. One of my concerns was/is slow tests, but it’d be worth the trade-off here for some confidence that the entirety of the application is working.
4) Fair enough, but one question remains: in what scenarios is ExUnit.Callbacks.start_supervised!/2 needed or used? Perhaps in the integration test?
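For reference, one scenario where start_supervised!/2 fits is starting a Registry and DynamicSupervisor under names unique to each test, so that async tests do not clash and ExUnit stops everything when the test ends. The following is a sketch under stated assumptions: the test module name is made up, an Agent stands in for the real GenServer, and the script drives ExUnit by hand so it can run standalone with `elixir`.

```elixir
ExUnit.start(autorun: false)

defmodule Sketch.PerTestTreeTest do
  use ExUnit.Case, async: true

  setup context do
    # Derive names from the test module and test name so they are unique per test.
    base = Module.concat(context.module, context.test)
    registry = Module.concat(base, Registry)
    supervisor = Module.concat(base, DynamicSupervisor)

    # ExUnit stops both processes automatically when the test finishes.
    start_supervised!({Registry, keys: :unique, name: registry})
    start_supervised!({DynamicSupervisor, strategy: :one_for_one, name: supervisor})

    %{registry: registry, supervisor: supervisor}
  end

  test "child is reachable through the registry", %{registry: registry, supervisor: supervisor} do
    name = {:via, Registry, {registry, 0}}
    spec = %{id: :counter, start: {Agent, :start_link, [fn -> 0 end, [name: name]]}}

    {:ok, pid} = DynamicSupervisor.start_child(supervisor, spec)

    assert [{^pid, nil}] = Registry.lookup(registry, 0)
    assert Agent.get(name, & &1) == 0
  end
end

result = ExUnit.run()
IO.inspect(result, label: "ExUnit result")
```

In a Mix project the `ExUnit.start/1` and `ExUnit.run/0` lines would not be needed, since `mix test` and test_helper.exs take care of them.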

Now that would be cool! I’d certainly learn a few things from it. I’m coming back to Elixir after a hiatus of two years and a lot has changed (there wasn’t a registry before, and there’s even less boilerplate for establishing supervision trees now).

Again thanks for all the help!

ityonemo · July 4, 2020, 5:27pm

Ok, I got an answer for you. If you use start_supervised!, then your process will know the test is its supervision tree ancestor. I believe this will also let you use Mox/Ecto allowances/checkouts correctly (haven’t checked). If you do spawn_link, you will still get stuff like cleanup after termination, but it will be running as an isolated process that isn’t supervised (not the worst thing in tests, IMO). Flowchart seems to be:

for GenServers: start_supervised!/1,2 gives you access to allowances/checkouts, module.start_link/N does not.
for lambdas: Task.start_link/1 gives you access to allowances/checkouts, spawn_link/1 does not.
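The “lambdas” half of that flowchart can be observed directly by reading :"$callers" from the process dictionary. This sketch uses a hypothetical Sketch.CallersDemo helper and only inspects the dictionary; it does not involve Mox or Ecto themselves.

```elixir
defmodule Sketch.CallersDemo do
  # Sends the current process's :"$callers" entry back to `parent`.
  def report(tag, parent), do: send(parent, {tag, Process.get(:"$callers")})

  # Waits for the report tagged `tag`, or gives up after one second.
  def await(tag) do
    receive do
      {^tag, callers} -> callers
    after
      1_000 -> :timeout
    end
  end
end

parent = self()

# Task.start_link/1 records the process that started the task in :"$callers".
{:ok, _task} = Task.start_link(fn -> Sketch.CallersDemo.report(:task, parent) end)

# A bare spawn_link/1 starts with an empty process dictionary.
spawn_link(fn -> Sketch.CallersDemo.report(:spawn, parent) end)

task_callers = Sketch.CallersDemo.await(:task)
spawn_callers = Sketch.CallersDemo.await(:spawn)

IO.inspect(hd(task_callers) == parent, label: "task lists parent as first caller")
IO.inspect(spawn_callers, label: "spawn_link callers")
```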

sudostack · July 4, 2020, 7:13pm

Oh! Thank you for investigating and sharing your findings! This is worth capturing in a blog post or cheatsheet of some sort (I may do this).

So in a continual search of better patterns, I came across the Horde project, a distributed dynamic supervisor and registry. Curious as to whether you are familiar with it - https://github.com/derekkraan/horde

ityonemo · July 4, 2020, 7:21pm

I am. However, my system in prod is not distributed (yet), so I’m not using it, and even when it is, I think for what I’m doing I need CP, not AP, so Horde is not appropriate for my use case. Erlang’s :global is probably a better choice for me, even though it’s slow.