Test processes using handle_continue

Background

I have a process that does nothing in it’s init function and delegates all the work to a handle_continue. I do this because the work being done in the handle_continue is quite heavy, and since this is a Worker process, I don’t want to slog it’s supervisor (and thus the entire application) with the slow initialization of a Worker (of which there can be hundreds or thousands of).

Problem

The problem here comes when testing. When using ExUnit, it will execute the code assertions right after the init function of my Worker, which I remind you, does next to nothing.

So effectively, ExUnit is sometimes running the test assertions before the Worker has even initialized. I say sometimes, because everything is concurrent, so sometimes I am lucky and the assertions run after the Worker’s handle_continue has run, sometimes they don’t.

Questions

Is there a way to make ExUnit run the assertions without forcing the Worker process to send a message in handle_continue signaling it? (think of it as forcing the Worker to broadcast a message once handle_continue is done running).

I ask this because I frown upon this idea. If I change the Worker to broadcast a message once it’s handle_continue is done, then I am just changing my production code for the sake of testing, which is something I abhor completely.

1 Like

The obvious “solution” is to pause the test process for X amount of milliseconds and give a time for the process to initialize… this will slow down the tests a bit but if you don’t want to add any other code to signal that server initialized already I am not sure how you can do that.

I’m not sure this is a really worthwhile intent to have: Never change production for testing. I’d rather phrase it as creating flexible enough production code so you don’t need to add anything extra for testing the code.

In your example the worker could receive a callback, which defines the actual work.
This way you can test the runtime behaviour of the worker without any of your prod code needing to know about the callback it receives from the test, and it’s even more flexible because you can now use the worker to do all kinds of tasks and not just one specific one.

# Worker
def start_link(callback), do: GenServer.start_link(__MODULE__, callback)
def init(callback), do: {:ok, callback, {:continue, :run_callback}}
[…]

# Business Code
Worker.start_link(&fancy_callback_doing_business_logic/1)

# Test
test = self()
callback = fn -> send(test, :msg) end)
Worker.start_link(callback)
assert_receive :msg

Edit:

This also allows you to test fancy_callback_doing_business_logic without a process around it.

2 Likes

100% agree.

It still doesn’t fix the fact that I am changing my code to be more complex, all in the name of a feature I will likely never use :stuck_out_tongue:

I do admit it is a valid solution though!

I’m not sure it’s more change to your code than trying to make atomic counters increment. Both wouldn’t be needed without you trying to test something running async to the process starting the computation.

When I add atomic counters, I do that on my test code only.

If I pass a callback, I have to change my production code to receive a parameter on startup (the said callback), I have to change my supervisor creating the workers, and then I have to change the worker itself to be more generic. All this when the purpose of the worker is very specific and when it wouldn’t benefit from the inherent complexity of having a dynamic handle_continue. All these are changes in production code, which don’t bring any real design benefits to the Worker, because this GenServer is riddled with handle_info callbacks specific to it’s purpose.

Can you understand my point of view?

I agree your approach is perfect for a generic worker. But in this specific case, the worker is anything but generic, so it makes me reluctant to pay the price to have a more dynamic handle_continue.

test "saves base_url to dumper and metrics if it cannot connect to it" do
	Process.flag :trap_exit, true

	test = self()

	args = %{
		deps: %{
			dumper: %{
				save_failed_request: fn url ->
					send(test, {:dumper, :save_failed_request, [url]})
					{:ok, :saved}
				end
			},
			metrics: %{
				inc: fn id ->
					send(test, {:metrics, :inc, [id]})
					{:ok, 1}
				end
			}
		}
	}

	{:ok, _pid} = Worker.start_link(args)
	assert_receive {:dumper, :save_failed_request, args}, 600
	# Check args
	assert_receive {:metrics, :inc, args}, 600
	# Check args
	assert_receive {:EXIT, _, :timeout}, 600
	Process.flag :trap_exit, false
end

This is the test of your other topic rewritten to not use atomic counters, but callbacks sending messages. I’d say it’s even better than before because you can also send e.g. the args back to the test for assertion. You did already use the option of “passing in callbacks”, just that you used it to increment counters instead of sending messages.

1 Like

Aside:

What is that rationale for this structure?

			dumper: %{
				save_failed_request: fn url ->
					send(test, {:dumper, :save_failed_request, [url]})
					{:ok, :saved}
				end
			},
			metrics: %{
				inc: fn id ->
					send(test, {:metrics, :inc, [id]})
					{:ok, 1}
				end
			}

It seems a bit JavaScript-y/Object-y. If these callbacks only exist for test monitoring purposes, it should be possible to unify them into one single function using pattern matching and putting the arguments into a keyword list. (Then it should be possible to wrap it in a macro to remove the function calls from production code (similar to :compile_time_purge_matching) - though I’ve never done it).

This is how I inject the functions I am stubbing/spying/faking into the SUT. Why maps?
Because I like the syntax of my_map.deps.http.get.('blabla') rather than my_keyword[:deps][:http][:get].('blabal').

I have used keyword lists extensively by now, and I don’t really see the added value when compared to maps (you can also pattern match with them). Even if pattern matching with maps is more limited, I still prefer their less verbose usage any time of the day.

And no, I was not influenced by JavaScript :stuck_out_tongue:

Quite a brutal idea, but I still go for the first rule of macros: Don’t use macros.
I am not alone in my team, so I am not convinced by the price I would have to pay just because it would be fun. I would have to convince others, teach them and then maintain this position of mine. A price I am not going to pay alone without being convinced this truly has a gigantic benefit.

1 Like

The testing you are doing is invasive which is typical of Behaviour verification and therefore brutal to start with.

In this case the macros aren’t for fun but to ensure that production code isn’t burdened with testing overhead.

In the absence of macros keyword lists are cheaper to create than maps. That way the production code pays the minimum of overhead while the cost is moved to the testing code.

How about testing handle_continue/2 by calling handle_continue/2? If init/1 does nothing then this would be as simple as:

test "hadle_continue/2 does it job" do
  {:noreply, state} = MyServer.handle_continue(:foo, [])

  assert expected_state == state
end
1 Like

In the absence of macros keyword lists are cheaper to create than maps. That way the production code pays the minimum of overhead while the cost is moved to the testing code.

If you tell me of a real use case where using maps (instead of keywords) made your application sluggish and nightmare slow for your clients to use, then I will refactor the code to use keyword lists.

Until then, I shall treat the Maps VS Keywords debate as I treat everything else before doing performance testing and finding actual pain points: as premature optimization.

Still, it’s good to know this detail about Keywords, thanks for chipping in!

I don’t want to test handle_continue, I just want it to run before ExUnit starts verifying the assertions in the tests :smiley:

Is the SUT your process?

  • If yes, and these are unit tests, then my answer is - do not start the server at all. Just call functions and provide your own state directly. In that way you do not have the problem with asynchronous testing.
  • If that process isn’t SUT then mock it by starting different process with the same name
  • If these tests are integration tests then sleep in the test setup and then perform all actions.

Nothing prevents you from instead of using GenServer.call(my_server_pid, :message) use MyServer.handle_call(:message, make_ref(), state).

I can’t remember where I saw it now, but I’m sure I saw a post about using the :erlang.trace function to set up tracing on the SUT from the test. i.e. setup tracing for YourModule.handle_continue function and when it’s called tear down the tracing and continue with the assertions.

I can’t remember any more details than that, and I haven’t tried it myself but it seems possible to know the function has been called without having to add an extra message just for testing.

But what do you want to test then? Showing it might help understanding how to do it? If you want to test that a call made after the full initialisation (meaning init + :continue) results in either A or B according to the state set after the :continue executes, then doing a call with a big enough timeout should be (for practical purposes) deterministic (say 10secs timeout). If you want to test that side effects to the process state happen, then probably chaining manual calls to the handle methods should be enough, etc

1 Like

So I had a play around with this:

A simple server:

defmodule Tracetest.Server do
  use GenServer

  def start_link(arg) do
    GenServer.start_link(__MODULE__, [arg])
  end

  @impl true
  def init([arg]) do
    {:ok, :some_state, {:continue, arg}}
  end

  @impl true
  def handle_continue(_, state) do
    {:noreply, state}
  end
end

The test:

 test "can know when handle_continue finished" do
    :erlang.trace(:new, true, [:call, :return_to])
    :erlang.trace_pattern({Tracetest.Server, :handle_continue, 2}, true, [:local])

    {:ok, pid} = Tracetest.Server.start_link(:foo)

    assert_receive {:trace, ^pid, :call,
                    {Tracetest.Server, :handle_continue, [:foo, :some_state]}}

    assert_receive {:trace, ^pid, :return_to, {:gen_server, :try_dispatch, 4}}

    assert true == true
  end

I setup the same tracing calls in the iex console, started the server there, and ran flush to see what trace messages got received. There were only two so it was straightforward to match on them. I am sure your real code is a lot more complicated than this but perhaps just waiting till the method had been called rather than a return would be enough to avoid the race condition.

4 Likes

The last few topics that you have posted suggest to me that you need a new boundary - for example between the worker process and the http client library.

Essentially that puts the http client library in the horrible outside world and the new module becomes the way your application wants to interact with “the service” - i.e. it becomes a contract that is specified by your application.

Once that boundary is established a test double can be slipped in that the worker process uses to complete its flow to produce results (e.g. a response message) that are directly observable by the test case process.

Depending on the necessary sophistication of the test double “test first” may need to take a back seat for a while.

Especially for you I have prepared small library that I want to publish on Hex later:

-module(gen_local).

%% API exports
-export([start/2,
         call/2,
         cast/2,
         send/2]).

-record(state, {state, module}).

%%====================================================================
%% API functions
%%====================================================================

start(Module, Arg) ->
    case Module:init(Arg) of
        {ok, State} -> {ok, #state{state = State, module = Module}};
        {ok, State, {continue, Msg}} ->
            handle_continue(Msg, #state{state = State, module = Module});
        {ok, State, _Timeout} -> {ok, #state{state = State, module = Module}};
        ignore -> ignore;
        {stop, Reason} -> {stopped, Reason}
    end.

call(#state{module = Module, state = State} = S, Msg) ->
    Tag = make_ref(),
    case Module:handle_call(Msg, {self(), Tag}, State) of
        {reply, Reply, NewState} -> {ok, Reply, S#state{state = NewState}};
        {reply, Reply, NewState, {continue, Cont}} ->
            case handle_continue(Cont, S#state{state = NewState}) of
                {ok, NewNewState} -> {ok, Reply, NewNewState};
                Other -> Other
            end;
        {reply, Reply, NewState, _Timeout} -> {ok, Reply, S#state{state = NewState}};
        {noreply, NewState} -> async_reply(Tag, S#state{state = NewState});
        {noreply, NewState, {continue, Cont}} ->
            case handle_continue(Cont, S#state{state = NewState}) of
                {ok, NewNewState} -> async_reply(Tag, S#state{state = NewNewState});
                Other -> Other
            end;
        {noreply, NewState, _Timeout} -> async_reply(Tag, S#state{state = NewState});
        {stop, Reason, NewState} -> {stopped, Reason, NewState};
        {stop, Reason, Reply, NewState} -> {stopped, Reason, Reply, NewState}
    end.

cast(S, Msg) ->
    handle_reply(fake_call(S, handle_cast, Msg), S).

send(S, Msg) ->
    handle_reply(fake_call(S, handle_info, Msg), S).

%%====================================================================
%% Internal functions
%%====================================================================

fake_call(#state{state = State, module = Module}, Callback, Msg) ->
    Module:Callback(Msg, State).

handle_continue(Msg, S) ->
    handle_reply(fake_call(S, handle_continue, Msg), S).

async_reply(Tag, State) ->
    receive
        {Tag, Reply} -> {ok, Reply, State}
    end.

handle_reply({noreply, NewState}, S) ->
    {ok, S#state{state = NewState}};
handle_reply({noreply, NewState, {continue, Msg}}, S) ->
    handle_continue(Msg, S#state{state = NewState});
handle_reply({noreply, NewState, _Timeout}, S) ->
    {ok, S#state{state = NewState}};
handle_reply({stop, Reason, NewState}, _S) ->
    {stopped, Reason, NewState}.

This provide interface similar to the gen_server but functions are run synchronously instead of asynchronously. This should make testing of gen_server a little bit easier (however there still is no timeout, so you cannot test that).

EDIT:

Published on GitHub for now.

1 Like

When working with a lot of asynchronous calls (handle_continue, handle_info, :timer.send_interval, etc) I often make use of a helper to automate testing for results repeatedly. This avoids using fixed Process.sleep/1 calls where the sleep period is statically defined and either too long (which is pointlessly slow) or too short (which causes flickering tests):

def with_backoff(opts \\ [], fun) do
  total = Keyword.get(opts, :total, 50)
  sleep = Keyword.get(opts, :sleep, 10)

  with_backoff(fun, 0, total, sleep)
end

def with_backoff(fun, count, total, sleep) do
  fun.()
rescue
  exception in [ExUnit.AssertionError] ->
    if count < total do
      Process.sleep(sleep)

      with_backoff(fun, count + 1, total, sleep)
    else
      reraise(exception, __STACKTRACE__)
    end
end

I typically use this from higher level integration tests, where it isn’t desirable to call handle_ functions directly.

1 Like

In your production system how do you know that the workers are available to run? Also what type of failure do you get in your tests when the workers aren’t finished initializing? If you’re just doing a handle_call to the workers then I would expect it to work just fine (unless the initialization takes more than 5 seconds which would cause you to exceed the default handle_call timeout)