What is the idiomatic way to handle "callbacks"? Such as when a module needs to communicate some events asynchronously to a "client" module?

After getting through the basic learning materials on Elixir, one concept I am still struggling with is the idiomatic way to handle “callbacks”, i.e. when a module needs to communicate some events asynchronously to a “client” module.

One pattern that I see frequently in Elixir libraries (Cowboy, for instance) is to define a behaviour and have clients pass in a module that implements the behaviour. This seems similar to a Java class implementing an “interface”, but with the disadvantage that a module alone has no state, so I feel like I end up writing a lot of singletons with this paradigm.
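
To make that concrete, here is roughly what I mean (EventHandler and handle_event are names I made up for illustration):

defmodule EventHandler do
  # The library defines the contract...
  @callback handle_event(event :: term) :: :ok
end

defmodule MyHandler do
  # ...and the client implements it in a module, which by itself
  # carries no state between invocations.
  @behaviour EventHandler

  @impl true
  def handle_event(event) do
    IO.inspect(event, label: "got event")
    :ok
  end
end

The library is then handed the module name and calls MyHandler.handle_event(event) whenever something happens.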

Another pattern I’ve seen is to pass an anonymous function, which seems advantageous because client state can be captured inside the closure, but it becomes cumbersome once you have many different event types, since you end up passing lots of functions in a disorganized fashion.
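
For example (SomeLibrary.subscribe is a made-up function, just to show the closure capturing state):

user_id = 42

# The closure captures user_id, so the client's "state" travels with
# the callback instead of living in a module.
SomeLibrary.subscribe(fn event ->
  IO.puts("event #{inspect(event)} for user #{user_id}")
end)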

Finally, I could just accept a pid and send it messages. This seems the most straightforward, but you don’t get as much compile-time checking and it seems to go against the “OTP” way of doing things.

Thanks for reading!

4 Likes

It would help if you had a concrete example in mind, since otherwise the answers you get will trend towards “it depends”. But in general I will say that you should try to avoid keeping state around and instead just pass all the “state” you need via function arguments.

4 Likes

Finally, I could just accept a pid and send it messages. This seems the most straightforward, but you don’t get as much compile-time checking and it seems to go against the “OTP” way of doing things.

How so? That is exactly what GenServer.cast/2 and the handle_cast/2 callback are for.
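
For instance, a minimal consumer built that way might look like this (invented module name, standard GenServer callbacks):

defmodule EventConsumer do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, []}

  # The producer fires GenServer.cast(pid, {:event, data}); this clause
  # handles it asynchronously and accumulates the data in the state.
  @impl true
  def handle_cast({:event, data}, events) do
    {:noreply, [data | events]}
  end
end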

You may find GenServer docs: “handle_cast … should be used sparingly” Why? of interest.

when a module needs to communicate some events asynchronously to a “client” module.

Modules namespace code; processes communicate, and processes can maintain state - though even that is open to abuse.

If you have a Google account, you may benefit from downloading the free sample of Designing for Scalability with Erlang/OTP: Implement Robust, Fault-Tolerant Systems - it contains “Chapter 3: Behaviors” and most of “Chapter 4: Generic Servers”. Being exposed to these at a more basic level may be beneficial towards “thinking in processes” (I know it’s in Erlang, but you mentioned Cowboy, so I figured it’s OK).

5 Likes

I agree that this question is too general, so it’s hard to give a concrete answer. But I’ll briefly comment on these options.

This approach is usually used to keep both the generic code and the particular code running in the same process, but separate from all other processes. For example, in Cowboy, every connection is handled in a separate process. A GenServer, which also uses this approach, likewise runs in its own process. But in both cases, the particular implementation (i.e. the callback module code) runs in the same process as the implementation of the behaviour.

Also note that this approach is not stateless, and you don’t need to have singletons. Usually, there is an init callback which creates the initial state and returns it to the behaviour. The behaviour then passes the state to other functions, which return the new version of the state.
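
In sketch form (the names are invented, but this is the same shape GenServer uses):

defmodule EventSource do
  @callback init(arg :: term) :: {:ok, state :: term}
  @callback handle_data(data :: term, state :: term) :: {:ok, state :: term}

  # The behaviour owns the loop: it asks the callback module for the
  # initial state, then threads each returned state into the next call.
  def run(callback_module, arg, data_items) do
    {:ok, state} = callback_module.init(arg)

    Enum.reduce(data_items, state, fn data, acc ->
      {:ok, new_acc} = callback_module.handle_data(data, acc)
      new_acc
    end)
  end
end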

It is, however, true that for each particular implementation you need to write a module with well-defined callbacks.

The second pattern you mention, an example of which would be Agent, can be thought of as a lightweight variation that doesn’t require defining a separate module. I agree with you that this can become quite clumsy pretty easily, so I think it’s mostly suitable for very simple cases.
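
Agent shows the idea: the “callbacks” are plain anonymous functions, and the state they operate on lives in the Agent’s process:

# Start an Agent holding a counter; every update/get takes an
# anonymous function that receives the current state.
{:ok, pid} = Agent.start_link(fn -> 0 end)
:ok = Agent.update(pid, fn count -> count + 1 end)
1 = Agent.get(pid, fn count -> count end)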

This is essentially a pub-sub approach, and is also a valid option. Compared to the first two options, here the particular implementation runs in a separate process. This has its own set of trade-offs. It complicates the process structure, and might have performance implications. On the upside, it promotes fault-tolerance and scalability. The decision between this approach and the first two boils down to the question: should the generic code and the particular code run in the same process or not?
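
Stripped to its essence, the pattern is just message passing (a hand-rolled sketch, not any particular library’s API):

defmodule Publisher do
  # Fire-and-forget: a plain message to every subscriber pid, so a slow
  # or crashed subscriber never blocks the publisher.
  def publish(subscriber_pids, event) do
    Enum.each(subscriber_pids, &send(&1, {:event, event}))
  end
end

defmodule Subscriber do
  use GenServer

  def start_link(_arg), do: GenServer.start_link(__MODULE__, nil)

  @impl true
  def init(state), do: {:ok, state}

  # Events arrive as ordinary messages in this process's mailbox.
  @impl true
  def handle_info({:event, event}, state) do
    IO.inspect(event, label: "received")
    {:noreply, state}
  end
end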

In my opinion, a good guideline for running two things in separate processes is to separate failures and/or latencies. If you don’t want a failure of A to affect the success of B, and/or you don’t want a slowness of A to affect the latency of B, then you should consider separate processes. Assuming that A is generic, and B is particular, this would mean that you’d opt for the third option. If you’re uncertain, then the first or the second would be a better choice, since the concrete implementation can always decide to handle the event separately.

In other words, with the third option, the generic code enforces the process separation of generic/concrete, while the first/second options don’t.

A recent example of the third option I’ve encountered was when polling an external service. I have a process which periodically fetches some data from a service. Then I need to do a couple of independent processings of that data. I opted for the third option. The poller sends a message to subscriber processes. This means that the poller can run separately from its subscribers. If a single subscriber crashes, everything else keeps running. If the poller is stuck, say due to a network latency, the subscribers can still do the remaining work. Likewise, a slow subscriber won’t block other subscribers, nor the poller.
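
A stripped-down sketch of that shape (the fetch function and the message format are invented):

defmodule Poller do
  use GenServer

  @interval :timer.seconds(30)

  def start_link(subscribers), do: GenServer.start_link(__MODULE__, subscribers)

  @impl true
  def init(subscribers) do
    Process.send_after(self(), :poll, @interval)
    {:ok, subscribers}
  end

  @impl true
  def handle_info(:poll, subscribers) do
    data = fetch_from_service()
    # Each subscriber processes the data in its own process, isolated
    # from the poller and from the other subscribers.
    Enum.each(subscribers, &send(&1, {:polled_data, data}))
    Process.send_after(self(), :poll, @interval)
    {:noreply, subscribers}
  end

  # Placeholder for the real network call.
  defp fetch_from_service, do: :some_data
end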

If this seems vague, that’s because the answer is as concrete as the question 🙂 If you have a particular example at hand, then it may be worth explaining it in more detail.

11 Likes

Ok, these responses are very enlightening. Sorry for the vagueness of my question. A specific example that I came across recently was the Sippet library, which listens for packets on a UDP port and translates them into higher-level session messages. It uses the first option; you pass a module name that implements the “Sippet.Core” behaviour (https://hexdocs.pm/sippet/Sippet.Core.html#content).

This particular library seems to support binding only a single UDP port for the entire application, but it got me wondering how one might extend this interface to support instantiating multiple clients in an application, each potentially having its own state.

The first thing that came to mind for me was to just send messages to a pid, but I suppose that does put a little more burden on the client to manage its own processes, as opposed to the library being more of a “framework”.

As sasajuric points out, another way would be to add an init function where some state may be created and passed back to each callback. This sort of reminds me of the old C practice of passing an opaque void * along with a callback (not that that’s a bad thing). Another option might be to simply pass an “instance id” back along with each callback and let the client maintain a mapping between instance and state, but that seems more cumbersome, and I have not seen many examples of such a thing.
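
For illustration, the “instance id” variant might look something like this (purely hypothetical shape):

# Hypothetical: the library invokes the callback as
# InstanceHandler.handle_event(instance_id, event), and the client
# keeps a per-instance state mapping in a named Agent.
defmodule InstanceHandler do
  def start_link, do: Agent.start_link(fn -> %{} end, name: __MODULE__)

  def handle_event(instance_id, event) do
    Agent.update(__MODULE__, fn states ->
      Map.update(states, instance_id, [event], &[event | &1])
    end)
  end
end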

1 Like

For a while now pigeon has used anonymous functions to implement async callbacks for push notification responses. The “state” of the notification is managed in the internal worker, and when a response is received, a supervised Task is spawned to execute the function with the updated state as its only parameter.

I’ve found this really simplifies the API, and gives the greatest flexibility in how the callback is implemented.

def handle_response(%Pigeon.APNS.Notification{response: :success} = _notif) do
  # do something on successful push
end
def handle_response(%Pigeon.APNS.Notification{response: :bad_device_token} = _notif) do
  # remove device token from database
end
def handle_response(_notif) do
  # other things
end

n = Pigeon.APNS.Notification.new("message", "token", "push topic")
Pigeon.APNS.push(n, on_response: &handle_response/1)
2 Likes