Domain events as GenServer messages

easco · August 25, 2018, 9:20pm

In my work with OTP, I’ve usually sent messages to a process when I want to invoke behavior - an imperative action, or command.

I’ve been learning about Event Storming and Event Sourcing, and it occurred to me that using messaging to send Domain Events might be a more effective tool in some cases.

I’m curious to hear the reports of the clever folks who came to this realization before me. If you’ve created such a thing, where the messages between processes are Domain Events, what was your context and how did the technique work for you?

blatyo · August 26, 2018, 2:31am

I do this with RabbitMQ. The problem with doing it in Elixir, in my opinion, is that the producer is typically aware of the consumer (e.g. send(consumer_pid, :message)). You want some sort of pubsub system, where processes interested in an event can register themselves and the event producer can just publish the event without being aware of who will consume it. Using a Registry as a dispatcher is one way to solve that.
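A minimal sketch of the Registry-as-dispatcher idea, assuming a single node and no persistence. The module name `EventBusSketch`, the topic `:order_placed`, and the `{:event, topic, event}` message shape are hypothetical; only `Registry` itself comes from the standard library.

```elixir
defmodule EventBusSketch do
  # Hypothetical name for the registry process.
  @registry EventBusSketch.Registry

  # Duplicate keys let many processes register under the same topic.
  def start_link do
    Registry.start_link(keys: :duplicate, name: @registry)
  end

  # Called by a consumer: registers the calling process for a topic.
  def subscribe(topic) do
    {:ok, _owner} = Registry.register(@registry, topic, [])
    :ok
  end

  # Called by a producer: it never learns who the consumers are.
  def publish(topic, event) do
    Registry.dispatch(@registry, topic, fn entries ->
      for {pid, _value} <- entries, do: send(pid, {:event, topic, event})
    end)
  end
end
```

A consumer would call `EventBusSketch.subscribe(:order_placed)` and then handle `{:event, :order_placed, payload}` in `handle_info/2` (or a plain `receive`). Delivery here is still a bare `send/2`, so an event is lost if the subscriber crashes before handling it.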

In an RPC-style system, the producer is responsible for handling failure of the consumer, which tends to be easier, because the producer has the data. In an event-based system, the consumer is responsible for handling failure. This makes it harder to recover, because the data in the event came from another system. This is why I use RabbitMQ. RabbitMQ provides persistence of messages between the producer and consumer. If the consumer fails while processing a message (i.e. hasn’t acked the message), the broker retains the message and redelivers it. This gives you at-least-once semantics, whereas between GenServers you would by default get at-most-once.
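To make the at-least-once idea concrete, here is a small in-memory sketch. It is not how RabbitMQ works internally, and it has no durability. It only shows the rule that an event stays pending until the handler acknowledges it. The module `AckQueueSketch` and its functions are hypothetical names.

```elixir
defmodule AckQueueSketch do
  # Events stay in `pending`, keyed by id, until a handler acks them.
  defstruct pending: %{}, next_id: 1

  def new, do: %__MODULE__{}

  # Producer side: record the event before anyone tries to handle it.
  def publish(%__MODULE__{} = q, event) do
    id = q.next_id
    {id, %{q | pending: Map.put(q.pending, id, event), next_id: id + 1}}
  end

  # Consumer side: offer every pending event to the handler, oldest first.
  # An event is removed only when the handler returns :ok (the "ack").
  # Anything else, including a raised exception, leaves it for redelivery.
  def deliver(%__MODULE__{} = q, handler) when is_function(handler, 1) do
    Enum.reduce(Enum.sort(q.pending), q, fn {id, event}, acc ->
      case safe_handle(handler, event) do
        :ok -> %{acc | pending: Map.delete(acc.pending, id)}
        _other -> acc
      end
    end)
  end

  defp safe_handle(handler, event) do
    handler.(event)
  rescue
    _exception -> :error
  end
end
```

Because a failed event is offered again on the next `deliver/2`, a handler can see the same event more than once, so handlers in this style generally need to be idempotent.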

I’ll say it works well for me. But I use it primarily as an integration pattern between applications rather than internal to one application. Although, nothing would preclude you from using something like RabbitMQ with a single application.

easco · August 26, 2018, 4:28am

Thank you. Those are some interesting points.

I can see the advantage you get from having the event queue separated from the event processor. In the most basic case, a process’s mailbox would stand in for RabbitMQ in your setup, at least in the sense that it queues events for an event processor. But you are right that if the process fails, that mailbox goes with it. Depending on the processor and the events, dropping those on the floor might be a Bad Thing.

For the intra-application case I could imagine Phoenix PubSub, or even Flow and GenStage as that intermediary to isolate the stream of events from the processor. (Hex probably has other message bus things that could carry the event stream). Of course RabbitMQ is going to use persistent storage while you’d probably need to add that yourself using the other solutions. (DETS perhaps).

blatyo · August 26, 2018, 5:42am

Yep, those could work fine. I think the problem with using GenStage or Flow specifically is that they are mechanisms for providing back pressure: basically a way to say, don’t generate an event. If the event is generated from a user action, for example, then they don’t make sense by themselves, because the event is going to happen independent of whether there is demand for it. You could keep those events in memory, but if your application can’t process them fast enough at peak load, it’ll crash. For GenStage and Flow to work, you need some queue to pull work from that won’t be subject to the same constraints. That could be DETS like you mention, or a database, or something like RabbitMQ.

Similarly, Phoenix PubSub has no mechanism for back pressure. So you need to process events faster than they come in, or your application could be overwhelmed. For that reason, you usually want each handler to do something fast, which means doing the least amount of work possible. What I see people do most often in this situation is rely on some persistence mechanism to queue the events and then process them at their leisure.

axelson · August 26, 2018, 6:54pm

With GenStage (depending on your use case) you can use load shedding to prevent the system from toppling over. Discord has a great blog post about that: Discord Blog
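As a rough illustration of load shedding in general (not necessarily the approach in the Discord post, and not tied to GenStage), a bounded buffer can drop the oldest event once it is full. `SheddingBufferSketch` and its fields are hypothetical names, and dropping the oldest rather than the newest event is an assumption about what the use case tolerates.

```elixir
defmodule SheddingBufferSketch do
  # `size` tracks the queue length so we avoid walking the queue on every push.
  defstruct [:queue, :max, size: 0, dropped: 0]

  def new(max) when is_integer(max) and max > 0 do
    %__MODULE__{queue: :queue.new(), max: max}
  end

  # Buffer is full: shed the oldest event to make room for the new one.
  def push(%__MODULE__{size: size, max: max} = buf, event) when size >= max do
    {{:value, _oldest}, rest} = :queue.out(buf.queue)
    %{buf | queue: :queue.in(event, rest), dropped: buf.dropped + 1}
  end

  # Buffer has room: just enqueue.
  def push(%__MODULE__{} = buf, event) do
    %{buf | queue: :queue.in(event, buf.queue), size: buf.size + 1}
  end

  # Events in arrival order, oldest first.
  def to_list(%__MODULE__{queue: queue}), do: :queue.to_list(queue)
end
```

Keeping a `dropped` counter makes the loss visible, which matters if you later need to decide whether shedding is acceptable for a given event type.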

blatyo · August 26, 2018, 6:57pm

Fair point. Everything I said was with the assumption that losing events should be avoided. Sometimes, that’s not reasonable.

easco · August 27, 2018, 8:57pm

I get your point. In my original post I was thinking more along the lines of using messages to convey domain events in the case where a domain event triggers functionality further down the stream.

I mention that just to contrast it with Event Sourcing, where the sequence of events becomes the source of truth. In Event Sourcing, being careful not to drop any events on the floor is critical. In my original thoughts that was not a requirement.

(But I appreciate your thoughts on the topic. They are still valuable, and interesting, even if not part of my original conception).

MrDoops · August 27, 2018, 9:27pm

You could check out event_bus for inter-app communication. External services like RabbitMQ, Nats, Kafka, etc. are really good for integration across service boundaries, but it sounds like you’re looking for more of an inter-app solution.

karmajunkie · August 29, 2018, 5:09pm

In both cases, what you’re sending is a message, plain and simple. Recognition of the message as a “command” or “event” is a layer of semantics you put on top of the mechanics that’s useful primarily for humans to understand what’s going on. To paraphrase the Honeybadger: “BEAM don’t give a f*ck…”

As you mention further down, whether you store these messages to be able to recreate a data model is another couple of steps beyond this that’s useful to do sometimes.

In terms of design and architecture, the use of “domain events” implies a few things you have to (and you seem to) be aware of. They indicate things that happened in the past, so losing them is a corruption of your data. It’s also easy to get the boundaries wrong or model the messages naively, especially if you’re unfamiliar with the design technique (and even if you’ve used it before!). If you’re not doing event sourcing, that’s less of a problem: you can just update your code. But if you’re storing those events, that’s a design mistake you have to live with for a while. So the analysis phase takes on great significance.

As @blatyo mentioned, you may be thinking more about the differences between pubsub and request/response messaging. Both are useful in the right context. Event messages are a better fit for pubsub, where you’re advertising to the rest of your system things that have happened, whereas with command messages you typically don’t want every handler to fire. One consideration you’ll need to think about here is how your application is distributed: do you need multi-node pubsub? Do you need durability of messages? Acknowledgment of handling? What are the delivery semantics, e.g. at-least-once or at-most-once? All of these will affect the choice of message transport. For command messages, you usually want request/response, or at least a point-to-point message (where you don’t care about or don’t want a response). Being able to look up the relevant node in a registry to send the message to is probably enough to deal with.
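For the command side, a point-to-point request/response can be sketched with a unique-key Registry and `GenServer.call/2`. This assumes a single node. `CartSketch`, the `"cart-1"` id, and the `:add_item` command are hypothetical names used only for illustration.

```elixir
defmodule CartSketch do
  use GenServer

  # Hypothetical name for the registry process.
  @registry CartSketch.Registry

  # Unique keys: at most one process may own a given cart id.
  def start_registry, do: Registry.start_link(keys: :unique, name: @registry)

  def start_link(cart_id) do
    GenServer.start_link(__MODULE__, [], name: {:via, Registry, {@registry, cart_id}})
  end

  # Command: addressed to exactly one handler, and the caller waits for a reply.
  def add_item(cart_id, item) do
    case Registry.lookup(@registry, cart_id) do
      [{pid, _value}] -> GenServer.call(pid, {:add_item, item})
      [] -> {:error, :no_such_cart}
    end
  end

  @impl true
  def init(items), do: {:ok, items}

  @impl true
  def handle_call({:add_item, item}, _from, items) do
    items = [item | items]
    {:reply, {:ok, length(items)}, items}
  end
end
```

The contrast with pubsub is in the key type: unique keys route a command to one owner, while duplicate keys fan an event out to every subscriber.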

There’s a whole rabbit-hole of stuff to think about when you start moving in this direction for architecture. If you’re going to bite off ES architecture, I’d recommend using the commanded project in Elixir and promise yourself not to let yourself get bogged down in the mechanics until you’ve been doing it for at least a year. Every time I see someone get into this the first time the inclination seems to be to try to write your own CQRS/ES framework, and frankly, the world doesn’t need another one from someone learning about it as they go. (I speak from experience here, though at least I didn’t do it in Elixir… )

sneako · August 29, 2018, 5:20pm

Here is a talk from Empex LA earlier this year where Chris talks about an event bus built with GenStage: https://www.youtube.com/watch?v=Aa--NDjL9SI