I am considering building an Event Sourcing datastore in Elixir


#1

Hello, everyone!

I have a new pet project I am considering spending some time with,
which is to build a simple Event Sourcing engine to be used as ‘database’ for some applications.

For those who don’t know what Event Sourcing is: the idea is that instead of mutating your ‘business domain data model’ directly
(and thus throwing away information with every change), you build up a big log of state changes (called events) as they come in.
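As a minimal sketch of this idea in Elixir (the event shapes and field names here are invented for illustration), the current state is just a fold over the log:

```elixir
# Instead of overwriting a record in place, we append facts ("events")
# and derive the current state by folding over them.
events = [
  %{type: :user_registered, name: "Alice"},
  %{type: :email_changed, email: "alice@example.com"},
  %{type: :email_changed, email: "alice@example.org"}
]

state =
  Enum.reduce(events, %{}, fn
    %{type: :user_registered, name: name}, acc -> Map.put(acc, :name, name)
    %{type: :email_changed, email: email}, acc -> Map.put(acc, :email, email)
  end)

# `state` is now %{name: "Alice", email: "alice@example.org"},
# and the earlier email address is still recoverable from the log.
```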

This makes it a lot easier to alter the way you consume information later on, as you still have all the original data.
It is also a wonderful technique for auditing and for ‘going back when something went wrong’.

Another interesting thing is that Event Sourcing can very much be built in a Functional Reactive way,
which makes it very clear how data is altered in the system as changes occur.

Of course, there are some drawbacks as well:

  • Somewhat more disk/memory usage than just using the most recent version of the Application’s state. (But since storage space is ridiculously cheap nowadays, this is not that much of a drawback)
  • More up-front thinking is required before building the system.
  • Properly doing migrations might be difficult, possibly requiring you to replay all or large parts of the events that have happened so far.

Still, Event Sourcing is a very interesting concept. But engines that do Event Sourcing are still quite scarce.

So I’d like to build one.
One of the things I’d like to tinker with is building the business domain data model lazily,
in that it will only calculate things when required (and then cache them for a little while). This might
greatly reduce the memory usage of ‘keeping everything in memory the whole time’.
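To make that lazy-with-cache idea a bit more concrete, here is a rough GenServer sketch. The `{:set, key, value}` event shape, the hard-coded TTL, and the API where the caller supplies the event log are all invented for the example, not a settled design:

```elixir
defmodule LazyProjection do
  # Hypothetical sketch: the projection is only folded from the event log
  # when queried, and the result is cached for a short TTL.
  use GenServer

  @ttl_ms 5_000

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  # Returns the model, recomputing it from `events` only when the cache is stale.
  def get(events), do: GenServer.call(__MODULE__, {:get, events})

  @impl true
  def init(_opts), do: {:ok, %{cached: nil, cached_at: 0}}

  @impl true
  def handle_call({:get, events}, _from, state) do
    now = System.monotonic_time(:millisecond)

    if state.cached != nil and now - state.cached_at < @ttl_ms do
      {:reply, state.cached, state}
    else
      # The expensive part: folding the whole event log into a fresh model.
      model = Enum.reduce(events, %{}, fn {:set, key, value}, acc -> Map.put(acc, key, value) end)
      {:reply, model, %{cached: model, cached_at: now}}
    end
  end
end
```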

Argh, have to go! I will continue writing my thoughts in the future. Feedback is very welcome.


#2

I’d encourage you to look around at the available solutions first. Ben Smith (@slashdotdash) has put together a very nice framework for ES and CQRS (https://github.com/commanded/commanded, https://github.com/commanded/eventstore) and, if you need something more than what you can do with Postgres, it also has an integration with Greg Young’s EventStore (https://geteventstore.org). I’m personally a fan of the idea that OTP provides almost everything needed for event sourcing, with the arguable exception of the store itself.

It’s very tempting, especially when first getting into ES, to focus on the tooling and mechanics of it, because it’s very different from what we’re used to after years and years of CRUD apps. But it’s very easy to get lost in this and hand-roll subpar solutions until you’ve actually worked on some ES-driven apps in production. The most interesting thing to me, especially after I’ve been doing this for a while, is the design of information boundaries, which is really where the risk is in your application. It’s a choice you have to live with for a while, so it really helps to spend as much time as you can learning the boundaries of your data.


#3

Thanks for the mention @karmajunkie.

@Qqwy Feel free to take a look at the open source libraries I’ve been building over the last year or so for building CQRS/ES apps in Elixir. You can see some of the design decisions I’ve taken, based on lessons learnt while building and working on at least three previous event sourced applications.

I’m open to pull requests and will gladly accept contributions to these libraries.

For more Elixir and CQRS/ES resources take a look at Awesome Elixir and CQRS. There are another three Elixir libraries for event sourcing listed there.

There’s an #eventsourcing channel on the official Elixir Slack group too if you’d like to chat about the topic.


#4

Thank you very much for your replies!

You are both right that I should take a look at the libraries that already exist. I took some time to do so yesterday and today (as much as my travels during vacation allowed). My quick glances were by no means comprehensive, but I do have some more questions to ask.
And besides that, I do want to finish my original train of thought, now that I have a few minutes to spare again :slight_smile: .

As for my own work with Event Sourcing so far: I have worked with it directly in two different projects, but both used a simple event sourcing log that someone else had prepared. However, I do work daily with the Ethereum blockchain (amongst other blockchains and distributed ledgers), whose mechanics can be seen as those of an extremely strict Event Sourcing engine, including the significant drawbacks that inevitably entails.

Thought snippets:

  • It would be wonderful to write an event store in such a way that it works in a distributed fashion. Most of the linked libraries store the events in some external form of persistence (Why store a list of events in a relational database?). It might be interesting to keep everything within OTP itself (using tools like the Riak Core library, Mnesia, Swarm, or GenServer.multi_call directly).
  • When working with an event store, strong consistency is absolutely required when storing events. This is because for nearly all (combinations of) events, order matters: in the general case it is impossible to successfully interleave two event timelines when healing a netsplit.
  • However, for certain projections and/or event handlers, this could be relaxed somewhat, when a (combination of) event(s) is:
    • Idempotent: applying the event once or multiple times has the same result. Examples: setters, any operation that externally keeps track of whether it already happened.
    • Associative: the grouping of applications does not matter (left-to-right, right-to-left or divide-and-conquer), as long as the relative order is preserved. This can mostly be used to create ‘early return’ versions of aggregate creation functions. Examples: (last-write-wins) setters, addition, multiplication.
    • Commutative: the ordering of events does not matter for the result. Examples: addition, multiplication, maximum, minimum.
  • Running through the events and filtering them to pass them on to the different event handlers/projections could be done using Flow. Creating a ‘strongly consistent’ store for the events first might be implemented as one GenStage stage.
  • Having one event stream that is filtered for the different handlers/aggregates and having many separate event streams that might later be combined are equivalent, allowing us to pick whichever is more performant.
  • Most data in our models can be seen/modelled as nested hashmaps (where some of the levels/‘branches’ might not care about the keys and/or the ordering of elements, i.e. being respectively lists and sets). Most events could contain hierarchical references (‘paths’) into this structure as well, which would make them easy to match on inside handlers.
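The replay-as-a-fold idea behind these snippets can be sketched as follows (event shapes invented): `:add` is commutative, so its ordering can be relaxed, while a last-write-wins `:set` is idempotent, so replaying a duplicate is harmless:

```elixir
defmodule Replay do
  # Rebuilding an aggregate is a left fold over its ordered event list.
  # Commutative events (:add) tolerate reordering; idempotent events (:set)
  # tolerate duplicates. Neither property holds for events in general.

  def replay(events, initial \\ %{}) do
    Enum.reduce(events, initial, &apply_event/2)
  end

  defp apply_event({:set, key, value}, state), do: Map.put(state, key, value)
  defp apply_event({:add, key, n}, state), do: Map.update(state, key, n, &(&1 + n))
end
```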

I do not currently understand the need for a separate library that handles the creation of ‘commands’. Are commands not basically (once validated) events to be stored as well? Why is extra routing for this necessary?

The same for Process Managers: How is this not just an aggregate that creates certain new events based on an observed constellation of events? I believe these can just be programmed as a Finite State Machine (possibly in a Reactive way) and are in essence pure, i.e. do not necessarily need their own dedicated long-running process.

I hope you can shine some light on this, @slashdotdash. I would love to understand the rationale behind Commanded better :smiley: .


#5

As I mentioned, I’m of the opinion that OTP provides damn near everything you need to do ES properly—with the exception of an event store. While I think it’d be cool if there was one, I’m perhaps skeptical because of the number of half-baked stores out there I’ve seen over the last several years. It seems like this is the thing everybody who jumps into ES wants to build, and there’s not enough follow-through or value added to make it worth doing when there are already options available.

I’ve not done that much with Flow or GenStage, so maybe I’m off here, but these seem to be fundamentally demand-based tools. ES is largely a push-based paradigm and I think the mismatch there is significant.

I’m with you on the first part, not on the second.

CQRS is where the notion of commands and aggregates leaks into ES from. CQRS is, at its root, meant to be a stepping stone for object-oriented developers to move towards a functional data pattern. They’re not necessary in Elixir. Command messages are just… messages. An aggregate is just a projection like any other. The functions that get called for commands are also like any other function. There’s no reason to persist command messages for their own sake; they’re meant to be transient messages, processed like a queue, which is why they’re usually given separate treatment. If you want to keep a log of commands for audit purposes or something, that’s perfectly OK (and sure, the event log can be used for this purpose), but it’s not a requirement. There is literally nothing special about a “command” other than the fact that it can be rejected if it is invalid.
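A minimal sketch of this view, with all names invented: the command handler is a plain function that may reject the request, while applying an event is a plain function that cannot fail:

```elixir
defmodule Account do
  # A "command" is just a request: the function can reject it.
  def debit(%{balance: balance}, amount) when amount > balance do
    {:error, :insufficient_funds}
  end

  def debit(_state, amount) do
    {:ok, [%{event: :account_debited, amount: amount}]}
  end

  # An event is a fact: applying it always succeeds.
  def apply_event(%{event: :account_debited, amount: amount}, state) do
    %{state | balance: state.balance - amount}
  end
end
```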

Process managers are not simply for publishing new events; they issue commands as well. Very frequently these are integration points with external systems, so it’s not simply a matter of adding new events into the stream. They can be stateless, but often have their own internal state and associated rules, which is why processes are a good fit for them. (Note that while they’re often conflated with sagas, the latter is a different concept describing a single long-running distributed transaction with complex failure modes and compensation logic.) State machines may be a good implementation pattern for both, depending on function.


#6

Why store a list of events in a relational database?

Using a relational database, such as PostgreSQL, provides a robust, battle-tested, and performant platform on which to build an event store. It also brings plenty of tooling for production: full/partial/hot backups; cluster support; diagnostics and profiling; and more. Most databases use an append-only log to persist transactions anyway, so it’s not so dissimilar to event sourcing. In the EventStore library I also disable UPDATE and DELETE statements on the events table to make it append-only and immutable.

That said, you could write an event store that simply writes files to disk. The public API is small: append events to a stream, read a stream forward. However, you then need to deal with all the issues that inevitably come up when writing files, and provide the tooling I mentioned above yourself.
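To illustrate how small that public API is, here is a toy in-memory store built on an Agent. This is purely illustrative; a real store must also provide durable, atomic, ordered commits and concurrency checks:

```elixir
defmodule InMemoryStore do
  # Toy event store exposing only the two calls described above:
  # append events to a stream, and read the stream forward.

  def start_link, do: Agent.start_link(fn -> %{} end, name: __MODULE__)

  # Append a list of events to the end of a named stream.
  def append_to_stream(stream_id, events) when is_list(events) do
    Agent.update(__MODULE__, fn streams ->
      Map.update(streams, stream_id, events, &(&1 ++ events))
    end)
  end

  # Read all events of a stream, oldest first.
  def read_stream_forward(stream_id) do
    Agent.get(__MODULE__, &Map.get(&1, stream_id, []))
  end
end
```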

Another interesting option for an event store is to use the new Redis streams. That looks like a really good fit for event sourcing, but remember to enable fsync per write.

Running through the events and filtering them to pass them on to the different event handlers/projections could be done using Flow

For event handlers I took the approach of having them run independently and fully autonomously. This is similar to how Greg’s Event Store works with its competing consumers model. Why would you want to do this? It allows handlers to run at different speeds; typically you have slow async handlers that can lag behind (e.g. sending emails, third-party API requests), but you don’t want them to hold up read model projections, so as to minimise query latency.

Autonomous subscriptions allow you to add new handlers and replay all events from the beginning of time, or restart a handler to rebuild a projection. I’ve implemented a hybrid push/pull model for the Event Store subscriptions, where appended events are published to subscribers, but they are buffered per subscriber and use back-pressure to ensure the subscriber isn’t overwhelmed. The subscription falls back to pulling events from the store when it gets too far behind, until it has caught up again.
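A heavily simplified sketch of that buffered push model (the names, the limit, and the `:overflow` reply are all invented; the real subscription additionally falls back to pulling from storage until it has caught up):

```elixir
defmodule BufferedSubscriber do
  # Pushed events are buffered and drained through a (possibly slow) handler.
  # Once the buffer is full, the subscriber replies :overflow as a
  # back-pressure signal instead of accepting more pushes.
  use GenServer

  @max_buffer 100

  def start_link(handler), do: GenServer.start_link(__MODULE__, handler)

  # The store pushes each appended event here.
  def publish(pid, event), do: GenServer.call(pid, {:publish, event})

  @impl true
  def init(handler), do: {:ok, %{handler: handler, buffer: :queue.new(), size: 0}}

  @impl true
  def handle_call({:publish, _event}, _from, %{size: size} = s) when size >= @max_buffer do
    {:reply, :overflow, s}
  end

  def handle_call({:publish, event}, _from, s) do
    send(self(), :drain)
    {:reply, :ok, %{s | buffer: :queue.in(event, s.buffer), size: s.size + 1}}
  end

  @impl true
  def handle_info(:drain, s) do
    case :queue.out(s.buffer) do
      {{:value, event}, rest} ->
        s.handler.(event)          # run the handler on the oldest event
        send(self(), :drain)       # keep draining until the buffer is empty
        {:noreply, %{s | buffer: rest, size: s.size - 1}}

      {:empty, _} ->
        {:noreply, s}
    end
  end
end
```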

You could use GenStage for this, but I would recommend using an individual flow pipeline per handler, not one flow for all handlers, since GenStage’s broadcast dispatcher can only go as fast as the slowest consumer. You also want any event handlers to run from the event store, after the events have been atomically persisted. Appending events to the store should guarantee that a success reply is returned only after committing to the underlying storage.

I do not currently understand the need for a separate library that handles the creation of ‘commands’.

As @karmajunkie says, commands are simply request messages that can be rejected (e.g. debit account), whereas events are messages of fact, stating something that has already happened and cannot be rejected (account debited, account is overdrawn).

Process managers are the inverse of aggregates. An aggregate handles commands and creates events; a process manager handles events and creates commands. They may both have state that helps them to either validate a command (aggregates) or route a command to the appropriate aggregate (process managers). You can model both of these as finite state machines. Often I simply use a state field and use Elixir’s pattern matching to do just that, rather than explicitly modelling the states and allowed transitions. @sasajuric’s fsm is one library that could be used.
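A small sketch of that inverse relationship and of the plain state field approach, using an invented money-transfer domain: the process manager handles events, pattern matches on its `:state`, and emits commands:

```elixir
defmodule TransferProcess do
  # Process manager sketch: events in, commands out, with a :state field
  # pattern-matched in the function heads instead of an explicit FSM library.
  # The transfer domain and all names here are illustrative.

  defstruct state: :pending, transfer_id: nil

  # Money left the source account: command the deposit on the target.
  def handle(%__MODULE__{state: :pending} = pm, {:money_withdrawn, id, amount, to}) do
    {%{pm | state: :withdrawn, transfer_id: id}, [{:deposit, to, amount}]}
  end

  # Deposit confirmed: the process completes and issues no further commands.
  def handle(%__MODULE__{state: :withdrawn} = pm, {:money_deposited, _id, _amount}) do
    {%{pm | state: :completed}, []}
  end
end
```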

The approach I’ve taken is heavily influenced by my previous experience developing CQRS/ES applications using an object-oriented language (C#), not necessarily from a functional perspective.

Hope this helps you out!


#7

There’s another distinction between commands and events: commands have one, and only one, handler, whereas events can have many handlers, or even none.


#8

@Qqwy If you are looking to reproduce the existing event sourcing paradigm, then I wholeheartedly agree with pretty much everything said by @karmajunkie and @slashdotdash .

But if you are working on this as a pet project, and you are already familiar with blockchains - especially as they pertain to a restricted event sourcing paradigm - then I’m going to throw out there that you may be interested in what I’ve done with ibGib as it pertains to an ES-like data store.

I mention this because the underlying data construct creates a graph database with “events” as a subset of its structure. The entire data store is self-similar, in that each data item (called an ibGib) is tree-shakeable, with full integrity, because it links to other data via merkle links. This makes it similar to the merkle links used in a blockchain, but whereas those are more like linked lists, ibGib creates an entire merkle “forest” (terminology borrowed from Juan Benet at IPFS) that ends up having properties of both a DAG and a DCG: a DAG, in that each immutable ibGib frame is acyclic, but a DCG, in that each ibGib over time creates a timeline where an ibGib can point to itself, but only to a past version. Because of the “tree-shakeable” design, there is no single “main event stream” source of truth, which makes it scalable without a whole lot of hacks outside of the data store itself.
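For readers who have not met merkle links before, here is a minimal sketch of the general idea (this is not ibGib’s actual format, just an illustration): each immutable frame embeds the hash of its predecessor, so a timeline can be integrity-checked without any central stream:

```elixir
defmodule Frame do
  # Illustrative merkle-linked record: immutable data plus the hash
  # of the frame it evolved from.
  defstruct [:data, :prev_hash]

  def hash(%Frame{} = frame) do
    :crypto.hash(:sha256, :erlang.term_to_binary(frame)) |> Base.encode16(case: :lower)
  end

  # Create the next frame in a timeline, linked to its predecessor by hash.
  def evolve(%Frame{} = prev, new_data) do
    %Frame{data: new_data, prev_hash: hash(prev)}
  end

  # Integrity check: does `child` really point at `parent`?
  def linked?(%Frame{} = parent, %Frame{} = child), do: child.prev_hash == hash(parent)
end
```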

I’ve recently left GitHub (hence my new EF account here), as I’m currently creating a version control system built on top of ibGib, because it is a natural fit and a use case that compels me to attack the distributed aspect. Just today I’ve gotten the version control basics working, and soon I will have it to the point where you can “push” your local IbGib.FS (file system) to the ibGib server, and then “pull” it down to some other node. But the interesting thing is that because of the self-similar design, this kind of cloning/branching is a natural fit, since files and folders are a subset of ibGib and their relationships. And the VCS aspect pertains not just to diffing text-based source files, but to any ibGib (like pinterest-style image boards, or whatever other data constructs). This is possible because the “events” (“dna” in ibGib) can be replayed on different timelines, much like replaying diffs, but at a semantic level.

So like I said, if you want to do the existing paradigm, I would recommend like the others say, to contribute to the existing repos. But if you’re wanting to leverage your blockchain and distributed skills, you can check out the website itself at www.ibgib.com. Just create a new ibGib, and fork/mut8 some things (or just go here or here - some of them have subitems, just like a folder, like this pic of one of my winter garden beds), and you can see the raw json generated for each ibgib record if you click “view > info”. You can see the “events” if you look through the “dna” rel8n. These events (which are ibGib themselves!) are immutable transforms that record the actual transformation of an ibGib, from A to A’ or B.

If you have any questions, or if you or anyone else is interested in the code, just let me know, as it’s still open-sourced. I’m almost done with the FS side and will hopefully be pushing the new version, and subsequently the source for the new version, within the next month.


#9

Thanks for pointing this out, @wraiford. I tend to approach these conversations as though someone’s going to put it into production, but as a learning experience it’s a very different proposition. (In my defense, I’ve seen a lot of experiments go into production, but that’s no reason to be as heavy-handed as all that!) And that sounds like an interesting piece of work, thanks for sharing it.


#10

@karmajunkie Indeed. Each of us brings his and/or her assumptions to every discussion. For example, “in production” can mean very different things, depending on the maturity of the technology and the goals and aspirations of the authors. Some of us are in the world of optimization in the admirable and inexorable procession to the local maximum, while others live on the bleeding edge of creation looking to break out to find the next. :wink:


#11

Have you seen Datomic, @Qqwy? It’s not exactly an event-sourced system, but it looks like one behind the scenes, where Datomic just gives you the aggregates.

I would love to see an implementation of Datomic in Elixir. :slight_smile:


#12

You can check out this article:
https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/