Opinion on file & memory based event sourcing system

Storing events directly on disk is how both Axon Server (Java) and Event Store (C#) deal with persistence. The source code for both products is available to view on GitHub. For Event Store there’s a two-hour video by James Nugent (one of the core contributors) on the internals, and a page outlining the internal architecture in their official documentation.

Writing to disk is easy to start with but presents quite a few challenges which have already been solved by all modern DBMS (which is why my Elixir EventStore uses Postgres). Some considerations include flushing writes to disk to survive power loss; compaction of the event log and scavenging of disk space; backup/restore; handling data corruption; multi-node deployment; indexing and performance.
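
A minimal sketch of the first point, flushing to survive power loss (the path and event shape below are made up): append the record, then ask the OS to flush before treating the write as done.

{:ok, io} = :file.open("events.log", [:append, :raw, :binary])
:ok = :file.write(io, :erlang.term_to_binary(%{type: :account_opened}))
# Without the sync, an "acknowledged" write may still only live in OS buffers
# and can be lost on power failure.
:ok = :file.sync(io)
:ok = :file.close(io)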

Once deployed to production you also need to consider how devops can support and optimise the application for its data access patterns. Using an existing storage provider gives you access to tooling, expertise, and managed services so someone else can take care of hosting the database for you.

But I’d agree it would be an interesting problem to work on, even if you don’t need to use it at scale.

3 Likes

Commanded provides the Commanded.EventStore behaviour as a way of plugging in different event store adapters. If you wrote an event store persisting to disk you could use it with Commanded by adhering to the above behaviour (or writing an adapter/facade for compatibility). In my experience it’s the subscriptions that are the more complex part of building an event store.
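
To make the plug-in shape concrete, here’s a sketch. To be clear, this is not the real Commanded.EventStore behaviour (which defines more callbacks than this); it’s a deliberately tiny, made-up behaviour purely to illustrate the adapter idea.

defmodule MyApp.EventStore do
  @moduledoc "Made-up, minimal event store behaviour (illustration only)."
  @callback append(stream_id :: String.t(), events :: [map()]) :: :ok | {:error, term()}
  @callback read(stream_id :: String.t()) :: {:ok, [map()]} | {:error, term()}
end

defmodule MyApp.FileEventStore do
  @behaviour MyApp.EventStore

  # Naive: one file per stream, rewritten in full on every append.
  @impl true
  def append(stream_id, events) do
    File.mkdir_p!("streams")
    {:ok, existing} = read(stream_id)
    File.write(path(stream_id), :erlang.term_to_binary(existing ++ events))
  end

  @impl true
  def read(stream_id) do
    case File.read(path(stream_id)) do
      {:ok, bin} -> {:ok, :erlang.binary_to_term(bin)}
      {:error, :enoent} -> {:ok, []}
      error -> error
    end
  end

  defp path(stream_id), do: Path.join("streams", stream_id)
end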

1 Like

This would be pretty much my approach for the memory image too. You can always just query the aggregate that’s already sitting in a GenServer. Where my implementation would require CQRS is where some queries (over a multitude of aggregates, for example) would be difficult or at the very least slow. In that instance I’d create other aggregates that handled these queries and were optimised for them.

How does replaying events work in your system? Do you just replay all the events and create the aggregate rows as you go? I found that thinking about the problem from the perspective of replaying events rather than direct commands => events highlighted a lot of shortcomings in my design. The main one was that picking up events from various files was more complex/less reliable than just reading off a list of events from a DB table.
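
For what it’s worth, replay in my head looks like the sketch below: the same apply function handles both a full rebuild and new events, and the rebuilt state sits in a GenServer so it can be queried directly (the aggregate and event shapes are invented).

defmodule BankAccount do
  use GenServer

  defstruct balance: 0

  def start_link(events), do: GenServer.start_link(__MODULE__, events)
  def balance(pid), do: GenServer.call(pid, :balance)

  @impl true
  def init(events) do
    # Replay: fold the whole history through the same apply_event/2
    # that handles live events.
    {:ok, Enum.reduce(events, %__MODULE__{}, &apply_event/2)}
  end

  @impl true
  def handle_call(:balance, _from, state), do: {:reply, state.balance, state}

  defp apply_event(%{type: :deposited, amount: a}, acc), do: %{acc | balance: acc.balance + a}
  defp apply_event(%{type: :withdrawn, amount: a}, acc), do: %{acc | balance: acc.balance - a}
  defp apply_event(_unknown, acc), do: acc
end

# {:ok, pid} = BankAccount.start_link([%{type: :deposited, amount: 10}])
# BankAccount.balance(pid) #=> 10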

That’d be great! I’ve had a look through the source code before (and again this weekend) and it’s clear, but a lot of the “why” gets lost, which I think is where a lot of the value comes from and where docs are really helpful! (At least for me.)

That would give you the ability to replay “events” and rebuild the state of the application, but I think you’d be missing one of the biggest advantages of ES - a domain specific language for the changes in your system…

… and it’s that domain specific language that tends to make a lot of the code duplicated. There doesn’t need to be a conceptual difference between commands and events, but they are quite useful to ensure that events can always be applied (for the purpose of replays) and that commands can sometimes be rejected with helpful messages. I’ve found that whenever I try to drop the commands to simplify the system, I actually end up Command Sourcing rather than Event Sourcing. I map the intent (the command) rather than the outcome (the event) and rely on my application to consistently end up in the same state despite commands having the potential to be rejected. Removing the duplication by removing commands and sticking to just events tends to create a one-directional system in my experience, where a user’s actions are just sent into the ether and may or may not result in an event. A defined command provides a feedback mechanism.
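
A rough sketch of that split (names invented): the command handler is allowed to say no, which gives the caller feedback; the event applier never is, so replays can’t fail.

defmodule Account do
  # Commands may be rejected with a reason the caller can act on.
  def execute(%{balance: balance}, {:withdraw, amount}) when amount > balance,
    do: {:error, :insufficient_funds}

  def execute(_state, {:withdraw, amount}),
    do: {:ok, %{type: :withdrawn, amount: amount}}

  # Events are facts: applying them must always succeed, or replay breaks.
  def apply_event(state, %{type: :withdrawn, amount: amount}),
    do: %{state | balance: state.balance - amount}
end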

As for duplicate models between aggregates and projections, that’s the CQRS bit coming in. As Ben pointed out above, if you can get away without additional read models and just directly query the aggregate you can drop the additional complexity and duplication that projections bring, but you also drop the ability to have tailored read models.

I think a degree of duplication, or at least similarity, throughout a domain is always going to be required for this structure. That’s where the flexibility and granularity come from, and in my experiments the benefits start to diminish if this duplication is too ardently avoided.

This is something I haven’t struggled with yet, but I haven’t been running ES systems for long and went to great pains to ensure that any changes to my domain/models were additive, in that new events had extra information that old ones didn’t and that info was just inferred for the old events.

I hadn’t considered sqlite because my head was so firmly focussed on the BEAM VM. That could be a good alternative, albeit without some of the advantages (keeping data as terms).

I had read that page, but hadn’t seen the video. I’ll have a watch!

That’s pretty much my experience too. I had wanted to go with a very simple and “pure” implementation using just File and term_to_binary but it very quickly becomes apparent that…

…these problems are real and need addressing. I hadn’t considered them until quite far into my first version.

This I thought could be quite straightforward as it is effectively just like backing up any other files on your computer/server.

Again, I hadn’t considered this, but the “what-ifs” started creeping in as I was working on it.

I was going just for single node for the time being. I wanted to try this approach as a simpler, smaller scale alternative so multi-node wasn’t a concern.

I had hoped that IO performance wouldn’t be an issue as I could rely on eventual consistency and my in-memory data to provide all the speed I’d need. Of course, complete replays would be expensive and slow, but hopefully that would be a rare occurrence.

I’ve been working to that behaviour so far…

…but as you say, it’s been the subscriptions that’s the tricky part!

I’ll keep cracking on with it, but instead of building out my little side-project/business with the file and memory based ES, I’ll use Commanded with the aim to switch from one to the other if I can maintain a consistent API between them.

Thanks for all the feedback guys! If you have any more thoughts, please keep adding them. This is very much an ongoing thing.

3 Likes

Our team at Inflowmatix built our ES/CQRS into our platform using a combination of meta-programming, protocols and functional Elixir. We originally ran our platform’s domain store (behind a protocol to allow switching) on Mnesia for around 18 months in production. We used term_to_binary along with compression and made heavy use of snapshots of our Aggregate state to reduce Aggregate startup costs/times. Eventually though, as our most active Aggregates started to have a long history and frequent snapshots, the table fragmentation in Mnesia was getting heavy. Additionally, the table-level locking model for certain transactions caused frequent rollbacks and retries with lots of conflict resolution, which eventually converges, but under heavy load can cause upstream timeouts on GenServers, especially where a call was needed to assure the transaction completed.

Eventually, after many small repairs and refactors, we switched our store to PostgreSQL and were able to use our protocol to write a splitter module that wrote to both stores for a while (and read from PostgreSQL) until we had migrated the historic data in the background to PostgreSQL, then switched the splitter’s behaviour to use only PostgreSQL.

Just some notes really to perhaps bear in mind.

7 Likes

It’s really great to hear what a similar system was like in production!

I had considered using Mnesia, as it meets the criteria of being on the BEAM and would provide a lot of the extra control around saving to disk that I would otherwise have to write.

Transactions have become more of a requirement/issue the more I’ve looked into the concept. Guaranteeing that events are written to disk fully means that the entire event store effectively becomes sequential and blocking — unless I’m misunderstanding the fundamental requirements…

I’d very much like to make the file-based event store work, but it’s increasingly looking like PostgreSQL is a far more pragmatic choice.

Seeing as I’m looking at using files for persistence and GenServers for the memory image, perhaps all I’m looking for is a GenServer that can be recovered/restored/rebuilt from file. Something like PersistentGenServer, Mnemonix, or even CacheX? It wouldn’t be ES as commonly practised, more an actor model with some events stuffed into it. I’d have to do some thinking on whether this would suit my requirements.
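
The bare-bones version of what I mean (nothing to do with those libraries, just the idea, with invented names and paths): write the whole state term out on every change and reload it in init/1.

defmodule FileBackedServer do
  use GenServer

  def start_link(path), do: GenServer.start_link(__MODULE__, path)
  def put(pid, key, value), do: GenServer.call(pid, {:put, key, value})
  def get(pid, key), do: GenServer.call(pid, {:get, key})

  @impl true
  def init(path) do
    state =
      case File.read(path) do
        {:ok, bin} -> :erlang.binary_to_term(bin)
        {:error, :enoent} -> %{}
      end

    {:ok, {path, state}}
  end

  @impl true
  def handle_call({:put, key, value}, _from, {path, state}) do
    state = Map.put(state, key, value)
    # Persist the full state on every change; crude, but restartable.
    File.write!(path, :erlang.term_to_binary(state))
    {:reply, :ok, {path, state}}
  end

  def handle_call({:get, key}, _from, {path, state}),
    do: {:reply, Map.get(state, key), {path, state}}
end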

1 Like

I’d be interested to hear more about this, especially as I’m not far from you in Winchester. We should meet for a coffee to chat solutions. Did you consider using Commanded & EventStore before building your own solution?

The main reason, I think, was that we built it in 2015, before you started your work.

Sure, it would be good to meet up if you fancy coming over to the science park.

Mike

I’m still chipping away at this whenever I get a moment, and thought that I’d provide a quick update on where I’m at.

I tried using many different options for storing events, all using a shared interface so that I could quickly switch between them. I’ve found some wonderful little features of both Elixir and Erlang and some cool libraries to help me along the way — it really has been a brilliant learning experience!

So far I’ve tried:

  • Simple term_to_binary and File.open(x, [:binary, :append]) (see the sketch after this list)
  • DETS
  • Erlang’s disk_log
  • Erlang’s file:consult for reading events back and a custom io_lib:format function to write them
  • CubDB (using min_key and max_key for event number ranges was brilliant)
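
A sketch of the first option as I’d frame it now, with one added assumption on my part: a 4-byte length prefix per record, because bare term_to_binary blobs appended to a file give you no record boundaries to read back from.

defmodule EventLog do
  def append(path, event) do
    bin = :erlang.term_to_binary(event)
    # Length-prefix each record so read_all/1 can find the boundaries.
    File.write!(path, <<byte_size(bin)::32, bin::binary>>, [:append])
  end

  def read_all(path) do
    path |> File.read!() |> decode([])
  end

  defp decode(<<size::32, bin::binary-size(size), rest::binary>>, acc),
    do: decode(rest, [:erlang.binary_to_term(bin) | acc])

  defp decode(<<>>, acc), do: Enum.reverse(acc)
end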

All had positives and negatives, with disk_log probably coming out on top. It has a load of features that you really need for working with files already baked in and thoroughly tested. The deal-breaker though was that it’s really hard to get events back out by their event number, as disk_log really doesn’t work with keys; it literally just appends a given term to the end of the log.

Getting the events out required bringing all events into memory (ignoring disk_log’s wonderful chunking feature) and filtering out events older than those we were interested in. This isn’t the worst thing in the world, as I was planning on keeping logs partitioned by aggregate, meaning they are unlikely to ever become so long that parsing the whole log becomes an issue. It’s incredibly fast anyway: loading 100_000 average-sized events takes approximately 75ms.
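
As a rough illustration of that scan (assuming each logged term is a map with a :number field, which is my shape rather than anything disk_log requires):

defmodule DiskLogScan do
  # Walk the log in chunks and keep only events at or after from_number.
  def events_from(log, from_number) do
    log
    |> read_chunks(:start)
    |> Enum.filter(&(&1.number >= from_number))
  end

  defp read_chunks(log, cont) do
    case :disk_log.chunk(log, cont) do
      :eof -> []
      {next, terms} -> terms ++ read_chunks(log, next)
    end
  end
end

# {:ok, log} = :disk_log.open(name: :events, file: ~c"events.LOG")
# :ok = :disk_log.log(log, %{number: 1, type: :created})
# DiskLogScan.events_from(log, 1)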

My concern was that I could end up with a compromised method for reading and writing the literal heart of this system.

The next phase in this experiment was yet more learning… Martin Kleppmann’s “Designing Data-Intensive Applications”. I can honestly say that I learnt more reading this in a week than I did over the previous 9 months piecing together articles and documentation online. His closing conclusion of “turning the database inside out” is very in line with what I’m trying to accomplish here, although I definitely think he was talking at a much larger scale than I’m working at! :joy:

Importantly, the book highlighted many issues that I was unwittingly creating in other parts of my system. I was neglecting a total order by writing all of my events in partitions based on aggregates. It wasn’t a deal-breaker though, as I could accept partial order as a trade-off in this instance because causality isn’t an issue in my domain. However, these partitions did make my projections quite complex due to needing to track their event offsets (using consumer offsets rather than acks) for many, many, many streams. Not a blocker, but one complication that I decided, for now at least, to avoid by opting for a single unified log with total order.
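
With a single totally ordered log, that bookkeeping shrinks to one number per projection, something like this (the event shape and projection here are invented):

defmodule Projection do
  defstruct last_seen: 0, state: %{}

  # Already-seen events are skipped, so replays and redeliveries are harmless.
  def handle(%__MODULE__{last_seen: seen} = p, %{number: n}) when n <= seen, do: p

  def handle(p, %{number: n} = event) do
    %{p | last_seen: n, state: project(p.state, event)}
  end

  defp project(state, %{type: :user_registered, name: name}), do: Map.put(state, name, :registered)
  defp project(state, _event), do: state
end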

I’m using a very simple writer and index pattern from the book and at the moment it’s working very well. Sending a command to an aggregate, having it validated, events created, persisted and applied to the aggregate is taking on average around 150 microseconds. I still have to handle log rotation and snapshots, but these should be quite simple as I’m modelling my store as a series of GenStage stages (yet another thing I’ve learnt since starting this).
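
My reading of the writer + index idea, very roughly (a sketch of my interpretation rather than the book’s design; the framing and names are assumptions, and rebuilding the index from an existing file on restart is left out):

defmodule LogWriter do
  use GenServer

  def start_link(path), do: GenServer.start_link(__MODULE__, path, name: __MODULE__)
  def append(event), do: GenServer.call(__MODULE__, {:append, event})
  def read(number), do: GenServer.call(__MODULE__, {:read, number})

  @impl true
  def init(path) do
    {:ok, io} = :file.open(path, [:read, :write, :raw, :binary])
    {:ok, %{io: io, index: %{}, next: 1, offset: 0}}
  end

  @impl true
  def handle_call({:append, event}, _from, s) do
    bin = :erlang.term_to_binary(event)
    frame = <<byte_size(bin)::32, bin::binary>>
    :ok = :file.pwrite(s.io, s.offset, frame)
    :ok = :file.sync(s.io)

    # The index maps event number to byte offset so reads can seek directly.
    index = Map.put(s.index, s.next, s.offset)
    state = %{s | index: index, next: s.next + 1, offset: s.offset + byte_size(frame)}
    {:reply, {:ok, s.next}, state}
  end

  def handle_call({:read, number}, _from, s) do
    offset = Map.fetch!(s.index, number)
    {:ok, <<size::32>>} = :file.pread(s.io, offset, 4)
    {:ok, bin} = :file.pread(s.io, offset + 4, size)
    {:reply, {:ok, :erlang.binary_to_term(bin)}, s}
  end
end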

I’ve decided in many cases to opt for purposely naive approaches - whereas before they were just naive :wink: - in order to keep my system simple. My tests have shown that I’ve got a good amount of performance margin with which to make compromises such as the single log, which is a potential bottleneck but hugely simplifies the entire system. Equally, a single event stream and a local-only Registry make it a lot easier to work with and understand. I figure that it’s easier to make a system that already works well faster than it is to fix a fast system that’s yet to work.

I have had crises of confidence along the way (“Why aren’t you just using Commanded and PostgreSQL like a sensible person?”, “You’re totally out of your depth here!”, “Why even bother with event sourcing, CRUD could work here after all”) but overall I’m very happy with the progress I’ve made and the things I’ve learnt. I think I’m probably a little way off being able to build anything complex with it yet, but I’m enjoying the process.

6 Likes

Sounds like you’re having fun. Martin’s book is a fantastic reference for building these types of systems.

Once you have a working single-node system you can experiment with making it distributed for scalability and fault-tolerance. It looks like disk_log supports distribution, but it’s likely not useful here because it doesn’t guarantee writes go to all nodes:

A collection of open disk logs with the same name running on different nodes is said to be a distributed disk log if requests made to any of the logs are automatically made to the other logs as well.

It is not guaranteed that all log files of a distributed disk log contain the same log items. No attempt is made to synchronize the contents of the files.

Erlang -- disk_log

1 Like

It’s been so much fun, albeit frustrating and head-scratching at times. As you say, Martin’s book almost feels written with this type of system as the ultimate goal.

It’s so tempting to build on top of disk_log because it’s already there and does so much of what I want to achieve, but the hurdles I think are too great. The querying of events (or lack thereof) will become a real issue with a single event stream, and as you point out, the built in distribution isn’t all that helpful in this case.

Distribution and replication whilst maintaining consensus on event ordering is where I’ve spent a lot of time re-reading sections of the book. Unfortunately it looks like single leader replication of the log is the only really viable way to approach the problem, and therefore not hugely useful as I’ve still got a single node as a bottleneck.

The only way to distribute the log (either locally or across machines) would be to go the same route as Kafka and break my event streams into topics and partitions (as I was originally doing) which brings a lot of additional complexity with it, both in persisting events and making them available to the rest of the application.

As my use case is smaller, simpler applications I think that restricting myself to a single node (at least for the event store) is not unreasonable. I’m opening a can of worms and losing sight of the objective by trying to solve problems with distributed logs and total ordering, so for now I’m keeping my solution as simple as possible.

As both file- and database-backed logs run into the same restrictions around distribution, the only real question that remains is which is better in terms of scale and throughput on a single node. I suspect that’ll be a DB like Postgres, unless avoiding the encoding/decoding of terms starts to play a larger role in performance. We’ll see…

1 Like

If you make the event store single node then you could expose an API that other parts of your application can use. Treat it like a database that as many nodes as required can connect to. Event Store exposes a “native interface of AtomPub over HTTP” as an example.
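
On the BEAM the cheapest version of that might just be distributed Erlang rather than HTTP: clients on other nodes call the store node by name. The node and process names below are assumptions.

defmodule StoreClient do
  @store_node :"store@127.0.0.1"
  @store_name EventStore.Writer

  def append(stream_id, events) do
    GenServer.call({@store_name, @store_node}, {:append, stream_id, events})
  end

  def read(stream_id) do
    GenServer.call({@store_name, @store_node}, {:read, stream_id})
  end
end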

Thanks for the fantastic thread. A week ago I stumbled upon the event sourcing concept and I’m learning quite a bit thanks to libraries like Fable and EventStore. But obviously I probably still don’t get the nuances of where, when and how to use this pattern correctly.

What I don’t necessarily like with the approach is that my logic is somehow also bound to event history; if something changes in the way data is being stored I now need to either:

  • Keep the old and new data-structures as they were stored and just add the necessary application logic to handle multiple representations.
  • Migrate stored data and reducers as necessary to keep the system tidy.

The Fable example from @benwilson512 gave me a cue:

Would storing the event along with its transformation be too crazy an idea?
Let’s say these are the events I want to persist (this is just general Elixir, no framework or library attached):

%Event{
  id: "123-abc",
  type: "Inserted",
  handler: :erlang.term_to_binary(fn x -> x end),
  number: 1
}

%Event{
  id: "123-abc",
  type: "Incremented",
  handler: :erlang.term_to_binary(fn x -> x + 1 end),
  number: 2
}

We could then retrieve the desired events and apply the stored transformations to any data as a starting point. Taking it a little further:

%Event{
  id: "1a2-b3c",
  type: "Incremented",
  handlers: %{
    up: :erlang.term_to_binary(fn x -> x + 1 end),
    down: :erlang.term_to_binary(fn x -> x - 1 end)
  },
  number: 108
}

We could store handlers for both transforming data further or doing a rollback (the Sage - Sagas project docs gave me this idea). This would let us “advance” or “roll back” data in any state to another point in time.
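
Applying those stored handlers would presumably look something like the sketch below (written against the %Event{} shape above). One caveat worth knowing up front: a serialised anonymous function is tied to the exact version of the module that created it, so recompiling or upgrading that module can make old handlers fail to deserialise or call.

defmodule Replay do
  def advance(state, events) do
    Enum.reduce(events, state, fn event, acc ->
      handler = :erlang.binary_to_term(event.handlers.up)
      handler.(acc)
    end)
  end

  def rollback(state, events) do
    events
    |> Enum.reverse()
    |> Enum.reduce(state, fn event, acc ->
      handler = :erlang.binary_to_term(event.handlers.down)
      handler.(acc)
    end)
  end
end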

Aside from security or possible changes on how terms are stored… Would something like this be at odds with EventSourcing? Is there a library that already does this? I’m pretty sure there’s a lot I’m missing :anguished:

Hey, I’m glad that the thread has been useful. Event sourcing is a bit of a bear to get your head around initially, and for a “simple” pattern, there’s a lot of complexity hidden in implementing it well. I’ll confess that I’m still not there myself and will keep using Commanded whilst slowly tapping away at this idea.

You’re very right that storing events leaves you bound to your past mistakes or oversights. That’s pretty much the biggest cost of the pattern: if you screw up you either have to live with it, or spend a lot of time managing previous versions of your events or migrating them to the new structure. There are some good links floating around the forum (just search “eventsourcing”) that dive into this issue further.

For this reason, it’s often recommended to use ES in a well understood domain with a lot of time spent up front working out the data your events need. It’s not a tool to pull off the shelf when you’re still finding your way through the problem.

For what it’s worth I handle the event versioning issue by being quite overzealous in the data that each event captures. You can be quite objective with event data as it’s “something that happened”, so there’s only so many ways that you can interpret it. Make sure you store everything that may be relevant to the action and you should be OK. I’ve only had one instance in the last year where I needed a new version of an event, and it was because the original was missing a data point I didn’t think was required at the time. Fortunately it was easy to add in and provide my handlers with a fallback value where older events were missing it.
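
The fallback ends up looking like a small upcast step applied when events are loaded; the event type, field and default below are invented.

defmodule Upcast do
  # Old events that predate the new field get a default when loaded,
  # so handlers only ever see one shape.
  def upcast(%{type: :order_placed} = event), do: Map.put_new(event, :currency, "GBP")
  def upcast(event), do: event
end

# loaded_events |> Enum.map(&Upcast.upcast/1)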

One thing that often gets overlooked, and which tripped me up, was the boundaries of my aggregates. Perhaps it’s because I don’t have a DDD background, but where my aggregates sat in relation to my projectors caused me some grief early on.

As for storing the function within the event: personally I’m not sure it would work. It’s fine for mapping the behaviour of that event within the aggregate, but it doesn’t allow for the potentially dozens of ways the event will be used by any number of handlers. Not to mention you lose what is wonderful about ES: data separated from logic, and therefore the ability to rewrite the handlers and operate on old data as your requirements change. It’s a nice idea and I do think that “logic as data” has its place, I just wouldn’t use it here.

4 Likes

Yes, I’m still trying to understand the real meaning of the DDD lingo (boundaries, aggregates, projectors). I’m guessing that eventually a book (or experience) will have to do.

Yeah, my approach binds an event to a “single interpretation”, but I guess I have this irrational feeling that without my application logic a series of events by itself would be meaningless. Then again, the same could be said of any sufficiently generic abstraction.

Thank you so much for the well put thoughts :slightly_smiling_face: the picture is getting clearer bit by bit.

1 Like

I’m thinking that disabling the Linux write cache could help here:

Not all systems belong to the same “turn on write-back caching” recommendation group, as write-back caching carries a risk of data loss in events such as power failure. In the event of a power failure, data residing in the hard drive’s cache does not get a chance to be stored and is lost. This fact is especially important for database systems. In order to disable write-back caching, set write-caching to 0:

# hdparm -W0 /dev/sda

/dev/sda:
setting drive write-caching to 0 (off)
write-caching =  0 (off)

Reading the Kafka design decisions about going with a filesystem approach while taking advantage of all the low-level stuff that Linux has to offer may help you in making some good decisions.

1 Like

Thanks for the info and links! I’ve tabled this project whilst I work on other things — bills to pay and all that… — but the idea still keeps nagging at me. I had intended to look at the guts of systems like Kafka and EventStore properly when I revisit this.

Do you have any insight on how the Erlang VM might impact this approach? When looking at this before, it seemed that access to the low-level OS stuff is more limited from the VM than with other systems, but that may just be my unfamiliarity with its guts.

This approach of using files to persist the data needs to take into consideration that the Erlang library disk_log by default has a delay of 2 seconds or 64 KB before writing to the disk, as I say here:

This setting can be tuned, but care needs to be taken to find the right balance in terms of disk IO performance, otherwise you may create a bottleneck when writing to the disk.
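
If you do go down the disk_log route you can also force a flush after appends you can’t afford to lose, trading throughput for durability; and with plain files the buffering is an option you set yourself. A sketch (the log name and file names are assumptions):

{:ok, log} = :disk_log.open(name: :events, file: ~c"events.LOG")
:ok = :disk_log.log(log, %{number: 1, type: :created})
# Make sure the logged items are written out to disk now rather than
# sitting in a buffer.
:ok = :disk_log.sync(log)

# For the plain-File approach the equivalent buffering is explicit and tunable:
{:ok, io} = :file.open("events.bin", [:append, :raw, :binary, {:delayed_write, 64 * 1024, 2_000}])
:ok = :file.write(io, :erlang.term_to_binary(%{number: 2, type: :updated}))
# Closing (or :file.sync/1) flushes whatever is still buffered.
:ok = :file.close(io)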

I am not that familiar with it, but @dimitarvp may have something to say here.

From what I remember when reading the Kafka design decisions you don’t need to interact with the OS from the BEAM, you just need to tune the machine for disk IO as they recommend.

Did you get back to this at some point?

I hadn’t (I’ve actually changed careers and work as a photographer now), but funnily enough I started poking around my terminal again today to see what’s new in Elixir, as I’ve a few ideas I want to try (on my own time). Event sourcing is still one of those things, but I’ll need to dig into it again and see whether this approach has merit versus a more conventional ES setup using Postgres.

1 Like