Thank you very much for your replies!
You are both very astute in pointing out that I should take a look at the libraries that already exist. I took some time to do so yesterday and today (as much as my travels during vacation allowed). My quick glances were by no means comprehensive, but I do have some more questions to ask.
And besides that, I do want to finish my original train of thought, now that I have a few minutes to spare again.
As for my own work with Event Sourcing so far: I have worked with it directly in two different projects, but both contained a simple event sourcing log that someone else had prepared. I do, however, work daily with the Ethereum blockchain (amongst other blockchains and distributed ledgers), whose mechanics can be seen as an extremely strict Event Sourcing engine, including the significant drawbacks that inevitably follow from that strictness.
Thought snippets:
- It would be wonderful to write an event store in such a way that it works in a distributed fashion. Most of the linked libraries store the events in some external form of persistence (why store a list of events in a relational database?). It might be interesting to work within OTP itself, using tools like the Riak Core library, Mnesia, Swarm, or `GenServer.multi_call` directly (see the first sketch after this list).
- When working with an event store, strong consistency is absolutely required when storing events, because for nearly all (combinations of) events, order matters: in the general case, it is impossible to successfully interleave two event timelines when healing a netsplit.
- However, for certain projections and/or event handlers, this could be relaxed somewhat, when a (combination of) event(s) is (see the second sketch after this list):
  - Idempotent: applying the event’s effects once or multiple times has the same result. Examples: setters, any operation that externally keeps track of whether it already happened.
  - Associative: the grouping in which the ordered list of events is applied does not matter (left-to-right, right-to-left or divide-and-conquer). This can mostly be used to create ‘early return’ versions of aggregate creation functions. Examples: setters, addition, multiplication, list concatenation.
  - Commutative: the ordering of events does not matter for the result. Examples: addition, multiplication, maximum, minimum.
- Running through the events and filtering them to pass them on to the different event handlers/projections could be done using Flow (third sketch after this list). Creating a ‘strongly consistent’ store for the events first might be implemented as one GenStage stage.
- Having one event stream that is filtered for the different handlers/aggregates and having many different event streams that are later combined are equivalent representations, allowing us to pick whichever one is more performant.
- Most data in our models can be seen/modeled as nested hashmaps (where some of the levels/‘branches’ might not care about the keys and/or the ordering of elements, i.e. being respectively lists and sets). Most events could contain hierarchical references (‘paths’) into this structure as well, which would make them easy to match on inside handlers (fourth sketch after this list).
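To make the first point a bit more concrete, here is a minimal sketch (the module name and state shape are made up by me) of keeping events inside OTP via `GenServer.multi_call/4`. To be clear: broadcasting appends like this is not a consensus protocol and does not survive a netsplit; it only illustrates the direction I mean.

```elixir
defmodule DistributedEventStore do
  use GenServer

  # Hypothetical: every node in the cluster runs one locally-registered
  # copy of this store, each keeping its own copy of the event log.
  def start_link(_opts), do: GenServer.start_link(__MODULE__, [], name: __MODULE__)

  # Broadcast an append to all reachable nodes. `bad_nodes` lists the nodes
  # that did not acknowledge, which is exactly where the strong-consistency
  # problem from the second point begins.
  def append(event) do
    {_replies, bad_nodes} =
      GenServer.multi_call([node() | Node.list()], __MODULE__, {:append, event})

    case bad_nodes do
      [] -> :ok
      nodes -> {:partial, nodes}
    end
  end

  def all_events, do: GenServer.call(__MODULE__, :all)

  @impl true
  def init(events), do: {:ok, events}

  @impl true
  def handle_call({:append, event}, _from, events), do: {:reply, :ok, [event | events]}
  def handle_call(:all, _from, events), do: {:reply, Enum.reverse(events), events}
end
```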
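For the relaxed-ordering point, a small sketch (all names hypothetical) of a projection whose fold only uses commutative operations (addition, maximum) and tracks unique event ids for idempotence, so replaying a duplicated or re-ordered stream converges to the same state:

```elixir
defmodule OrderInsensitiveProjection do
  defstruct seen: MapSet.new(), total: 0, high_water: 0

  def apply_event(%__MODULE__{seen: seen} = state, %{id: id} = event) do
    if MapSet.member?(seen, id) do
      # Idempotent: a duplicated delivery of the same event is a no-op.
      state
    else
      %__MODULE__{
        seen: MapSet.put(seen, id),
        # Addition and max are commutative, so event order does not matter.
        total: state.total + event.amount,
        high_water: max(state.high_water, event.amount)
      }
    end
  end
end

# Any ordering/duplication of the same set of events yields the same state:
events = [%{id: 2, amount: 5}, %{id: 1, amount: 3}, %{id: 2, amount: 5}]

Enum.reduce(events, %OrderInsensitiveProjection{}, &OrderInsensitiveProjection.apply_event(&2, &1))
#=> total: 8, high_water: 5
```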
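For the Flow point, a minimal sketch assuming the `flow` package and an event shape I made up: one incoming stream is partitioned per aggregate, and each partition folds its own events into a projection.

```elixir
events = [
  %{stream: "account-1", type: :deposited, amount: 10},
  %{stream: "account-2", type: :withdrawn, amount: 4},
  %{stream: "account-1", type: :withdrawn, amount: 2}
]

events
|> Flow.from_enumerable()
# All events of one stream hash to the same partition.
|> Flow.partition(key: {:key, :stream})
|> Flow.reduce(fn -> %{} end, fn event, balances ->
  delta =
    case event.type do
      :deposited -> event.amount
      :withdrawn -> -event.amount
    end

  Map.update(balances, event.stream, delta, &(&1 + delta))
end)
|> Enum.to_list()
#=> [{"account-1", 8}, {"account-2", -4}] (order across partitions may vary)
```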
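And the nested-hashmap/path idea in a nutshell; the event shape is hypothetical, but `put_in/3` and `update_in/3` already accept exactly such key paths:

```elixir
state = %{accounts: %{"alice" => %{balance: 0, settings: %{theme: "dark"}}}}

# Hypothetical event shape: each event carries a path into the nested
# state, which handlers can match on directly.
apply_event = fn
  state, %{type: :set, path: path, value: value} -> put_in(state, path, value)
  state, %{type: :increment, path: path, value: value} -> update_in(state, path, &(&1 + value))
end

state
|> apply_event.(%{type: :set, path: [:accounts, "alice", :balance], value: 40})
|> apply_event.(%{type: :increment, path: [:accounts, "alice", :balance], value: 2})
#=> %{accounts: %{"alice" => %{balance: 42, settings: %{theme: "dark"}}}}
```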
I do not currently understand the need for a separate library that handles the creation of ‘commands’. Are commands not basically (once validated) events to be stored as well? Why is extra routing for this necessary?
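To make my current mental model explicit (all names below are made up by me; I am not claiming this is Commanded’s API): a command would simply be a validated intent that, once accepted, turns directly into stored events, with no separate routing layer in between.

```elixir
defmodule Withdraw do
  # A command: an *intent* that may still be rejected.
  defstruct [:account_id, :amount]
end

defmodule MoneyWithdrawn do
  # An event: an immutable *fact*, only produced once validation succeeded.
  defstruct [:account_id, :amount]
end

defmodule Account do
  defstruct balance: 0

  # Deciding: current state + command -> events, or a rejection.
  def execute(%Account{balance: balance}, %Withdraw{amount: amount}) when amount > balance do
    {:error, :insufficient_funds}
  end

  def execute(%Account{}, %Withdraw{account_id: id, amount: amount}) do
    {:ok, [%MoneyWithdrawn{account_id: id, amount: amount}]}
  end

  # Applying: state + event -> new state (must never fail during replay).
  def apply_event(%Account{balance: balance} = account, %MoneyWithdrawn{amount: amount}) do
    %Account{account | balance: balance - amount}
  end
end
```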
The same goes for Process Managers: how are these not just aggregates that create certain new events based on an observed constellation of events? I believe they can just be programmed as a Finite State Machine (possibly in a Reactive way) and are in essence pure, i.e. they do not necessarily need their own dedicated long-running process.
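Again as a sketch of what I mean (hypothetical names): a process manager as a pure fold over observed events, emitting follow-up events, with no dedicated process of its own.

```elixir
defmodule OrderFulfilment do
  # A process manager as a pure FSM: step/2 maps the current state plus an
  # observed event to a new state and a list of events to emit.
  def init, do: :awaiting_payment

  def step(:awaiting_payment, {:payment_received, order_id}),
    do: {:awaiting_shipment, [{:request_shipment, order_id}]}

  def step(:awaiting_shipment, {:shipment_dispatched, order_id}),
    do: {:done, [{:notify_customer, order_id}]}

  # Unrelated events leave the state untouched and emit nothing.
  def step(state, _event), do: {state, []}
end

# Folding a stream of observed events through the pure FSM:
events = [{:payment_received, 42}, {:shipment_dispatched, 42}]

Enum.reduce(events, {OrderFulfilment.init(), []}, fn event, {state, out} ->
  {state, emitted} = OrderFulfilment.step(state, event)
  {state, out ++ emitted}
end)
#=> {:done, [request_shipment: 42, notify_customer: 42]}
```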
I hope you can shed some light on this, @slashdotdash. I would love to understand the rationale behind Commanded better.