Event Sourcing with CQRS is a technique for building applications around an immutable log of events, which makes it a good fit for concurrent, distributed systems.
Though it is gaining popularity, the options for storing these events are limited and require specialized services like Kurrent (aka Greg’s EventStore) or AxonIQ.
One of the strong points of the BEAM is that it comes ‘batteries included’: there are BEAM-native libraries for many common tasks, like storage, pub/sub, caching, logging, telemetry, etc.
ExESDB is an attempt to create a BEAM-native Event Store, building further upon the Khepri library, which in turn builds upon the Ra library.
On the roadmap:
integration with pg2 and Phoenix.PubSub for side effects (i.e. read model projections)
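For a feel of what such a projection could look like once that integration lands, here is a minimal sketch assuming Phoenix.PubSub as the transport; the topic name, message shape and table name are made up for illustration:

```elixir
defmodule MyApp.ReadModelProjection do
  @moduledoc """
  Hypothetical read-model projection: subscribes to an event topic on
  Phoenix.PubSub and folds incoming events into an ETS table.
  """
  use GenServer

  @topic "exesdb:$all"   # made-up topic name
  @pubsub MyApp.PubSub   # assumed PubSub name from the host application

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    table = :ets.new(:read_model, [:named_table, :set, :public])
    :ok = Phoenix.PubSub.subscribe(@pubsub, @topic)
    {:ok, table}
  end

  @impl true
  def handle_info({:event, %{event_type: type, stream_id: stream, data: data}}, table) do
    # Project the event into the read model (the message shape is assumed).
    :ets.insert(table, {{stream, type}, data})
    {:noreply, table}
  end
end
```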
Interesting idea. I’m curious about plans for Commanded support.
So you plan to create an implementation for the Commanded.EventStore.Adapter behaviour, right? In that case it would be interesting to hear about the key differences between your library and the PostgreSQL-based Elixir EventStore / EventStoreDB.
I guess it would be something like an in-memory event store, but for production. That should give us fewer required environment dependencies, which is great. Are there any other pros, like speed or security?
What kind of serializer would be needed? Would it work with Elixir structs as-is, or would we still need to use a JSON serializer?
Suppose a project already uses another adapter. Would that require some extra steps for migrating the data?
Hi Tomasz,
Thank you for your feedback.
Next to the advantages you already mentioned (speed, security, no need for extra serialization: events are indeed stored as Erlang terms), I’d like to add the capability to deploy event-sourced services as a self-contained, BEAM-native release to the edge. This would allow us to leverage, for instance, the Nerves Project for deployment.
I am also looking into https://bondy.io for scenarios where you could have 10E+N nodes in the network.
Much of my past work revolved around decentralized and autonomous systems (think parking facilities, vehicles, agricultural automation, logistics, etc.), often in a “spotty” environment where nodes aren’t always connected. Such systems benefit little from SaaS solutions if the network is not available. That space could be considered my main motivator for building a BEAM-native event store: as few dependencies on third-party services as possible.
The reason for implementing the Commanded Adapter is simple: it is the de facto event sourcing standard for the BEAM and is, as far as I am concerned, feature complete.
When it comes to migrating data from existing stores, I’d argue that’s quite easy, barely an inconvenience: replay the old store and project into the new.
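As a very rough sketch of that replay-and-project approach (the OldStore and NewStore function names are placeholders for illustration, not real APIs):

```elixir
# Hypothetical one-off migration: read every event from the old store,
# stream by stream, and append it to the new one in the original order.
defmodule MigrationSketch do
  def run(stream_ids) do
    Enum.each(stream_ids, fn stream_id ->
      events = OldStore.read_stream_forward(stream_id)        # placeholder API
      :ok = NewStore.append(stream_id, :any_version, events)  # placeholder API
    end)
  end
end
```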
So, indeed, the highest points on the agenda are:
1. have Khepri triggers publish seen events on pg2 (for projections etc…)
2. Dynamic Clustering via Partisan
3. Commanded Adapter
4. Monitoring/Telemetry
…
This seems cool - love seeing more database projects in Elixir!
A couple of random questions from someone who knows very little about CQRS/ES:
Khepri, like Mnesia, is an in-memory database (which also persists to disk). If you’re storing an immutable log, would you eventually run out of memory? Or is the log truncated at some point?
Khepri, as I understand it, is a K/V store built on top of a Raft log. Since the thing you’re storing is, of course, a log, would it make more sense to use Ra directly?
I see you mentioned a large number of nodes. Is the idea here to have many individual Khepri clusters running independently within a large cluster?
What is Khepri’s throughput like? I would imagine you would get better results with aggressive batching.
Thank you for your input, those are some very valid points and concerns.
The main driver for this project is decentralization, and in such scenarios I’d imagine JIT availability and localized sharding function as a counterweight to throughput. For now, I don’t worry about this too much and focus on getting the store operational.
Most of the dedicated Event Stores (I know of) are centralized at the data center level and there is not much literature about decentralized event sourcing. It probably opens a whole different can of worms, but we need tooling to investigate it. ExESDB should be seen in this context.
As an example, imagine the scenario of a parking facility where vehicles enter and exit, people enter and exit, payments are made, etc. Next, imagine such a facility not being managed by a centralized system, but rather by a mesh of SBCs that perform individual parts of the process. Such a system might consist of a few hundred devices, and indeed, in that mesh there might be a number of individual realms that are responsible for parts of the process.
I will definitely follow your progress on this, sounds interesting.
Question:
If the assumption is to run this “at scale”, whatever that may be, what is the concept/approach for consistency boundaries at that scale?
That is, in EventStoreDB (Kurrent) and Commanded’s own Postgres EventStore, there is the “stream local” boundary with optimistic concurrency for the stream itself (the incrementing, gapless version number between events in a stream), but these events do get projected into some order in the $all stream.
Is the use of khepri and ra to make sure that the “individual stream” is consistent across the distributed cluster nodes?
And that also there will be an $all stream (or other projections) that also will provide some sort of consistent ordering across the cluster? Also using khepri and ra?
Given that Kurrent has an HA cluster solution, I’m assuming there is a known distributed-systems approach to merging all these small, distributed stream events into a single combined $all projection? If so, what is it? Got references for me to learn from? And how would ExESDB accomplish this task?
Thanks for experimenting with this project and for edifying me on this.
Hi Byu,
Thank you for your interest and your comment, and sorry for the belated reply.
To answer some of your questions:
The mechanism used for emitting events from the store relies on Khepri’s built-in triggering capabilities: when a node in a Khepri path is created, a ‘stored function’ can be activated, which emits the corresponding event if the path and payload satisfy certain filter conditions. However, this capability is only available on Ra leader nodes, so a Follow-the-Leader mechanism is implemented to transfer the Emitter subsystem to the new leader upon election.
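For context, this is roughly how Khepri’s stored-procedure/trigger mechanism is used (a simplified sketch based on Khepri’s documented API; the store name, paths, filter and emit logic are made up, and ExESDB’s actual emitter is more involved):

```elixir
store = :my_store

# 1. Store an anonymous function (a "stored procedure") at a Khepri path.
#    Khepri calls it with a map of properties describing the change.
sproc_path = [:procs, :emit_event]
:ok = :khepri.put(store, sproc_path, fn props ->
  # In ExESDB this is where the event would be forwarded to subscribers;
  # here we simply inspect the changed path and action.
  IO.inspect(props, label: "khepri trigger fired")
end)

# 2. Register a trigger: when a tree node matching the filter is created,
#    the stored procedure above is executed (on the Ra leader node).
filter = :khepri_evf.tree([:streams], %{on_actions: [:create]})
:ok = :khepri.register_trigger(store, :emit_on_create, filter, sproc_path)
```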
Since v0.0.15, ExESDB supports:
- transient subscriptions, in 3 flavors: :by_stream ($all or $stream_id), :by_event_type and :by_event_payload
- persistent subscriptions, for which only :by_stream has practical relevance, as far as I know
ExESDB.GatewayAPI
Using the Swarm library, a Gateway is implemented which routes requests to a random available node in an ExESDB cluster, thus achieving some load balancing and high availability.
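Conceptually it boils down to something like this (a simplified sketch; the Swarm group name and request shape are invented, not the actual ExESDB.GatewayAPI):

```elixir
defmodule GatewaySketch do
  # Hypothetical gateway-style routing: pick a random gateway worker that
  # registered itself in a Swarm group and forward the request to it.
  @group :ex_esdb_gateways   # made-up Swarm group name

  def call(request, timeout \\ 5_000) do
    case Swarm.members(@group) do
      [] ->
        {:error, :no_gateway_available}

      pids ->
        pids
        |> Enum.random()
        |> GenServer.call(request, timeout)
    end
  end
end
```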
Tests and experimentation so far show quite good results in terms of consistency and event ordering, even when performing chaos testing on the cluster (I must admit, to my slight surprise), which is a testament to the excellent work done by the giants on whose shoulders we stand.
I did create a little demo clip (sorry for the terrible audio; conditions are sub-optimal for now).
ExESDB v0.1.0 available!
I am proud to announce that the first useful release of ExESDB is now available as v0.1.0! This release comes with an adapter for the fantastic Commanded library AND a Phoenix LiveView demo app. Feel free to check it out!
It has taken me a few months to finish things up (but then, what is ‘finished’, right?), but meanwhile the distributed store is there, an HA proxy is there and…the Commanded adapter is there, too!
It is said that the proof of the pudding is in the eating, so the adapter is being developed and tested in conjunction with a working Phoenix LiveView application: an event-sourced demo app that mimics a dashboard for regulating greenhouses. All data (events) is stored in ExESDB, while Cachex is used for read models, thus creating a true BEAM-only application without the need for any external services like EventStoreDB or PostgreSQL.
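For the curious, the read-model side of such a setup can be as small as a Commanded event handler that writes into Cachex; a sketch along those lines (module and event names are invented, not taken from the demo app):

```elixir
defmodule Greenhouse.TemperatureReadModel do
  # Hypothetical projection: folds measurement events into a Cachex cache
  # that the LiveView dashboard reads from.
  use Commanded.Event.Handler,
    application: Greenhouse.App,   # assumed Commanded application module
    name: __MODULE__

  alias Greenhouse.Events.TemperatureMeasured   # invented event struct

  def handle(%TemperatureMeasured{greenhouse_id: id, celsius: celsius}, _metadata) do
    {:ok, true} = Cachex.put(:read_models, {:temperature, id}, celsius)
    :ok
  end
end
```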
I must confess, it feels a little liberating, being able to just spin up a few containers and not having to worry about connectivity with other backend components.
Also, since we remain in the same ecosystem, there is no need to worry about (or waste processing power on) serialization. Serialization will only become a topic if and when we decide to create APIs for clients in other languages.
wow!!! that is unbelievable news!! i recently started exploring the commanded library, and having a native eventstore now is amazing news! what are your plans for future releases?
dynamic store support: currently there is a one-store-per-cluster (1SPC) limitation, steered by config. In the philosophy of ExESDB, a store supports one business process/behavior (one aggregate type). If you are familiar with EventStoreDB, a ‘store’ in ExESDB corresponds to a ‘category’ in EventStoreDB. Though there is much to be said for 1SPC, N-stores-per-cluster (NSPC) offers more options for scenarios with budget constraints (fewer nodes), resource constraints (smaller hardware) and simplified inter-business-process/behavior communication (less complexity)
An ExESDB Admin API (REST+gRPC) in order to support:
an ExESDB Administration tool: not sure yet in what format…probably a CLI for scripting (must have!), a TUI (I am a terminal jockey) and/or an Admin Web UI/Dashboard with fancy metrics and such
Later:
A REST/gRPC API that is 100% compatible with EventStoreDB, so it could serve as a ‘drop-in’ replacement. Since ExESDB supports dynamic clustering and is resilient against node shutdowns, I figure it has an advantage over EventStoreDB for orchestrated scenarios (Kubernetes, for instance)
All this is of course just a wishlist, and it is a labor of love, so I won’t put a timeline on it
i will definitely test it with my commanded application. do you think it’s now at a level where i can switch from the postgresql store and start using it during development of my app? i’m pretty new to event sourcing, so i’m worried i might run into issues that will slow down my study of commanded/es?
Well, maybe experiment a little in a separate branch and save your current work, first.
Have a look at the ExESDB Commanded Adapter on GitHub; there is an /apps/regulate-greenhouse folder with a demo app that should get you started…in theory it should boil down to adding dependencies on {:ex_esdb, "~> 0.1.0"} and {:ex_esdb_commanded, "~> 0.1.0"} to your mix.exs.
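In mix.exs, that would look roughly like this (assuming both packages are published on Hex under those names; double-check the latest versions):

```elixir
defp deps do
  [
    {:ex_esdb, "~> 0.1.0"},
    {:ex_esdb_commanded, "~> 0.1.0"},
    # assuming your app already depends on Commanded itself
    {:commanded, "~> 1.4"}
  ]
end
```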
I couldn’t get that working with dependencies on {:ex_esdb, "~> 0.1.0"} and {:ex_esdb_commanded, "~> 0.1.0"} in my mix.exs.
The issue is that ex_esdb uses relative paths like ../deps/khepri/include/khepri.hrl in three locations in the source code. While I do have the khepri.hrl file in that path in my deps folder, ex_esdb isn’t finding it during compilation.
This affects anyone trying to use ex_esdb as a standard Mix dependency rather than vendoring it. Note that the sample greenhouse application also uses relative paths for its dependencies.
I’ve fixed the issue by changing these relative paths to standard include_lib format. The fix is minimal, maintains compatibility, and I’ve tested it successfully in development in our production-bound application that’s migrating from PostgreSQL to ex_esdb with Commanded.
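For anyone hitting the same compilation error, the change is along these lines (shown with a placeholder record name; ex_esdb extracts its own records, and Erlang sources would use -include_lib in the same way):

```elixir
# Before: a path relative to ex_esdb's own checkout, which does not resolve
# when ex_esdb is itself pulled in under deps/ of another project:
#
#   Record.extract(:some_khepri_record, from: "../deps/khepri/include/khepri.hrl")
#
# After: resolve the header through the :khepri application on the code path,
# the Elixir counterpart of Erlang's -include_lib:
#
#   Record.extract(:some_khepri_record, from_lib: "khepri/include/khepri.hrl")
```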
I’ve submitted a pull request with the fix to the original repository. In the meantime, you can use my working fork: {:ex_esdb, github: "iffies/ex-esdb", branch: "fix/khepri-include-paths", sparse: "system"}.
Oh wow, thank you so much for finding that. The PR is approved and merged!
Thanks also for testing it in a real-world setting, very important! Though a little word of caution: this project has only been going for a few months now, and claiming it to be “production ready” would be a little ambitious today. For that, it’d need a few more successful real-world scenarios.
The greenhouse sample application sources its dependencies via path right now, indeed, since it is primarily a development sandbox.