Yes, that’s accurate for CQRS. The “highest definition” data are the events themselves; that’s the most detailed view of what happened. The aggregate state (built from the event stream) is the “internal” state the business creates for itself so that it can decide which events should be created. So one starts with a command that holds information X; it is sent to an aggregate (“internal” state) that holds information Y, which then combines information X and Y to decide what events should (or should not) be appended to the event log. Usually it does not make sense to share that state with customers, as its intent is for the system to reason about what to do next, and it lives in memory.
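To make the X/Y split concrete, here’s a minimal Python sketch under made-up names (WithdrawMoney, AccountAggregate, etc.), not any particular framework’s API: the aggregate is folded from the stream in memory, and the decision only ever produces new events.

```python
from dataclasses import dataclass

# Illustrative names only, not tied to any specific event store or framework.

@dataclass
class WithdrawMoney:            # the command: information X
    account_id: str
    amount: int

@dataclass
class MoneyDeposited:           # events already in the log
    account_id: str
    amount: int

@dataclass
class MoneyWithdrawn:
    account_id: str
    amount: int

@dataclass
class WithdrawalRejected:
    account_id: str
    reason: str

class AccountAggregate:
    """The 'internal' state Y, rebuilt in memory by folding the event stream."""

    def __init__(self, history):
        self.balance = 0
        for event in history:
            self.apply(event)

    def apply(self, event):
        if isinstance(event, MoneyDeposited):
            self.balance += event.amount
        elif isinstance(event, MoneyWithdrawn):
            self.balance -= event.amount

    def decide(self, command: WithdrawMoney):
        # Combine X (the command) with Y (the current state) to choose which
        # events should, or should not, be appended to the log.
        if command.amount > self.balance:
            return [WithdrawalRejected(command.account_id, "insufficient funds")]
        return [MoneyWithdrawn(command.account_id, command.amount)]

history = [MoneyDeposited("acc-1", 100)]
print(AccountAggregate(history).decide(WithdrawMoney("acc-1", 30)))
# [MoneyWithdrawn(account_id='acc-1', amount=30)]
```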
The projections are representations of the data that help solve specific business problems, i.e. state created from the events for the purpose of finding (viewing) information about the system. A projection can be used to solve a problem for a customer, but can also be created to solve a problem for the business. The main difference is that this state is not used to take decisions on what the result of a command will be (even though in practice projections are sometimes used to build the data inside commands, but that can be dangerous, so it’s important to understand what could go wrong), and in general there is a “lag” to this state, meaning that it might be slightly stale. For example it might get updated a second after an event was added to the log (that’s where the dangers of using it in commands can arise).
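And a small sketch of a projection in the same spirit (illustrative event shapes, not a specific store’s API): it answers exactly one question, and in a real system it would typically be updated asynchronously, which is where the lag comes from.

```python
def project_balances(events):
    """Fold events into a read model answering one question: each account's balance."""
    balances = {}  # the read model / "view"
    for e in events:
        if e["type"] == "MoneyDeposited":
            balances[e["account_id"]] = balances.get(e["account_id"], 0) + e["amount"]
        elif e["type"] == "MoneyWithdrawn":
            balances[e["account_id"]] = balances.get(e["account_id"], 0) - e["amount"]
    return balances

log = [
    {"type": "MoneyDeposited", "account_id": "acc-1", "amount": 100},
    {"type": "MoneyWithdrawn", "account_id": "acc-1", "amount": 30},
]
print(project_balances(log))  # {'acc-1': 70}
```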
But I’ve built a toy application where the aggregate state is used as the main state of the application. It’s more of a fun experiment, and I would not expect to see this in a real application other than games.
Most event stores allow deleting or obfuscating events; this feature exists precisely for dealing with situations like this (or when events were inserted incorrectly).
Encrypt your data and throw away the key on request. Basically, if dealing with user or organization data, you can create a key that will be used to encrypt and decrypt that data; if the user requests deletion, you just throw away the key, effectively making the data inaccessible (I guess personal quantum computers are still some decades away).
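A rough sketch of that idea (sometimes called crypto-shredding) in Python, assuming the third-party `cryptography` package; the in-memory dict stands in for a real key store or KMS.

```python
from cryptography.fernet import Fernet  # assumes the `cryptography` package is installed

# One key per user/organization; event payloads containing personal data are
# stored encrypted, while the key lives outside the event store.
keys = {}  # user_id -> key (in reality a separate key store / KMS)

def key_for(user_id: str) -> bytes:
    if user_id not in keys:
        keys[user_id] = Fernet.generate_key()
    return keys[user_id]

def encrypt_for_user(user_id: str, payload: bytes) -> bytes:
    return Fernet(key_for(user_id)).encrypt(payload)

def decrypt_for_user(user_id: str, token: bytes) -> bytes:
    return Fernet(keys[user_id]).decrypt(token)

def forget_user(user_id: str) -> None:
    # "Right to be forgotten": throw away the key; the events stay immutable,
    # but their personal data becomes unreadable.
    keys.pop(user_id, None)
```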
Finally, don’t use ES for personal data. Most likely your business does not need to build ES around user emails, credit cards or other personal data. Just store them in separate tables that are not event sourced and use a user id to correlate them; if they need to be deleted, you just remove the personal data. If you are building a social network or a public blog where you handle user content, then I would be highly skeptical of the need for ES there. You can just say, for example, that user XXXX created a blog post or updated it, without storing the contents of the post, effectively capturing the action but no personal information (but again, for social media I don’t think ES makes sense). Overall I must admit that this still leaves a lot of vulnerabilities, as one could correlate the user ID to a specific user and then use metadata such as dates to determine things about them.
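As a tiny illustration of that split (made-up shapes): the event records only the action and a user id, while the personal details live in an ordinary mutable table that can simply be deleted.

```python
# Ordinary, mutable storage for personal data (NOT event sourced).
users = {"user-42": {"email": "alice@example.com"}}

# The event only records that the action happened, correlated by user id;
# neither the post contents nor the email appear in it.
event = {"type": "BlogPostUpdated", "user_id": "user-42", "post_id": "p-1"}

def delete_personal_data(user_id: str) -> None:
    users.pop(user_id, None)   # the event log itself stays untouched
```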
And this also reminds me of another Greg Young comment (yes, I ended up watching a LOT of YouTube from EventStoreDB (Kurrent) and Greg Young): one doesn’t just build “an Event Sourcing System/Application”; rather, ES is applied as necessary to wrangle the problems it’s good at handling, while other parts of an application apply other traditionally understood software patterns as fit for purpose.
I found Simple Transactions Made Simple in EventStore a very informative video, which covers how to perform the well known “transactional bank account money transfers” using Event Sourcing.
From my perspective as a CQRS/ES novice, I think that because business subject matter experts are very much involved with EventStorming or Event Modeling, it MIGHT be less likely that mistakes creep into the modeling of a “Business Event” (I’m not making a claim about “Technical Events”)… And MAYBE what is more likely is that the business, or the understanding of the business, changes and the events must evolve to support that change… And this is still a HARD problem from what I gather, a trade-off in adopting ES in these problem domains… Greg Young’s Event Sourcing Versioning book does provide some information, but I personally can’t comment further as I don’t have real expertise/experience with this kind of evolution.
But if you’re saying that the “bugged event” was a result of committed data that was incorrect… For example, a warehouse management system keeps track of product inventory based on sales and shipping events which “increment” or “decrement” product inventory counts, yet a manual count of the product shows a different number… I believe the suggested practice is that a NEW event be committed into the event stream to reflect the “reality/truth”, for example ProductInventoryAdjustment{product_id, count}, so the “internal aggregate” will be rebuilt to reflect this latest truth when handling the next command.
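A sketch of how that plays out when folding the stream (illustrative event shapes; I’m treating the adjustment’s count as the absolute, manually counted number):

```python
def current_inventory(events, product_id):
    """Rebuild the inventory count for one product, honoring adjustment events."""
    count = 0
    for e in events:
        if e.get("product_id") != product_id:
            continue
        if e["type"] == "ProductReceived":
            count += e["count"]
        elif e["type"] == "ProductShipped":
            count -= e["count"]
        elif e["type"] == "ProductInventoryAdjustment":
            count = e["count"]          # manual count overrides the running total
    return count

log = [
    {"type": "ProductReceived", "product_id": "sku-1", "count": 10},
    {"type": "ProductShipped", "product_id": "sku-1", "count": 4},
    {"type": "ProductInventoryAdjustment", "product_id": "sku-1", "count": 5},  # reality
]
print(current_inventory(log, "sku-1"))  # 5
```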
I’m hearing three topics here:
Snapshotting
Eventual Consistency
Application State Management complexity
I understand “Snapshotting” is a “technical pattern”, and the blog posts from Kurrent and various other experts in that world say that snapshotting is not recommended unless there is a really good reason, since most streams are very short, and even streams of thousands of events have no material performance impact for the purposes of this “enterprisey business entity lifecycle stuff”.
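For what it’s worth, mechanically snapshotting just means persisting the folded state at some stream version and replaying only the tail. A toy sketch (not any product’s actual API):

```python
def rebuild(snapshot, events_after_snapshot):
    """Start from the last persisted snapshot and replay only the newer events."""
    state = dict(snapshot["state"]) if snapshot else {"balance": 0}
    for e in events_after_snapshot:
        if e["type"] == "MoneyDeposited":
            state["balance"] += e["amount"]
        elif e["type"] == "MoneyWithdrawn":
            state["balance"] -= e["amount"]
    return state

snapshot = {"version": 1000, "state": {"balance": 250}}   # taken at event #1000
tail = [{"type": "MoneyWithdrawn", "amount": 50}]          # events 1001..n
print(rebuild(snapshot, tail))  # {'balance': 200}
```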
For eventual consistency of the “read model” (aka materialized view): while ESDB and CQRS/ES in general lend themselves to an eventually consistent read model, I believe that both ESDB/Kurrent and Axon Framework have some sort of projections mechanism that IS transactional; someone please correct me if I’m wrong.
This What is EventSourcing video provided a few slides clarifying what they consider the four types of “ordered storage patterns” for application state management (command sourcing, change logging, state logging, event sourcing) and the rationale for choosing event sourcing.
What I was expressing was skepticism that materializing state from a set of immutable events at query time has different safety properties than materializing that state at insertion time, the former being “event sourcing” and the latter being, uh, “not event sourcing”. My argument being, if corruption is a concern it could happen anywhere, so it makes no difference.
I then extended that argument to cover corruption induced in the application layer, in response to a counterpoint. But at no point was I referring to data entry errors and so on - that’s a totally different thing. In that case the system is functioning properly.
We are clearly heading down the rabbit hole here, but I was not referring to eventual consistency. I mean I did kind-of allude to stale snapshot reads, but that wasn’t the point I was making either.
What I was getting at (and I realize I am now at attempt #3 here) is that materializing the “view” (or “projection”) at regular intervals - that is, taking snapshots - is in the limit equivalent to materializing the view incrementally on every single insert. And that is exactly the thing you’re doing when you’re not doing event sourcing. At the end of the day, it’s all equivalent.
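Here is that equivalence as a toy sketch: the same deterministic fold, applied either all at once at query time or incrementally on every insert, lands on the same state.

```python
from functools import reduce

def fold(state, event):
    return state + event          # stand-in for any deterministic reducer

events = [1, 2, 3, 4]

# Event sourcing: materialize at query time.
at_query_time = reduce(fold, events, 0)

# "Not event sourcing": keep a current-state row updated on every insert.
current_state = 0
for e in events:
    current_state = fold(current_state, e)

assert at_query_time == current_state == 10
```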
But I have, here, learned a few things which cloud this perspective, of course. Mainly around the whole “add projections down the road” idea.