Audit logging on steroids?

elvanja · October 21, 2024, 7:45am

We need to build an audit log solution that keeps a record of every action a user attempted to execute in the system, successfully or not. A hard requirement is that no audit log record may be lost (for compliance reasons: we are a financial institution).

Regarding the logging mechanism, the current idea is to just use Logger (Logger v1.17.3) and let the underlying infrastructure pick up the audit logs and push them to Amazon CloudWatch.

What I am worried about is that, even though most of the log formatting and processing is done in the caller process, there is still a small window in which an audit log record lives only as a message in the actual logger process, so it can be lost, however small the chance. E.g. if the application/container crashes for whatever reason, we may just lose those records that have not yet been submitted to the underlying system. If I'm not mistaken, the logger process is, well, a process, so all the usual uncertainty about whether a specific message (log) gets handled applies.

What are your experiences with audit logs, or even just with the reliability of Logger? Maybe this is a non-issue and we should just proceed? Any suggestions / experience in how you solved such an audit log issue in the past?

tcoopman · October 21, 2024, 8:37am

How hard is that requirement? Do you have to be able to prove you kept all of them?

If that’s the case, then I agree with your assessment. I haven’t looked at Logger in depth, but unless you have some transaction/blocking mechanism you can’t prove that everything has been logged, and there are definitely ways of losing things.

Again, if the requirement is actually a hard requirement you have to either:

1. pause the execution of the command until you’re 100% sure that the command logging has succeeded, or
2. save the command and the outcome of the command in the same transaction - so they both succeed or fail at the same time.

In (1) you might need to deal with the case where the command has been saved but its execution then fails. Not sure what you want to do with that.
The advantage of (1) is that you can use a different storage backend for the log. In (2) you would have to deal with distributed transactions if you use 2 different storage backends/databases.

I guess you know about EventSourcing? There your events are the source of truth, so every decision you took is stored with 100% guarantee in an EventStore
You can apply a similar approach but with Commands and store your commands in an EventStore.

If you combine Command logging with EventSourcing and use Correlation and Causation ids, you get the added benefit of seeing which commands caused which events.
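
A rough illustration of the correlation/causation idea, under the usual convention that every message in one flow shares a correlation id while the causation id points at the message that directly triggered it. The `Message` type and the helper names are hypothetical, and Python is used only so the sketch is runnable.

```python
import uuid
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    id: str
    kind: str                    # "command" or "event"
    name: str
    correlation_id: str          # shared by every message in one flow
    causation_id: Optional[str]  # id of the message that directly caused this one


def new_command(name: str) -> Message:
    """A command starts a flow, so its own id doubles as the correlation id."""
    msg_id = str(uuid.uuid4())
    return Message(msg_id, "command", name, correlation_id=msg_id, causation_id=None)


def caused_by(parent: Message, kind: str, name: str) -> Message:
    """Derive a message from `parent`, keeping the flow and linking the cause."""
    return Message(
        str(uuid.uuid4()),
        kind,
        name,
        correlation_id=parent.correlation_id,
        causation_id=parent.id,
    )


def events_caused_by(log: List[Message], command: Message) -> List[Message]:
    """Answer the audit question: which events did this command produce?"""
    return [m for m in log if m.kind == "event" and m.causation_id == command.id]
```

With both commands and events stored in the same log, `events_caused_by(log, cmd)` gives the direct consequences of a command, and filtering on `correlation_id` gives the whole flow it started.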

LostKobrakai · October 21, 2024, 8:44am

Just to phrase what @tcoopman wrote a little differently: if that’s a hard requirement, common logging facilities are unlikely to be what you want. Your auditing records would become business records like any other, which you want to store in a database, where you are in charge of knowing the constraints around data loss and working within them.

joddm · October 21, 2024, 9:27am

Probably wiser to use the db rather than the Logger module:

pgaudit/pgaudit: PostgreSQL Audit Extension
supabase/supa_audit: Generic Table Auditing
Postgres Auditing in 150 lines of SQL

dstockdale · October 21, 2024, 11:46am

If you’re auditing in the db, it’s worth looking at bitcrowd/carbonite (Audit trails for Elixir/PostgreSQL based on triggers) as well.

As it uses triggers, it will catch absolutely everything you do or try to do in the db. It does mean you’ll need to wrap everything in a transaction, and it doesn’t solve capturing user actions that aren’t db based.
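
To show the general principle behind trigger-based auditing (not Carbonite's actual schema or API), here is a small sketch using SQLite triggers from Python. The `accounts` and `changes` tables and the trigger name are made up for illustration. The point is that the audit row is written by the database itself, inside the same transaction as the change, so it cannot be skipped by application code and it disappears together with the change on rollback.

```python
import sqlite3

# Hypothetical schema: one business table, one audit table, and a trigger
# that copies every balance update into the audit table.
SCHEMA = """
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
CREATE TABLE changes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    table_name TEXT NOT NULL,
    row_id INTEGER NOT NULL,
    old_balance INTEGER,
    new_balance INTEGER
);
CREATE TRIGGER accounts_audit AFTER UPDATE ON accounts
BEGIN
    INSERT INTO changes (table_name, row_id, old_balance, new_balance)
    VALUES ('accounts', OLD.id, OLD.balance, NEW.balance);
END;
INSERT INTO accounts (id, balance) VALUES (1, 100);
"""


def open_db():
    """Open an in-memory database with the audit trigger installed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.commit()
    return conn


def set_balance(conn, account_id, balance):
    """A plain UPDATE; the trigger writes the audit row in the same transaction."""
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = ? WHERE id = ?",
            (balance, account_id),
        )
```

As the post says, this only sees what reaches the database: an action that is rejected before any SQL runs leaves no trace and has to be recorded some other way.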