Opinion on file & memory based event sourcing system

I think that disabling the Linux write cache could help here:

Not all systems belong to the same “turn on write-back caching” recommendation group, as write-back caching carries a risk of data loss in events such as power failure. If the power fails, data residing in the hard drive’s cache never gets a chance to be stored and is lost. This is especially important for database systems. To disable write-back caching, set write-caching to 0:

# hdparm -W0 /dev/sda

/dev/sda:
setting drive write-caching to 0 (off)
write-caching =  0 (off)
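One caveat worth noting: on most distributions the hdparm setting does not persist across reboots. A hedged sketch of one common way to re-apply it at boot with a udev rule (the file path, rule file name, and hdparm location are illustrative and vary by distribution):

```shell
# hdparm -W0 is not persistent across reboots on most distributions.
# One common approach (illustrative; paths and device names vary) is
# to install a udev rule that re-applies the setting when the disk appears:
cat > /etc/udev/rules.d/69-hdparm.rules <<'EOF'
ACTION=="add|change", KERNEL=="sda", RUN+="/usr/sbin/hdparm -W0 /dev/%k"
EOF
```

You can confirm the current setting at any time with `hdparm -W /dev/sda` (no number), which reads the state without changing it.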

Reading the Kafka design decisions about going with a filesystem approach, while taking advantage of all the low-level facilities Linux has to offer, may help you make some good decisions.


Thanks for the info and links! I’ve tabled this project whilst I work on other things — bills to pay and all that… — but the idea still keeps nagging at me. I had intended to look at the guts of systems like Kafka and EventStore properly when I revisit this.

Do you have any insight on how the Erlang VM might impact this approach? When I looked at this before, it seemed that access to the low-level OS facilities from inside the VM is more limited than with other systems, but that may just be my unfamiliarity with the guts of the VM.

This approach of using files to persist the data needs to take into consideration that the Erlang library disk_log by default has a delay of 2 seconds or 64 KB before writing to the disk, as I say here:

This setting can be tuned, but care needs to be taken to find the right balance in terms of disk IO performance, otherwise you may create a bottleneck when writing to the disk.
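To make that concrete, here is a minimal sketch using Erlang's disk_log (the log name `mylog`, the file path, and the logged term are made up for the example). Writes can sit in disk_log's buffer for up to the default delay, but `disk_log:sync/1` forces a flush to the file, at the cost of the throughput benefit the buffering provides:

```erlang
%% Minimal sketch: open a halt log, append a term, and force a flush.
%% The log name, file path, and payload are illustrative only.
{ok, Log} = disk_log:open([{name, mylog},
                           {file, "events.log"},
                           {type, halt},
                           {format, internal}]),
ok = disk_log:log(Log, {event, erlang:system_time(), some_payload}),
%% By default the write may sit in disk_log's buffer (up to ~2 s / 64 KB);
%% sync/1 flushes it to the file, trading away the batching benefit.
ok = disk_log:sync(Log),
ok = disk_log:close(Log).
```

Calling sync after every log entry gives the strongest durability but the worst disk IO profile; batching several entries between syncs is the usual middle ground.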

I am not that familiar with it, but @dimitarvp may have something to say here.

From what I remember when reading the Kafka design decisions, you don’t need to interact with the OS from the BEAM; you just need to tune the machine for disk IO as they recommend.