I am indeed familiar with Khepri, and I think both it and Ra are important contributions toward strong consistency on the BEAM. I have remarked before that it is quite strange that OTP has no consensus primitive (e.g. a Paxos implementation), and Ra is literally that. BTW, Erlang is actually older than the first working consensus algorithms (Viewstamped Replication and Paxos).
Ra and Khepri are not, however, sufficient for my particular goals.
MultiPaxos-style replicated log databases like Khepri are not meant to scale out and are designed to store a fairly small amount of data. Their tradeoffs also require them to keep more copies of the main dataset than is strictly necessary, which is fine for a small dataset but very costly at scale. Khepri also happens to be an in-memory database (the entire dataset is in RAM), which is not an architectural limitation but a tradeoff they’ve chosen to make (which I’m sure is fine for their use case).
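To put rough numbers on the extra-copies point, here is a small sketch. It assumes the usual fault-tolerance arithmetic: a majority-quorum replicated log needs 2f+1 full replicas to survive f failures, while a design that handles failures through reconfiguration can get by with f+1 copies of the data. The function names and the 10 TB dataset are made up for illustration and are not taken from Khepri, Ra, or FDB.

```python
def replicas_majority_quorum(f: int) -> int:
    """Full copies needed so a majority still survives f failures (assumed 2f+1)."""
    return 2 * f + 1


def replicas_reconfiguration(f: int) -> int:
    """Copies needed when failures are handled by reconfiguration (assumed f+1)."""
    return f + 1


def stored_tb(dataset_tb: float, replicas: int) -> float:
    """Total storage consumed when every replica holds the whole dataset."""
    return dataset_tb * replicas


if __name__ == "__main__":
    dataset_tb = 10  # hypothetical dataset size, in TB
    for f in (1, 2):
        quorum = replicas_majority_quorum(f)
        reconf = replicas_reconfiguration(f)
        print(
            f"f={f}: quorum log {quorum} copies "
            f"({stored_tb(dataset_tb, quorum)} TB), "
            f"reconfiguration {reconf} copies "
            f"({stored_tb(dataset_tb, reconf)} TB)"
        )
```

For a few megabytes of metadata the gap is irrelevant. For a main dataset measured in terabytes, each extra full copy is real hardware.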
Databases like this (see ZooKeeper, etcd, Consul) are generally used as control planes rather than as the store for the main dataset. The problem with this approach is that you still have to build the rest of the distributed database yourself. Something like ZooKeeper is maybe 5% of an actual database.
Hobbes inherits from FoundationDB’s architecture. FDB is a reconfiguration system which is explicitly designed to store large datasets but provides a very open-ended data model. So FDB is maybe 80% of a database, but it solves nearly 100% of the “hard problems” of building a distributed database. Correctness is very hard, and FDB provides an abstraction which is correct and scales out of the box.
As I’ve mentioned in the past, I am interested in building tooling to replace things like Postgres, S3, and so on. I need an abstraction which can scale up to “real” datasets so that I don’t have to keep solving the same distributed problems over and over again. I want to solve them once, because they are very hard.
Hobbes is designed to provide strong consistency guarantees while storing several orders of magnitude more data than something like Khepri (and serving correspondingly more traffic). Architecturally, meeting that requirement takes substantially more complexity, but that is what my goals demand.
If you’re interested in the tradeoffs here, check out this excellent article which covers some of them.