Hobbes - a scalable, fault-tolerant transactional record store

Hobbes is a scalable, fault-tolerant transactional record store written in Elixir.

Hobbes is designed to be:

  • Scalable - Hobbes can shard data and scale horizontally across nodes.
  • Fault-tolerant - Hobbes can replicate data across nodes and survive failures.
  • Persistent - Hobbes writes data to persistent storage so that it can survive restarts and power failures.
  • Transactional - Hobbes offers atomic transactions over its entire keyspace (even across arbitrary nodes).
  • Consistent - Hobbes offers strict serializability for all transactions out of the box.
  • Embedded - Hobbes can be embedded directly into an application as a library.
  • Correct - Hobbes has a strong focus on correctness and has been aggressively tested in simulation from day one.

Hobbes provides what we refer to as an “unstructured record store” data model. Hobbes has a unified, ordered keyspace like a key/value store, but with strongly-typed keys and values. Hobbes deliberately stops short of providing schemas or migrations, though, with the goal of enabling a Hobbes cluster to function as a “multi-model database”: a database which contains other databases, each with their own data model.
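As a rough illustration of what an ordered keyspace with structured keys buys you, here is a toy sketch in Python. It is not Hobbes code and does not reflect Hobbes's API (which is not public yet): `OrderedStore`, `put`, and `scan_prefix` are hypothetical names, tuples stand in for typed keys, and everything lives in one process's memory.

```python
import bisect


class OrderedStore:
    """Toy in-memory ordered key/value map with tuple keys (hypothetical, not Hobbes)."""

    def __init__(self):
        self._keys = []    # tuple keys, kept in sorted order
        self._values = {}  # key -> value

    def put(self, key, value):
        # Insert the key at its sorted position the first time we see it.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        """Return (key, value) pairs whose key starts with `prefix`, in key order."""
        # A tuple sorts before any longer tuple it is a prefix of,
        # so bisect_left lands on the first matching key.
        start = bisect.bisect_left(self._keys, prefix)
        out = []
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            out.append((key, self._values[key]))
        return out


store = OrderedStore()
store.put(("user", 2), "grace")
store.put(("file", "/etc/hosts"), "blob-17")
store.put(("user", 1), "ada")
users = store.scan_prefix(("user",))
```

Because the keys are ordered, two unrelated data models (here a `("file", ...)` namespace and a `("user", ...)` namespace) can share one keyspace, and each can be read back with a prefix scan.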

Hobbes is, fundamentally, a tool designed to make building databases or other consistent distributed systems much easier. Here are some things you could build with Hobbes:

  • A distributed filesystem, using Hobbes as a scalable, persistent metadata store to map filenames to blobs
  • A distributed database, storing everything in Hobbes and taking advantage of transactions to keep your schemas, indexes, and user data in sync
  • Anything else which needs to scalably store persistent data: job queues, event stores, process registries, and so on

Hobbes is not the kind of database that an application developer would use to build an app. Instead, Hobbes is a tool which you can use to build that kind of database. For this reason, you might think of Hobbes not just as a database but as a new OTP primitive: a building block like :ets or :global but far more powerful.

Much like Erlang exists to solve the hard problems of distributed computing, Hobbes exists to solve the hard problems of building consistent distributed systems. We want to solve them once, properly, so that we don’t have to solve them again and again.

Hobbes takes care of the hard parts of building a distributed database, like strong consistency, concurrency control, replication, sharding, atomic transactions, and consensus, so that you don’t have to.

Hobbes exists to make building scalable, consistent distributed systems easy.

(…for more, see the README)

Hobbes is my database project, which I have alluded to in many discussions on here. It’s not ready for public use yet, but it is very much ready for public discussion, which is why I am publishing it here. You can find the roadmap to alpha testing in the repo; once we move to alpha I’ll publish a Hex package and so on.

In addition to being a fairly sophisticated distributed database, Hobbes is also unique in that it is one of only a handful of distributed databases ever to be designed and built from day one using Deterministic Simulation Testing. In order to do this I had to write a simulation testing framework in Elixir, named Construct, which you can also find in the repo. I’m pretty sure this is the first time anyone has ever attempted simulation testing on the BEAM.
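For readers new to the idea, the core trick of deterministic simulation can be sketched in a few lines of Python. This is a toy under loud assumptions, not Construct and not how Hobbes is tested: `run_simulation` is a hypothetical name, the "network" is a list, and the only simulated faults are reordering and dropped messages. The point it illustrates is that every random choice comes from one seeded generator, so a failing run can be replayed exactly from its seed.

```python
import random


def run_simulation(seed, num_ops=20, drop_rate=0.2):
    """Deliver writes in a seed-chosen order with seed-chosen drops (toy sketch)."""
    rng = random.Random(seed)  # the single source of randomness in the run
    in_flight = [("set", key, rng.randrange(100)) for key in range(num_ops)]
    store = {}
    trace = []
    while in_flight:
        # The simulated scheduler picks which message is delivered next.
        msg = in_flight.pop(rng.randrange(len(in_flight)))
        if rng.random() < drop_rate:
            # Simulated fault: the message is lost and the sender retries later.
            trace.append(("drop", msg))
            in_flight.append(msg)
            continue
        _, key, value = msg
        store[key] = value
        trace.append(("deliver", msg))
    return trace, store


first = run_simulation(seed=42)
replay = run_simulation(seed=42)
```

Running the same seed twice yields the same trace and the same final state, which is what makes a bug found at seed N reproducible on demand. A real simulator has to control far more than this (scheduling, time, network, disk), which is why it takes a dedicated framework.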

You can find the source code for Hobbes here:

Questions are welcome, especially technical ones!


You did it! Congrats!

That is all, as this is way out of my wheelhouse (though I have been creeping along with the interactions between you and jallum).


Congratulations on the release. I’ve been doing my best to follow the discussions, but as this is pretty specialized material, at one point I gave up.

One thing I would remark is that a few examples could help. For example, there was a wave of people on this forum asking for stateful actors, and there were discussions around how useful that would be at all (if an actor persists state that makes it crash on every restart afterwards, that’s actually a worse state of affairs than before), but there is likely a multitude of useful uses.

Want to give us an example of a few of those?


Have you looked at khepri:

There seems to be some overlap with your goals.


I am indeed familiar with Khepri, and I think both it and Ra are important contributions in the direction of strong consistency on the BEAM. I have remarked before that it is quite strange there is no consensus primitive in OTP (e.g. a Paxos implementation), and Ra is literally that. BTW, Erlang is actually older than the first working consensus algorithms (Viewstamped Replication and Paxos).

Ra and Khepri are not, however, sufficient for my goals in particular.

MultiPaxos-style replicated log databases like Khepri are not meant to scale out and are designed to store a pretty small amount of data. Their tradeoffs also require them to store an unnecessary number of copies of the main dataset, which is fine for a small dataset but very bad at scale. Khepri also happens to be an in-memory database (the entire dataset is in RAM), which is not an architectural limitation but a tradeoff they’ve decided to take (which I’m sure is fine for their use-case).

Databases like this (see Zookeeper, etcd, Consul) are generally used as control planes rather than used to store the main dataset. The problem with this approach is that it means you actually have to build an entire distributed database. Something like Zookeeper is maybe 5% of an actual database.

Hobbes inherits from FoundationDB’s architecture. FDB is a reconfiguration system which is explicitly designed to store large datasets but provides a very open-ended data model. So FDB is maybe 80% of a database, but it solves nearly 100% of the “hard problems” of building a distributed database. Correctness is very hard, and FDB provides an abstraction which is correct and scales out of the box.

As I’ve mentioned in the past, I am interested in building tooling to replace things like Postgres, S3, and so on. I need an abstraction which can scale up to “real” datasets so that I don’t have to keep solving the same distributed problems over and over again. I want to solve them once, because they are very hard.

Hobbes is designed to provide strong consistency guarantees while storing several orders of magnitude more data than something like Khepri (and serving equivalently more traffic). Architecturally, the difference in complexity to meet that requirement is quite substantial, but that is what achieves my goals.

If you’re interested in the tradeoffs here, check out this excellent article which covers some of them.


The lack of examples is very intentional, because there is no public API. A curious reader might find their way to the workloads/ directory and read some of the test clients there if they want to know what the private API looks like :)

I am very aware this is an extremely strange way to introduce a library, BTW. Building a database like this is a long journey, and doing it properly essentially means designing it to be tested. This project, like FDB, takes this to an absolute extreme. I remember hearing an FDB engineer remark that for the first two years there was not actually a database because they simply developed and tested everything in the sim. And indeed after well over a year of development Hobbes has never written a single byte to an actual disk.

It is very strange to write code this way; I’ve never done anything like this before. The entire codebase exists only to be fuzzed. It’s like a closed ecosystem: a digital terrarium.

You might think of “test driven development”, but amazingly I’ve come to the realization that unit tests are completely worthless. I had to stop writing them altogether. They don’t find any bugs, but they break constantly and have to be rewritten.

But what’s funny is when the day comes that Hobbes writes its first bytes to disk, the vast majority of the bugs will already be gone. It will be reliable from day one. Isn’t that weird?

Anyway, I have digressed a bit, but usage examples will come with the public API. There is a roadmap for what has to happen before that. There are no open questions at this point; I have a good idea of how to implement everything which is left. It’s only work, now.

I’m aware this means there are some who will click by, say “I have no idea what this is”, and leave. And that’s okay for now! It’s not ready for them yet.


Hobbes is a database, and is definitely not a library for stateful actors. When I say “OTP primitive”, what I mean is that Hobbes resembles a very sophisticated persistent ETS table. Where you might use :ets to back an in-memory data structure, you would use Hobbes to back an on-disk data structure.

There is a bit more, though, because Hobbes is not simply “on-disk ETS”. (Actually we have DETS for that.) Hobbes is fault-tolerant and will scale to very large datasets across nodes. An ETS table is not fault tolerant (tied to one node) and cannot be (natively) sharded. Also, ETS tables do not have transactions at all.

And so this is why I say it’s like a new OTP primitive, because there has never been an OTP primitive which can do these things. The closest would be Mnesia, but it suffers from small shard sizes and poor consistency guarantees.

Maybe Hobbes could be used to build a library for stateful actors. But that’s just because it is a tool for persisting state in general. For example:

  • A job library like Oban could use Hobbes to persist jobs (they currently use Postgres)
  • An event-sourcing library like Commanded could use Hobbes to persist events (they also seem to use Postgres)
  • A distributed process registry could use Hobbes, taking advantage of its fault-tolerance and strong consistency

But the main reason I wrote Hobbes is to serve my own needs. As you know, I want something that can replace Postgres itself (and S3 and similar tools). Hobbes is essentially an abstraction layer which contains all of the hard problems associated with building a distributed database, so that I can reuse it to solve the “easy” problems each time. Much like Erlang solves the hard problems of distributed scheduling and message passing and so on.

I can then use Hobbes to build, say, a distributed filesystem, or a relational database.

Actually, the relational DB already has a name: it will be called Memex. And Memex will probably be the thing most will take interest in, but it does not yet exist.

Hobbes will always be the “pro tool” for those who want to get their hands dirty, but just not “rolling a database from scratch” dirty :)


I love the ambition of this project, and congrats on this milestone. Excited to see how it evolves and what the public API ends up looking like for the eventual Postgres replacement.

Might be worth putting this message somewhere in the README along with whatever caveats are appropriate :)


Thanks for the examples, much appreciated.

I strongly resonate with this:

That’s practically most of my reason to still be a programmer. So you have at least one person who very strongly subscribes to this philosophy together with you. The IT area keeps chasing its own tail and solving the same problems over and over again. That is what is weird.

I could not resist digressing a little bit myself, my apologies for that.

There are many unit tests that are too micro, and one does indeed find oneself maintaining a contract without knowing whether it will still be valid next week. I relate to that a lot.

But let me point out that fuzz / mutation / property / integration testing is also TDD of sorts. I am not here to start academic debates, however, so let us not do that. I just wanted to praise you for wanting a harness that makes sure your expectations, at whichever level they live, are always met and checked. That is what truly counts.


The examples that caught my eye are the following:

(No idea about Commanded so not commenting on that.)

ETS is really good, but I did find myself wanting to replace memcached and Redis with it and obviously could not, because DETS and Mnesia are a separate industry at this point and I did not want to dig myself into that particular grave.

Oban – awesome!

A distributed process registry is something that I feel is still under-served. My last job involved Fly.io and man, the number of times I’ve seen logs saying that the orchestrator can’t find a node is now burned into my brain and will not leave its apparently comfy spot soon.

Thanks for indulging the questions. I know that for a person like yourself working on the problem itself is much more important than serving commercial (and sometimes hobbyist) needs but many of us on the front lines are acutely aware of the suboptimal “state of the art” of a lot of tooling and dream of more. Thanks for moving that dream a little bit closer to reality.


Despite my reading along with a lot of your convos around databases here, I mostly fall into this category. Thanks to the more recent posts between you and @dimitarvp I’ve got a clearer picture of exactly what Hobbes is. I do have some questions but I’m going to sit on those for now becauuuuse you have totally nerd sniped me with your TDD comment. The most succinct way I can put this is that I also generally dislike unit tests, but if writing them helps inform lower-level design and you then throw all those tests away once you move up a level, you’re still doing TDD! Even if (more accurately, when) you end up completely rewriting all those lower-level functions based on the higher-level tests… yep, still by definition (with the original definition being extremely fluid) TDD :)

Really most of these small unit tests are just REPL driven development where instead of pressing up a bunch of times, you write your test in a file that is easy to run, edit, and save for as short or as long as you want.


I have written a new introduction for the README, taking into account the above feedback. I don’t think I could have written the intro without this dialogue taking place first, so thanks to all of you for that.


lol I see I managed to snipe both of you with this throwaway line.

Hobbes’s development is absolutely driven by testing. In fact, it’s test-driven to a frankly extreme degree. But it is not “Test Driven Development” in the popular sense. The first line for TDD on Wikipedia explicitly mentions “unit tests”, so I feel pretty justified in saying that that’s the popular sense.

And I have written ExUnit tests as a means to develop features sometimes, but they are not unit tests. They are serving as stand-ins for a script, with the sole reason being that mix test is easier to type than mix run priv/scripts/whatever.exs. There is no resemblance to actual unit testing.

I did not have such an extreme opinion before working on this project, but unit tests are completely and utterly worthless. They are best used as a manual method of snapshot testing to catch regressions, but one would be better off doing actual snapshot testing there.

Once you are working on a program which is complex enough that you cannot exhaustively validate its correctness in your head, you need to start fuzzing. Every database-shaped codebase I have read (which is, at this point, many) has had fuzzers. I don’t know why people don’t talk about this more, but I guess it’s just something that you learn pretty quickly while working on such code.
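To make the contrast with unit tests concrete, here is a minimal sketch of model-based fuzzing in Python. It is an illustration only and has nothing to do with Hobbes's actual test suite: `SortedMap`, `clear_range`, and `fuzz` are hypothetical names. The fuzzer drives an implementation and a trivially correct model (a plain dict) with the same seeded random operations and checks that they agree after every step.

```python
import bisect
import random


class SortedMap:
    """Implementation under test: parallel sorted lists of keys and values."""

    def __init__(self):
        self._keys = []
        self._vals = []

    def put(self, key, value):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._vals[i] = value  # overwrite an existing key
        else:
            self._keys.insert(i, key)
            self._vals.insert(i, value)

    def clear_range(self, lo, hi):
        """Delete every key k with lo <= k < hi."""
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_left(self._keys, hi)
        del self._keys[i:j]
        del self._vals[i:j]

    def items(self):
        return list(zip(self._keys, self._vals))


def fuzz(seed, steps=500):
    """Apply seeded random operations to the implementation and a dict model."""
    rng = random.Random(seed)
    impl, model = SortedMap(), {}
    for step in range(steps):
        if rng.random() < 0.7:
            key, value = rng.randrange(50), rng.randrange(1000)
            impl.put(key, value)
            model[key] = value
        else:
            lo = rng.randrange(50)
            hi = rng.randrange(lo, 51)
            impl.clear_range(lo, hi)
            model = {k: v for k, v in model.items() if not (lo <= k < hi)}
        # Invariant: the implementation always matches the model.
        assert impl.items() == sorted(model.items()), f"seed {seed} diverged at step {step}"


for seed in range(20):
    fuzz(seed)
```

A single loop like this exercises thousands of operation sequences nobody would write by hand, and a failure message carries the seed needed to replay it.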

Unit tests do not find bugs. They suck.


Just keep in mind that I wrote that in the context of serving my needs, and what replaces Postgres for me may not necessarily replace it for you. You might like parts of Postgres that I dislike, for example SQL, and you certainly won’t be getting those parts from me!

But for Hobbes specifically, the idea is that you could use it to build your own database if you like. Query planning ain’t easy, though.


Well good, now that we’re on the same page I can ruin your day by telling you I don’t think distributed process registries are a very good idea.

There is a specific architecture that I see people reaching for. Say you have a job queue, and you want to ensure only one process executes each job.

This is the wrong approach. No matter how hard you try it’s basically impossible to ensure there is actually only one process. I am not going to justify this statement, in part because clearly you have some idea of how much this sucks, but… it sucks. Don’t do it.

Instead, you can push the guarantees down to the datastore. Insert a job row, and then when it’s complete delete the job row and insert the result. We’ve discussed this before.
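The job-row pattern can be sketched with a toy transactional store in Python. Everything here is a hypothetical stand-in: `ToyStore`, `Txn`, `Conflict`, and `complete_job` are invented for illustration, the store is single-process and in-memory, and the simple optimistic conflict check was chosen for brevity. It says nothing about how Hobbes implements transactions.

```python
class Conflict(Exception):
    """Raised at commit when a key this transaction read has since changed."""


class ToyStore:
    def __init__(self):
        self._data = {}
        self._versions = {}  # key -> version of the commit that last wrote it
        self._version = 0

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self._store = store
        self._read_version = store._version
        self._reads = set()
        self._writes = {}  # key -> value, with None meaning "delete"

    def get(self, key):
        self._reads.add(key)
        if key in self._writes:
            return self._writes[key]
        return self._store._data.get(key)

    def set(self, key, value):
        self._writes[key] = value

    def delete(self, key):
        self._writes[key] = None

    def commit(self):
        store = self._store
        # Abort if anything we read was written after this transaction began.
        for key in self._reads:
            if store._versions.get(key, 0) > self._read_version:
                raise Conflict(key)
        store._version += 1
        for key, value in self._writes.items():
            if value is None:
                store._data.pop(key, None)
            else:
                store._data[key] = value
            store._versions[key] = store._version


def complete_job(store, job_id, result):
    """Atomically swap the job row for a result row; False if we lost the race."""
    txn = store.begin()
    if txn.get(("job", job_id)) is None:
        return False  # already completed by someone else
    txn.delete(("job", job_id))
    txn.set(("result", job_id), result)
    try:
        txn.commit()
    except Conflict:
        return False
    return True
```

Two workers may well both run the same job, but only one transaction can delete the job row and insert the result, so the duplicate is detected by the store instead of being prevented by a registry.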

The problem is that in order to do this at scale you need a Very Big Multitenant Distributed Database or it gets very hard.

This project is, actually, that database.

I do not want a process registry to coordinate jobs for an external datastore. I want a datastore that enables me to not do that.

See also: the QuiCK paper


That’s largely my point: I don’t see them as a means of catching bugs, I see them as a means of informing design. In all honesty I usually don’t end up deleting them, but in the case of a large refactor I throw them away and don’t bring them back. I’ve tried going fully top-down before, which was great for getting full coverage; however, it was a single-function interface that parsed and executed instructions for mapping images onto product templates. I ended up with very thorough tests but the code itself was a bit of a mess. These days I start somewhere in the middle (but I don’t want to derail this thread too much and I’ve already had some exhausting TDD convos around here… sooooo I’ll shut up now)


lol I couldn’t care less about derailing so don’t worry about that.

But we probably don’t actually disagree about anything here. It’s just that I do need to find bugs, and the tests that actually find bugs dwarf the coverage of unit tests so completely as to make them totally insignificant. And yet somehow unit tests are more annoying to write.


Oh ya, I know we’re almost entirely agreeing here, I’ve just already been down the “unit tests are worthless!” path and then decided, “Well, maybe there is some value.” But who knows, maybe you won’t!

Along those lines, LiveViewTest was a major draw of LiveView for me. I still often start with them but then immediately dip down (so I guess more BDD?)


Excellent. Not a SQL fan either so curious to hear more about what you’re cooking.


Congrats, and thanks for sharing your progress!


Given the comments from different people above:

  • An event-sourcing library like Commanded could use Hobbes to persist events (they also seem to use Postgres)

and

Have you looked at khepri:

I’m also linking the ExESDB Event Store that @beamologist created on top of Khepri and Ra, for more idea/thought cross-pollination.


To be fair, a database like this is a tool with a very constrained API and extremely strong correctness requirements, so this is like the worst case for unit tests. Unit tests are probably better for situations where a large number of mediocre tests can cover a large API surface area which is unlikely to have deep bugs.

But, like, they’re still inferior. You’re just “getting away with it” there.

This is actually quite relevant because I think the biggest failure of FoundationDB was that the simulator cannot (easily) be used to test layers. By writing Hobbes, the layers, and the application in Elixir, my goal is to be able to fuzz them all together, deterministically, using Construct. In this way, I hope to finally fulfill what I see as FDB’s true potential.

So the question of what these fuzzers look like with application code is actually very relevant to me, and it’s something I’ve been thinking about a lot. I have nothing to show for it yet, but if things go well then down the road I will be able to demonstrate a better path.

I agree. The even longer-term goal is to find a way to fuzz UI code, but I genuinely have no idea what that will look like. Especially since I want to move to JS for that. It’s gonna get weird…
