Ecto_foundationdb - An Ecto Adapter for FoundationDB

I’ve started development on an Ecto Adapter for FoundationDB: GitHub - ecto_foundationdb.

FoundationDB is a distributed database with ACID transactions (https://www.foundationdb.org/). The adapter is still at a very early stage, but some basic functionality works, and I’m interested in gathering early feedback.

There are no published docs yet, but in addition to the README, some more documentation can be found in the Ecto.Adapters.FoundationDB module.

13 Likes

Announcing EctoFoundationDB 0.1.0!

EctoFoundationDB is an Ecto adapter for FoundationDB, a distributed key-value store that is designed to be scalable, fault-tolerant, and performant.

Features:

  • CRUD plus indexes
  • Multi-tenancy
  • Automatic migrations
  • Custom indexes
  • FDB Transactions

Due to FoundationDB’s Layer Concept, EctoFoundationDB is more than a wrapper. It has opinionated default behavior intended to fit the needs of modern web applications, and it also allows you to add structure to your data beyond the table. It does both of these things with ACID transactions, to ensure your entire data model stays in a consistent state no matter what.

For example, maybe you want to put all your Users in a durable queue to process later. Or maybe you’re interested in implementing your own vector similarity search directly on top of your existing data model. Perhaps you’re intrigued by the sound of automatic schema migrations. Maybe you just need some very solid simple data storage with high availability.

EctoFoundationDB can help you do any of this.

For me, after managing various medium-to-large-scale SQL and NoSQL databases in production for 12 years and eventually deciding I’m more of a NoSQL guy, I simply wanted an Ecto adapter where I felt like I was at home.

Finally, thanks to @Schultzer and @warmwaffles for their open source adapters. I learned a lot from ecto_qlc and ecto_sqlite3, and you should definitely check them out!

7 Likes

Nice! I was looking at possibly building a FoundationDB adapter, but I haven’t used it before, so I didn’t really know what I’d be getting myself into.

1 Like

This looks awesome.

1 Like

EctoFoundationDB 0.2.0 is released (changelog)

There are 2 new features for writing fast transactions:

  • Pipelining: The technique of sending multiple queries to a database at the same time, and then receiving the results at a later time, still within the transaction. In doing so, you can avoid waiting for multiple network round trips. EctoFDB provides an async/await syntax on the Repo.
  • Upserts: Support for Ecto options :on_conflict and :conflict_target

For example, we can combine these features into the transaction below, which safely transfers 1 unit from Alice’s balance to Bob’s balance. With Pipelining and Upserts, there are 2 waits for data from the network (best case).

(Reminder: FoundationDB’s transactions are ACID and globally serializable)

def transfer_1_from_alice_to_bob(tenant) do
  Repo.transaction(fn ->
    a_future = Repo.async_get_by(User, name: "Alice")
    b_future = Repo.async_get_by(User, name: "Bob")

    # 1. wait (Alice and Bob pipelined)
    [alice, bob] = Repo.await([a_future, b_future])

    if alice.balance > 0 do
      # No wait here (because of `conflict_target: []`)
      Repo.insert(%User{alice | balance: alice.balance - 1}, conflict_target: [])
      Repo.insert(%User{bob | balance: bob.balance + 1}, conflict_target: [])
    else
      raise "Overdraft"
    end

  # 2. wait (transaction commit)
  end, prefix: tenant)
end

Compare with this logically equivalent transaction, implemented without pipelining or upserts. It waits 5 times for data on the network.

def transfer_1_from_alice_to_bob_but_with_more_waiting(tenant) do
  Repo.transaction(fn ->
    # 1. wait
    alice = Repo.get_by(User, name: "Alice")

    # 2. wait
    bob = Repo.get_by(User, name: "Bob")

    if alice.balance > 0 do
      # 3. wait
      Repo.update(User.change_balance(alice, -1))

      # 4. wait
      Repo.update(User.change_balance(bob, 1))
    else
      raise "Overdraft"
    end

  # 5. wait (transaction commit)
  end, prefix: tenant)
end

Thanks for reading!

2 Likes

EctoFoundationDB 0.3.0 is released (changelog)

Breaking changes

We’ve refactored the implementation of multitenancy, making 0.3.0 incompatible with data from previous versions. If you have a database that needs to be upgraded, please submit an issue.

New feature: Watches

FoundationDB Watches are similar to Triggers in an RDBMS. Registering a watch on a particular key provides a guarantee* that when that key in the database is changed, the client application is notified with a push-style notification, delivered directly to the Elixir process that requested it.

future = Repo.watch(struct, label: :mystruct, prefix: tenant)
# later on...
receive do
  {ref, :ready} ->
    # `struct` changed
end

Livebook | Watches in LiveView has a demonstration of using Watches instead of PubSub with a simple phoenix_playground app.

EctoFoundationDB 0.4 is released (changelog)

New feature: Large Structs

In a FoundationDB key-value pair, a value is limited to 100,000 bytes. Previously, EctoFoundationDB did not protect your app from this limitation. Now we split the binary across several keys behind the scenes, with no changes to the API surface. Note: other FDB limitations still apply.
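If you’re curious what the splitting looks like conceptually, here’s a minimal sketch of the idea (not EctoFDB’s actual key layout or implementation): cut the encoded value into chunks under the per-value limit and write each chunk under a suffixed key, all in one transaction.

defmodule LargeValueSketch do
  # Conceptual sketch only -- not EctoFDB's actual key layout or implementation.
  @max_value_size 100_000

  # Split one oversized value into {key, chunk} pairs, each under the limit.
  def split(base_key, binary) when is_binary(binary) do
    binary
    |> chunks()
    |> Enum.with_index()
    |> Enum.map(fn {chunk, i} -> {base_key <> <<i::unsigned-32>>, chunk} end)
  end

  # Reassemble the original binary from the split key-value pairs.
  def join(key_value_pairs) do
    key_value_pairs
    |> Enum.sort()
    |> Enum.map_join(fn {_key, chunk} -> chunk end)
  end

  defp chunks(bin) when byte_size(bin) <= @max_value_size, do: [bin]

  defp chunks(bin) do
    <<chunk::binary-size(@max_value_size), rest::binary>> = bin
    [chunk | chunks(rest)]
  end
end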

1 Like

Love your work!
I’ve been following the project closely. In fact, the idea of using FDB in a real project is very enticing.
However, I’m a bit worried about a bus factor of 1 and the near absence of a community.
Can you share your thoughts on this? How mature is the whole solution? I’m not asking about FDB itself, which is battle-proven and very stable, obviously, but specifically about using it with Elixir. If I understand correctly, the C driver is provided by the FoundationDB project itself (Apple?), so it must be stable. However, there’s also erlfdb.
FDB seems like an amazing technology; it’s a pity that more companies don’t want to invest in it.

1 Like

Hi there, thanks for the message. You’ve rightly identified that there are some risks to running erlfdb or ecto_foundationdb for a project. Let’s discuss from the bottom-up.

FoundationDB server and libfdb_c: Maintained and released by the FoundationDB team at Apple. Production ready, battle tested. There are reasons to choose FDB over other DBs and reasons not to. Happy to discuss more, but probably out of scope for this post.

erlfdb: A NIF wrapper of libfdb_c. With any NIF there is risk of bringing the BEAM VM down. The project was originally implemented by the CouchDB team working closely with the FDB team. The apache-couchdb/erlfdb project is used in production apps with success.

The foundationdb-beam/erlfdb fork (where the hex.pm package comes from) has some changes, and to my knowledge has not yet had a production deployment anywhere. However, I’m aware of one project where it will be soon, in an app that’s very important to me professionally.

I’ve been conservative with my changes to the fork to preserve its production-readiness. I’m confident erlfdb will hold up well to production scrutiny. Of course please report any bugs to the issues page. :slight_smile:

ecto_foundationdb: Still young, and ready for experimentation. No battle testing to my knowledge, but I do seek to change that. There are some projects that I have in mind, but they’re still a ways out.

In FDB parlance, ecto_foundationdb is a Layer. A consequence of an FDB stack is that correctness in the Layer is just as important as correctness in the database itself, so extensive testing is encouraged. An example of something that needs more testing focus is migrations. Everything works on paper, but it needs a longer term app to live in to make sure the migrations hold up as expected across iterative application releases.


In short, I’d call erlfdb production-ready, but not yet production-proven (due to the fork) and ecto_foundationdb is ready for community experimentation. I am personally and professionally invested in them both, and welcome further discussion, issues, and PRs.

3 Likes

This is a really cool project, and I think most people passing by this thread don’t realize how much work this actually is. You’re essentially building a database inside an Ecto adapter :slight_smile:

If I could offer a couple points of feedback:

First, in your docs (which are great btw) you mention that users should either use a secondary index to perform :where queries or instead implement the filter themselves with Enum.filter (or Stream.filter).

I think this is a mistake. The problem is that if someone writes code to perform a :where query and then filter it, and then they decide they want to improve performance with an index, they have to rewrite and retest their code. If you instead implement the (local) filtering inside the adapter, they can keep their code the same and add indexes as needed. This is more work, of course, but I think it would be worth it, because I could see this situation coming up a lot.

Second, I noticed you’re storing records in the DB via term_to_binary on the structs (from what I can tell it’s actually kw lists but I didn’t look too deep).

The problem is that if you encode the k/v pairs of the records with the actual keys, they can never be updated. You will never be able to rename or delete columns without rewriting the entire table, which is not viable at scale because the table will be too big to rewrite atomically.

Apple solves this in the record layer by using Protobufs to serialize the records and then taking advantage of the field tags to rename or drop columns in the schema without having to rewrite the actual records. You could probably do something similar, though you would have to think about how to integrate with Ecto (which has no concept of field tags). Of course another benefit is compression of the field keys, which are much smaller as varints.

1 Like

Please elaborate. Interested in learning more. Thanks!

Thanks for the feedback; I love to have it. You raise some good points. I apologize if my response is long.

Non-indexed filtering, and other query features

The benefit of us only supporting indexed :where clauses is that the developer knows exactly what they’re getting when they write the query. Of course we have no explain/analyze here, so I believe the role of the adapter must remain very clear to the end user. That role is: for each call to Repo.all, EctoFDB will always execute a single get* operation against FoundationDB (disregarding the IndexInventory) and do minimal processing locally. This way, the fact that a query works at all confirms that the developer is getting the expected performance characteristics for their chosen indexes.

Indeed, a downside of this is that when a new index is created, the developer must rewrite their queries to take advantage of the index. There is a risk that they miss a query and their app fails to leverage the index. I do consider this an ok tradeoff. The target audience is people that want to get “closer” to their data so-to-speak.
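To make that concrete, here’s roughly what the rewrite looks like from the application’s side. The :department field and the index on it are hypothetical, and this is a sketch rather than a verbatim EctoFDB example:

import Ecto.Query

# Without an index on :department, fetch the tenant's Users and filter locally:
users =
  User
  |> Repo.all(prefix: tenant)
  |> Enum.filter(&(&1.department == "Engineering"))

# After creating an index on :department, rewrite the query so the adapter
# performs a single indexed range read instead:
users =
  Repo.all(from(u in User, where: u.department == "Engineering"), prefix: tenant)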

Moreover, suppose the adapter did support open-ended where clauses. EctoFDB would then need to decide which index to use to conduct the query, which puts us on a path to implementing a real query planner, which I’m not prepared to bite off at this point.

This line of thinking probably seems backwards relative to conventional wisdom around database querying. Most people want their query to do as much work as possible. For a database like Postgres, this is natural because the query computation is done within the database itself, where the sophisticated query planner can make many optimizations to carry out sorting, joining, filtering, column selection, etc., which reduces the total data transmitted over the network. In our case, the compute is detached from the storage, a consequence of the FDB Layer concept. This implies that you must pull a relatively large amount of data into your client and operate on it there (and it encourages your client to be as close to the FDB server as possible). The current design limitations of this adapter on Ecto.Query reflect these ideas back to the developer, so hopefully there is little doubt about the behavior.

One last point tangential to this topic – EctoFDB’s use of tenants is already buying 1 level of space partitioning on the data. While some relational approaches may use a foreign key for multitenancy, EctoFDB arranges the keys such that all data for a tenant is partitioned in space from others, meaning there is already an index built in, in a way.

Value encoding using term_to_binary

This is a design decision that I struggle with to be honest. There are pros and cons. BTW you’re right about the terms themselves being Keyword lists.

Things I like about term_to_binary on Keyword list:

  1. Fast and simple
  2. Fairly easy to inspect and debug FDB key-value pairs outside of Ecto
  3. Adding a new field is trivial. IME adding new fields is the most common schema change.
  4. All Erlang terms are supported naturally

Things I don’t like:

  1. Wasteful in space – all field names are stored in each value. We may someday make use of the :compressed option, but the gains will be limited due to the unique field names.
  2. Renaming fields does require a data migration, as you pointed out. In fact, EctoFDB doesn’t yet support renaming fields at all, a major gap at the moment.
  3. Some Erlang terms should not be stored permanently, and EctoFDB doesn’t provide any assistance. For example, storing pids would work, but can be risky.
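To make the tradeoff concrete, the encoding amounts to something like this (a sketch of the idea, not the exact internal representation):

# Sketch of the idea: field names travel with every value, which is what
# makes values larger and renames hard.
fields = [id: 1, name: "Alice", balance: 100]

# This is essentially what gets stored as the FDB value:
value = :erlang.term_to_binary(fields)
^fields = :erlang.binary_to_term(value)

# term_to_binary's :compressed option trades CPU for space, though the
# unique field names limit the gains mentioned above.
compressed = :erlang.term_to_binary(fields, compressed: 6)
IO.inspect({byte_size(value), byte_size(compressed)}, label: "plain vs compressed bytes")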

You’ve noted some good benefits to using Protobuf. Here are some drawbacks as I see it.

  1. A headache to manage. This is more of a personal opinion on Protobuf.
  2. Unclear to what extent the Ecto.Schema could be tied to a Protobuf definition. If you’re aware of any work in this space, I’m interested in hearing about it.

That being said, it does seem reasonable for EctoFDB to support both term_to_binary and Protobuf (or something similar) someday; perhaps there are even use cases where the choice could be made at the Ecto.Schema level. However, I’m not at a spot where I would implement this now. There is significant complexity, and the term_to_binary approach has not failed me yet. If you have a use case for Protobuf, perhaps we can collaborate on some ideas in a GitHub Issue.

1 Like

I recommend getting started at the “Why FoundationDB” link posted above. The classical use case here is horizontal scaling of write-heavy workloads.

Also there is a well-received talk from the founder about Testing Distributed Systems w/ Deterministic Simulation (YouTube) that you might find interesting.

TBH most projects should probably just choose Postgres and move on, but there is room in the world for other databases also. For me, FoundationDB has been a pleasure to work with. Operationally, it’s rock solid. Plus you can do some cool stuff like building data structures directly on top of your data: KvQueue - A distributed durable queue with erlfdb

2 Likes

Absolutely not, your response is much appreciated :slight_smile:

To be clear, my criticism here is not a technical one - this is a UX problem. When a user is iterating on a query they might not yet know which indexes they will need, and it will harm productivity for them to rewrite all of their queries every time they make changes to the indexes. There is also a huge risk of introducing bugs every time they do this.

You are correct of course, and this is what I was getting at: I think it would be difficult, but well worth it. I will note that it does not need to be a particularly good query planner; you could probably get away with using the same index-selection logic you use now and then just running everything else through Stream.filter in the adapter (see the sketch below). You can improve it later; what matters is that the API is stable so you don’t have to rewrite business logic.
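Something along these lines, written as a standalone helper rather than real adapter internals; all_matching/3, indexed_fields/1, and the stubbed index list are all hypothetical stand-ins:

defmodule QueryFallbackSketch do
  # Hypothetical sketch of the suggested fallback -- not EctoFDB's actual
  # internals. Repo and tenant are assumed from the application.
  import Ecto.Query

  def all_matching(schema, clauses, tenant) do
    case Enum.find(clauses, fn {field, _value} -> field in indexed_fields(schema) end) do
      nil ->
        # No usable index: full range read, then filter everything locally.
        schema
        |> Repo.all(prefix: tenant)
        |> Enum.filter(&matches?(&1, clauses))

      {field, value} = indexed_clause ->
        # Indexed range read on one clause, filter the rest locally.
        schema
        |> where(^[{field, value}])
        |> Repo.all(prefix: tenant)
        |> Enum.filter(&matches?(&1, clauses -- [indexed_clause]))
    end
  end

  defp matches?(struct, clauses) do
    Enum.all?(clauses, fn {field, value} -> Map.get(struct, field) == value end)
  end

  # Stand-in for the adapter's existing index-inventory lookup.
  defp indexed_fields(_schema), do: [:name]
end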

I think you have reversed cause and effect here. The reason declarative queries are the standard is not because the work is done on the server, but the other way around. Declarative queries are the standard because they are much easier to work with for most developers, for the exact reason I gave above: you don’t have to rewrite your code based on how the data is stored (i.e., in an index).

But this is still orthogonal to my point, which is that you should try to keep the API invariant to the schema (the indexes) so that you don’t have to rewrite business logic. The lack of predicate pushdown does not affect this equation: even if you could push down the filters it would still be slower than an index, so the query planner would be roughly the same.

Btw they actually have added predicate pushdown, though they’ve certainly made things harder on themselves by storing their records in a format the underlying database doesn’t understand (protobufs).

The problem is more that such data migrations are impossible because you would hit the transaction limit. And even if FDB did not have such a limit, rewriting a large (even a TB) table would block for too long. Considering an FDB table could be hundreds of TB, and the transaction limit is 10MB, this is simply out of the question.

Oh I was definitely not suggesting using Protobufs, I was just citing what the Record Layer does. This is absolutely something that would require a lot of thought, especially since you have to worry about Ecto too.

Here’s an approach you could take: Store a schema in the FDB tenant (I assume you do something like this already?) and then assign integer ids to string fields within the schema. Update this mapping as fields are renamed or dropped.

Then instead of storing [field: value], you can store [{1, value}] and keep the mapping %{field => 1} in the schema metadata. You can aggressively cache the metadata and use FDB’s special metadataVersion key to invalidate the cache (this is literally the exact thing it was added for, by the way).
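A minimal sketch of that indirection (names, ids, and layout are all hypothetical, and the metadataVersion caching is omitted):

# The field<->id mapping lives in per-tenant schema metadata; stored values
# then carry small integer ids instead of repeating every field name.
field_ids = %{id: 1, name: 2, balance: 3}
id_fields = Map.new(field_ids, fn {field, id} -> {id, field} end)

encode = fn record ->
  record
  |> Enum.map(fn {field, value} -> {Map.fetch!(field_ids, field), value} end)
  |> :erlang.term_to_binary()
end

decode = fn binary ->
  binary
  |> :erlang.binary_to_term()
  |> Enum.map(fn {id, value} -> {Map.fetch!(id_fields, id), value} end)
end

# Renaming :name to :full_name only touches the metadata map; the
# already-written values never need to be rewritten.
value = encode.([id: 1, name: "Alice", balance: 100])
[id: 1, name: "Alice", balance: 100] = decode.(value)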

While rich declarative queries do provide a better API for developers, a lot of the value I personally get from developing an app on FDB comes from a deep understanding of the data and confidence that I know precisely what the performance characteristics of any given query are, even if that means giving up an expressive query syntax. Other projects do that sort of thing much better than I could anyway!

I do understand all your points wrt query UX. They are solid points. I just feel that the juice isn’t worth the squeeze for me. Instead, I get more excited about exploring other data retrieval techniques such as vector similarity search. That being said, I do welcome contributions!

EctoFDB uses GetMappedRange in the Default Indexer.

Yes, impossible within a single FDB transaction. EctoFDB does provide a multi-transaction migration for index creation. It works based on my testing but admittedly it needs more focused testing. And again, there doesn’t currently exist any schema migration. This is something I’ve been putting off that needs to be addressed. :sweat_smile:

The Apple Record Layer also has a solution for this, at great effort. YouTube talk about it (Sorry to link another YouTube video, but that’s where a lot of this info lives, sadly)

Actually no, by storing the Keyword list in the value, we obviate the need to store the schema details in the database. Yes, this is limiting, but the simplicity is quite nice. We do however store the index details in the tenant – just not the schema details.


If any readers have made it this far, I assume there are many recoiling in horror at this point. :laughing: EctoFDB will probably never fulfill all your database needs, but for some outside-the-box use cases, I think it can be useful.

1 Like

This trick works for building indexes because you can set them to write-only until they’re complete and the indexing operation is idempotent. The same trick would not work (naively) for a schema migration (column rename) because when you read a record you won’t know whether it has been migrated yet. You could use indirection to block the entire table until the migration was complete, but again: our “table” could be hundreds of terabytes, so the idea of rewriting the table at all is out of the question. The obvious solution is to use indirection to store the column names in a schema, which is also what the Record Layer does (via protobufs).

You could probably also do something where you instead store the schema version a record was written at and migrate them on the fly, but I think that’s both more complicated and throws away the nice compression gains from integer column ids, so why bother?

Yeah, the index metadata was actually what I was referring to there - I see how that was unclear. If you want schema migrations I don’t see how you will be able to avoid storing the schema in each tenant.

I believe Apple actually uses another layer of indirection where they store a reference to a schema (which is stored in another tenant or cluster which stores schemas). It makes no difference where it’s stored because it’s cached using the metadataVersion key either way, and that key is “read” in every transaction.

I wouldn’t under-sell yourself here :slight_smile: FDB is a very special database because it scales horizontally with strict serializability and it’s nearly indestructible and it’s permissively open-source. The biggest downside is that it’s not user-friendly, and that’s the role you’re filling. I think people would find that very useful, especially because Elixir apps are meant to “scale” and existing databases are often the bottleneck.

I’m actually quite curious what you have in mind for this. Apple managed to build FTS indexes on FDB, but I don’t think they’ve done a vector index yet.

I’ve actually been thinking about how one would implement vector search on FDB’s KV model and it’s a bit tricky. IVFFlat is no good because it can’t be updated incrementally. At that point there’s little reason to bother with storing it in FDB because you can’t keep it up to date transactionally anyway.

HNSW supports incremental updates, but the graph approach is essentially intractable for anything except an in-memory database. The sequential retrievals would get demolished by FDB’s read latency, I think. I know there are some HNSW variants designed for disk but I’m skeptical of them (though I’m no expert here, to be clear).

I think the most promising approach, which has been gaining some attention lately, is to quantize the vectors heavily (1bit) and then use SIMD to brute-force search them. If they’re 1bit quantized the distance operation literally becomes xor(v1, v2) |> popcnt(), which can be really fast. I believe Nx actually supports those operations so I’m curious what the performance would look like.

But most importantly it would be easy to build an index in FDB which stores /index/quantized_vector/primary_key and then do a reverse lookup to find the records after the quantized search. Then you can do reranking with the full vectors or with an LLM.
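A small sketch of those pieces in plain Elixir (no Nx; the module, index name, and key layout are just illustrative, assuming erlfdb’s tuple layer for packing):

defmodule QuantizedVectorSketch do
  import Bitwise

  # Hamming distance between two 1-bit-quantized vectors stored as binaries:
  # XOR byte by byte, then count the set bits.
  def hamming(<<>>, <<>>), do: 0

  def hamming(<<a, rest_a::binary>>, <<b, rest_b::binary>>) do
    popcount(bxor(a, b)) + hamming(rest_a, rest_b)
  end

  # Illustrative index key shaped like {index, quantized_vector, primary_key},
  # packed with erlfdb's tuple layer; a range read over the index prefix then
  # yields the quantized vectors plus the primary keys for the reverse lookup.
  def index_key(quantized_vector, primary_key) do
    :erlfdb_tuple.pack({"vector_idx", quantized_vector, primary_key})
  end

  defp popcount(0), do: 0
  defp popcount(n), do: (n &&& 1) + popcount(n >>> 1)
end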

I will note that the quantized approach only works well with large vectors, which makes sense: if the dimensionality is too low then too little information survives the quantization.

This is very interesting work. Right after FoundationDB was open-sourced I tried to do something similar and created fdb and experimented with fdb_layer. Unfortunately, I couldn’t convince any of my previous employers to use it, so it ran out of steam after some time.

I see a lot of parallels, good luck.

1 Like