lucaong
CubDB, a pure-Elixir embedded key-value database
Hello Elixir and Nerves community,
I have been working for a while on an open-source embedded key-value database for Elixir, which I called CubDB. I use it in several IoT projects that I run on Nerves, where I need to store large-ish amounts of data locally on the device.
I am already using it in production, but before I release version 1.0.0 I would love some feedback from the Nerves (and Elixir) community.
You can find the CubDB repository here
And the API documentation here
A quick basic usage example:
```elixir
{:ok, db} = CubDB.start_link("my/data/directory")

CubDB.put(db, :foo, "some value")
#=> :ok

CubDB.get(db, :foo)
#=> "some value"

CubDB.delete(db, :foo)
#=> :ok

CubDB.put(db, {:keys, "can", :be, 'anything'}, ["and", :values, 'too'])
#=> :ok

# Check out docs for advanced usage with select/3 and get_and_update_multi/4
```
I know that Elixir comes with ETS/DETS and Mnesia, but:
- ETS is not persistent across reboots
- DETS does not offer sorted collections, and is thus not ideal when one needs to select arbitrary ranges of keys, iterate in order, etc.
- Mnesia is great, but on embedded projects I don’t need distribution
- Sometimes I really just need a “persisted map”, sorted by key
- It’s nice to be able to back up the whole DB by just copying one file
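To make the "sorted by key" point concrete, here is a minimal sketch of a range query. It assumes the CubDB 1.x API from Hex, where select returns {:ok, result} (other versions may return a different shape), and it writes to a throwaway temporary directory. The integer timestamps and readings are made up for illustration.

```elixir
# Assumption: CubDB 1.x from Hex; other versions may expose a different select API.
Mix.install([{:cubdb, "~> 1.0"}])

# Throwaway data directory, so the example does not touch real data.
data_dir = Path.join(System.tmp_dir!(), "cubdb_range_#{System.unique_integer([:positive])}")
{:ok, db} = CubDB.start_link(data_dir: data_dir)

# Hypothetical sensor readings keyed by an integer timestamp.
:ok = CubDB.put_multi(db, %{100 => 20.5, 200 => 21.0, 300 => 21.4, 400 => 22.1})

# Entries come back sorted by key; both bounds are inclusive by default.
{:ok, readings} = CubDB.select(db, min_key: 200, max_key: 300)
IO.inspect(readings)

# The same range, iterated from the highest key down.
{:ok, newest_first} = CubDB.select(db, min_key: 200, max_key: 300, reverse: true)
IO.inspect(newest_first)
```

With DETS, the same query would mean scanning the whole table and sorting the matches in application code.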
The use case I am primarily targeting is the one described in this blog post by the Nerves team: https://embedded-elixir.com/post/2017-09-22-using-ecto-and-sqlite3-with-nerves/
CubDB is somewhat similar to SQLite in that it stores the data locally in a single file, but it is written in Elixir, is key-value and schema-less, and both keys and values can be any Elixir (or Erlang) terms, so no serialization/de-serialization is needed.
The data structure it uses is an append-only immutable B-tree, inspired by CouchDB: that guarantees robustness to data corruption (no in-place mutation), and enables features like concurrent read operations that do not block writes, and atomic transactions.
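As a rough illustration of the atomic transactions mentioned above, here is a sketch using get_and_update_multi. It assumes the CubDB 1.x API from Hex; the stock and reserved keys are hypothetical. The return value of the call is deliberately not matched, since its exact shape may differ between versions.

```elixir
# Assumption: CubDB 1.x from Hex.
Mix.install([{:cubdb, "~> 1.0"}])

# Throwaway data directory, so the example does not touch real data.
data_dir = Path.join(System.tmp_dir!(), "cubdb_tx_#{System.unique_integer([:positive])}")
{:ok, db} = CubDB.start_link(data_dir: data_dir)

# Hypothetical counters that must always change together.
:ok = CubDB.put_multi(db, %{stock: 10, reserved: 0})

# Read both keys and write both new values in one atomic step.
# The function returns {result, entries_to_put, keys_to_delete}.
_result =
  CubDB.get_and_update_multi(db, [:stock, :reserved], fn %{stock: s, reserved: r} ->
    {:moved, %{stock: s - 3, reserved: r + 3}, []}
  end)

stock = CubDB.get(db, :stock)
reserved = CubDB.get(db, :reserved)
IO.inspect({stock, reserved})
```

Either both writes become visible or neither does, so a reader never observes the stock decremented without the reservation being recorded.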
It was already a lot of fun for me to develop it, but I would love to hear your constructive feedback.
What do you think about it? Do you have a use-case where this could be useful? Do you have feedback about the API?
Thanks in advance
lucaong
Also, as this thread is now updated again, I take the chance to announce that the release candidate for CubDB 1.0.0 is now on Hex as v1.0.0-rc.1. It is the result of running CubDB in production on embedded devices for the past year, and introduces a few improvements that make the API more solid. The notable changes are:
- The database is 100% compatible with the previous releases (I have no intention to break compatibility there).
- Auto compaction and auto file sync are now the defaults: I decided to go for safer defaults that should be good for the vast majority of cases, and let users tune them for maximum performance in special cases. More info about compaction and file sync is here
- The timeouts (for example in select or get_and_update_multi) are now enforced on the callee side too, freeing up resources immediately when a timeout elapses. Their API is also slightly different, as timeout is now passed as an option, instead of a separate argument.
For most users, the changes needed to update to v1.0.0(-rc.x) are minimal: just review whether the new defaults are ok for you (or set them explicitly), and, only if you are using explicit timeouts, adapt calls to select and get_and_update_multi. I will write a proper release post on this forum when 1.0.0 is out, but I wanted to give advance notice to people who are reading this thread and who expressed interest in CubDB.
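A sketch of what those two adjustments might look like, assuming the 1.x API published on Hex: the auto_compact and auto_file_sync start options are set explicitly rather than relying on the defaults, and timeout is passed to select as an option. The keys, values, and the 5 second timeout are made up for illustration.

```elixir
# Assumption: CubDB 1.x from Hex.
Mix.install([{:cubdb, "~> 1.0"}])

# Throwaway data directory, so the example does not touch real data.
data_dir = Path.join(System.tmp_dir!(), "cubdb_opts_#{System.unique_integer([:positive])}")

# Spell out the 1.0 defaults instead of relying on them.
{:ok, db} = CubDB.start_link(data_dir: data_dir, auto_compact: true, auto_file_sync: true)

:ok = CubDB.put_multi(db, %{a: 1, b: 2, c: 3})

# The timeout (in milliseconds) is one of the options, not a separate argument.
{:ok, entries} = CubDB.select(db, min_key: :a, max_key: :b, timeout: 5_000)
IO.inspect(entries)
```

Setting either start option to false trades some safety for write throughput, which is the tuning mentioned above for special cases.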
lucaong
Hi @wolfiton,
thanks for your question. I really admire the engineering work of the Redis author, Salvatore Sanfilippo, so it’s nice for me to see CubDB and Redis discussed in the same context.
That said, Redis and CubDB have considerably different goals and characteristics, and I think the overlap in use cases is quite small. I will try to clarify that a bit:
- Redis is a “data structure server”, to which one connects over the network. It keeps data primarily in memory to ensure very fast operations, and uses the disk to recover after a restart. It offers several different data structures (maps, lists, sorted sets, streams, etc.) and is agnostic about the programming language used by the user. So, it is shared (multiple apps/instances can connect to one Redis db) and very fast, but data must fit in memory. Common use cases for Redis are: shared in-memory cache, shared data structures for queues or parallel computation, shared locks.
- CubDB is an embedded database, so it runs “inside” your application, with no network connection. It works sort of like a map, but persisted on disk (plus all the sorted lookup operations). It can be used directly only from Elixir or Erlang, but has the convenience of having zero dependencies and storing native Elixir terms without requiring the user to implement serialization/deserialization. It is not shared between different apps/instances (unless you implement a server layer on top of it yourself). It stores data primarily on disk, so it can store more data than can fit in memory. It’s designed for robustness in case of power failures, and for simplicity of installation and use from Elixir apps. Primary use cases would be data storage for an embedded application (think Nerves running on a Raspberry Pi), or data storage within one app instance.
Of course, one could build a small server on top of CubDB, and expose its features over a network, achieving something comparable to Redis maps. That would be a nice project
Right now I am working on the core, and focusing on doing one thing well: a versatile and robust key/value storage. Hopefully that will enable developers to get creative and build more use cases on top of it.
lucaong
After extensive testing on a number of test Nerves devices, I was finally able to identify the issue that @Qqwy reported.
It was a bug with the way the most recent database file is chosen, in cases when a restart happens right after a compaction, but before the old file is cleaned up, and CubDB sees more than one database file. The wrong file was chosen, leading to the new records disappearing.
The issue is solved with the latest release, v0.12.0, which is 100% backward compatible. Thanks a lot @Qqwy for reporting and helping. Version 1.0 is getting closer, thanks to valuable feedback from people in this forum