lucaong

lucaong

CubDB, a pure-Elixir embedded key-value database

Hello Elixir and Nerves community,
I have been working for a while on an open-source embedded key-value database for Elixir, that I called CubDB. I use it for several IoT projects I run using Nerves, where I need to store large-ish amount of data locally to the device.

I am already using it in production, but before I release version 1.0.0 I would love some feedback from the Nerves (and Elixir) community.

You can find the CubDB repository here
And here the API documentation

A quick basic usage example:

{:ok, db} = CubDB.start_link("my/data/directory")

CubDB.put(db, :foo, "some value")
#=> :ok

CubDB.get(db, :foo)
#=> "some value"

CubDB.delete(db, :foo)
#=> :ok

CubDB.put(db, {:keys, "can", :be, 'anything'}, ["and", :values, 'too'])
#=> :ok

# Check out docs for advanced usage with select/3 and get_and_update_multi/4

I know that Elixir comes with ETS/DETS and Mnesia, but:

  • ETS is not persistent across reboots

  • DETS does not offer sorted collections, and is thus not ideal when one needs to select arbitrary ranges of keys, iterate in order, etc.

  • Mnesia is great, but on embedded projects I don’t need distribution

  • Sometimes I really just need a “persisted map”, sorted by key

  • It’s nice to be able to backup the whole DB by just copying one file

The use-cases I am primarily targeting is what described in this blog post by the Nerves team: https://embedded-elixir.com/post/2017-09-22-using-ecto-and-sqlite3-with-nerves/

CubDB is somehow similar to SQLite in which it stores the data locally in a single file, but it is written in Elixir, is key-value and schema-less, and both keys and values can be any Elixir (or Erlang) terms, so no serialization/de-serialization is needed.

The data structure it uses is an append-only immutable B-tree, inspired by CouchDB: that guarantees robustness to data corruption (no in-place mutation), and enables features like concurrent read operations that do not block writes, and atomic transactions.

It was already a lot of fun for me to develop it, but I would love to hear your constructive feedback.

What do you think about it? Do you have a use-case where this could be useful? Do you have feedback about the API?

Thanks in advance :slight_smile:

230 13925 124

Most Liked

lucaong

lucaong

Also, as this thread is now updated again, I take the chance to announce that the release candidate for CubDB 1.0.0 is now on Hex as v1.0.0-rc.1 :rocket: . It is the result of running CubDB in production on embedded devices for the past year, and introduces a few improvements that make the API more solid. The notable changes are:

  • The database is 100% compatible with the previous releases (I have no intention to break compatibility there).
  • Auto compaction and auto file sync are now the defaults: I decided to go for safer defaults, that should be good for the vast majority of cases, and let users tune it for maximum performance in special cases. More info about compaction and file sync are here
  • The timeouts (for example in select or get_and_update_multi) are now enforced on the callee side too, freeing up resources immediately when a timeout elapses. Their API is also slightly different, as timeout is now passed as an option, instead of a separate argument.

For most users, the changes needed to update to v1.0.0(-rc.x) are minimal: just review if the new defaults are ok for you (or set them explicitly), and, only in case you are using explicit timeouts, adapt calls to select and get_and_update_multi. I will write a proper release post on this forum when 1.0.0 is out, but I wanted to inform in advance people that are reading this thread and that expressed interest in CubDB.

lucaong

lucaong

Hi @wolfiton,
thanks for your question. I really admire the engineering work of the Redis author, Salvatore Sanfilippo, so it’s nice for me to see CubDB and Redis discussed in the same context.

That said, the Redis and CubDB have considerably different goals and characteristics, and I think the overlap in use-case is quite small. I will try to clarify that a bit:

  • Redis is a “data structure server”, to which one connects over the network. It keeps data primarily in memory to ensure very fast operations, and uses the disk to recover after a restart. It offers several different data structures (maps, list, sorted sets, streams, etc.) and is agnostic about the programming language used by the user. So, it is shared (multiple apps/instances can connect to one Redis db), very fast, but data must fit in memory. Common use-cases for Redis are: shared in-memory cache, shared data-structure for queues or parallel computation, shared locks.

  • CubDB is an embedded database, so it run “inside” your application, with no network connection. It works sort of like a map, but persisted on disk (plus all the sorted lookup operations). It can be used directly only by Elixir or Erlang, but has the convenience of having zero dependencies and storing native Elixir terms without requiring the user to implement serialization/deserialization. It is not shared between different apps/instances (unless you implement yourself a server layer on top of it). It stores data primarily on disk, so it can store more data that can fit in memory. It’s designed for robustness in case of power failures, and simplicity to install and use from Elixir apps. Primary use cases would be data storage for an embedded application (think Nerves running on a Raspberry Pi), or data storage within one app instance.

Of course, one could build a small server on top of CubDB, and expose its features over a network, achieving something comparable to Redis maps. That would be a nice project :slight_smile:

Right now I am working on the core, and focusing on doing one thing well: a versatile and robust key/value storage. Hopefully that will enable developers to get creative and build more use cases on top of it.

lucaong

lucaong

After extensive testing on a number of test Nerves devices, I was finally able to identify the issue that @Qqwy reported.

It was a bug with the way the most recent database file is chosen, in cases when a restart happens right after a compaction, but before the old file is cleaned up, and CubDB sees more than one database file. The wrong file was chosen, leading to the new records disappearing.

The issue is solved with the latest release, v0.12.0, which is 100% backward compatible. Thanks a lot @Qqwy for reporting and helping. Version 1.0 is getting closer, thanks to valuable feedback from people in this forum :slight_smile:

Where Next?

Popular in Discussions Top

WolfDan
After doing a port from a c++ library to my project in phoenix I’ve seen that I need a faster way to run this algorithm and I found this ...
New
AstonJ
Are there any Elixir or Erlang libraries that help with this? I’ve been thinking how streaming services like twitch have exploded recentl...
New
AstonJ
If a newbie asked you about Phoenix Contexts, how would you explain the basics to them? Feel free to be as concise or in-depth as you li...
New
Crowdhailer
I’ve been hearing much about the new formatter and it’s something I have been keen to try. I find examples buy far the most illuminating...
248 19204 150
New
marciol
Please, let me know if this kind of discussion already took place in another topic . Hi all, how do you consider if is better to build ...
New
sergio
There’s a new TIOBE index report that came out that shows Elixir is still not in the top 50 used languages. It also goes on to call Elix...
New
boundedvariable
I am going through the kafka architecture. All the features what the kafka is providing are already in Erlang. I would like hear your opi...
New
hazardfn
I suppose this question is effectively hackney vs. ibrowse but we are at a point in our project where we have to make a choice between th...
New
und0ck3d
Hello everyone! A few days ago I’ve created a topic here about how people were creating CMSs with Elixir and Phoenix. I’ve been studying...
New
RudManusachi
What configs will make sense to put to runtime.exs? – A bit of how I configure apps: I have generic configs in config/config.exs, dev...
New

Other popular topics Top

lessless
I believe there are people here who are dealing with CSV files import on the daily basis, and since Excel is a really popular tool there ...
New
fireproofsocks
Forgive me if this is obvious, but how does one delete a database record WITHOUT selecting it first? Ecto.Repo — Ecto v3.14.0 has exampl...
New
JakeBecker
TL;DR: I’ve just released an implementation of Microsoft’s IDE-independent Language Server Protocol for Elixir. It adds language support ...
1144 53690 245
New
chrismccord
This release brings a number of exciting features, including integration with the new Phoenix LiveDashboard and Phoenix LiveView. There h...
New
alice
Hey, Just curious what are the main benefits of Elixir compared to Clojure? When is Elixir more useful than Clojure and vice versa? Th...
New
grych
Hi folks, Few months ago I have announced the proof-of-concept of the library to manipulate the browsers DOM objects directly from Elixi...
639 52341 488
New
jason.o
In the code below, if the create action is not set to accept “extra_key” as an input, it errors out with a message shown above. Is there ...
New
nsuchy
Hi. I’ve noticed that Windows Powershell has it’s own IEX command and you cannot access Elixir’s IEX due to the conflict. This isn’t a cr...
New
Qqwy
Update: How to use the Blogs & Podcasts section You can post links to your blog posts or podcasts either in one of the Official Blog...
3271 126479 1222
New
PeterCarter
There are pre-rolled solutions for other frameworks that do work. However, Phoenix does not seem to have these. Have people had good expe...
New

We're in Beta

About us Mission Statement