CubDB is an embedded database written in pure Elixir, designed for robustness and minimal use of resources. It strives to be as developer-friendly as possible.
Some of you, especially in the Nerves community, have already known and used CubDB for a while, and might have read the original thread about CubDB on the Nerves Forum. Version v2.0.0 is now published on Hex, with big improvements and exciting new features, so I thought it was a good time for a proper library post here.
Since this is a long-ish post, here’s a table of contents:
- Why CubDB?
- How does it compare to ETS, DETS, Mnesia, SQLite, etc.?
- What’s new in v2.0.0?
- How does it look in code?
Why CubDB?
CubDB is an embedded database written in Elixir. It runs inside your application, as opposed to on a separate server, and saves its data in a local file. In this respect, it is similar to SQLite, but offers an idiomatic Elixir API.
It is NOT a replacement for Postgres for multi-instance web applications, nor a distributed database, but rather a solution for cases when a lightweight but robust local data store is needed.
Typical use cases are applications running on embedded devices (CubDB runs well on Nerves), desktop applications, or applications running locally. CubDB is often used to persistently store data and configuration, as a data logging or time series store, or to persist the state of an application.
Some of the features of CubDB are:
- Basic key/value access, and selection of sorted ranges of entries.
- Both keys and values can be any Elixir (or Erlang) term.
- ACID transactions to perform atomic changes.
- Multi-version concurrency control (MVCC), allowing concurrent reads that neither block nor are blocked by writes.
- Unexpected shutdowns or crashes won’t corrupt the database, nor break atomicity of transactions.
- Manual or automatic compaction to reclaim disk space.
How does it compare to ETS, DETS, Mnesia, SQLite, etc.?
The FAQ section in the documentation has a chapter about this.
What’s new in v2.0.0?
Head to the CHANGELOG for more information, but in short:
- Vastly improved concurrency
- Improved CubDB.select function, which now returns lazy streams, allowing any custom composition of Stream and Enum functions
- Atomic transactions with arbitrary operations with CubDB.transaction and the CubDB.Tx module
- Zero-cost immutable snapshots with CubDB.with_snapshot and the CubDB.Snapshot module
- CubDB.back_up for creating database backups
This major version comes with some backward-incompatible changes, so refer to the upgrade guide for how to upgrade from v1.1.0 to v2.0.0. The data format is completely compatible across these major versions though, so you can upgrade and downgrade your code without needing to migrate data.
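As a taste of the new backup function from the list above, here is a minimal sketch. It assumes CubDB v2 is available (fetched here with Mix.install), that the target directory does not exist yet, and that a finished backup can be opened like any other data directory. The directory names are made up for the example.

```elixir
Mix.install([{:cubdb, "~> 2.0"}])

# Hypothetical scratch directories, wiped so the example starts clean
data_dir = Path.join(System.tmp_dir!(), "cubdb_backup_example")
backup_dir = Path.join(System.tmp_dir!(), "cubdb_backup_example_copy")
File.rm_rf!(data_dir)
File.rm_rf!(backup_dir)

{:ok, db} = CubDB.start_link(data_dir: data_dir)
:ok = CubDB.put(db, :answer, 42)

# Write a backup into a directory that does not exist yet
:ok = CubDB.back_up(db, backup_dir)

# Open the backup as a database in its own right and read the entry back
{:ok, copy} = CubDB.start_link(data_dir: backup_dir)
restored = CubDB.get(copy, :answer)
```

Opening the copy and reading from it is a cheap way to check that a backup is usable before relying on it.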
How does it look in code?
Start a CubDB database process by providing a directory to store its data:
{:ok, db} = CubDB.start_link(data_dir: "some/data/directory")
Basic key/value access
Key/value access works as you probably expect:
CubDB.put(db, :some_key, "some value")
#=> :ok
CubDB.get(db, :some_key)
#=> "some value"
CubDB.delete(db, :some_key)
#=> :ok
Both keys and values can be arbitrary Elixir (or Erlang) terms, such as scalars, tuples, maps, structs, and really anything else:
CubDB.put(db, {:users, 123}, %User{id: 123, name: "Andrea"})
#=> :ok
CubDB.get(db, {:users, 123})
#=> %User{id: 123, name: "Andrea"}
Selection of sorted ranges
Selection of sorted ranges is done with CubDB.select, and returns a lazy stream that can be passed to functions in Stream and Enum. Data is fetched lazily, only when the stream is iterated or otherwise run:
require Integer # Integer.is_even/1 is a macro, so the module must be required
# Put several entries atomically
CubDB.put_multi(db, [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8])
# Get the sum of even values with keys from :b to :g (both included)
CubDB.select(db, min_key: :b, max_key: :g) # select entries with keys from :b to :g
|> Stream.map(fn {_key, value} -> value end) # discard the key and keep only the value
|> Stream.filter(fn value -> is_integer(value) && Integer.is_even(value) end) # filter only even integers
|> Enum.sum() # sum the values
#=> 12
Thanks to the fact that all Elixir terms have a well defined order, CubDB can be used to store and select multiple collections in the same database, akin to SQL tables.
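One way to picture this is to use tuple keys whose first element names the collection. The sketch below is only an illustration: the :users and :posts collections, the integer ids, and the scratch directory are all made up. It relies on the Erlang term order, where numbers sort before atoms, so {:users, 0} and {:users, nil} bracket every {:users, id} key with a non-negative integer id.

```elixir
Mix.install([{:cubdb, "~> 2.0"}])

# Hypothetical scratch directory, wiped so the example starts from an empty database
data_dir = Path.join(System.tmp_dir!(), "cubdb_collections_example")
File.rm_rf!(data_dir)
{:ok, db} = CubDB.start_link(data_dir: data_dir)

# Two "collections" share one database; the first tuple element names the collection
:ok =
  CubDB.put_multi(db, [
    {{:users, 1}, "Ada"},
    {{:users, 2}, "Grace"},
    {{:posts, 1}, "Hello"},
    {{:posts, 2}, "World"}
  ])

# Numbers sort before atoms, so this range covers all {:users, id} keys
# with a non-negative integer id, and nothing from :posts
users =
  db
  |> CubDB.select(min_key: {:users, 0}, max_key: {:users, nil})
  |> Enum.to_list()
#=> [{{:users, 1}, "Ada"}, {{:users, 2}, "Grace"}]
```

The same pattern with {:posts, 0} and {:posts, nil} selects only the posts.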
Atomic transactions
Multiple operations can be performed atomically using the CubDB.transaction function and functions in the CubDB.Tx module:
# Swapping `:a` and `:b` atomically:
CubDB.transaction(db, fn tx ->
  a = CubDB.Tx.get(tx, :a)
  b = CubDB.Tx.get(tx, :b)
  tx = CubDB.Tx.put(tx, :a, b)
  tx = CubDB.Tx.put(tx, :b, a)
  {:commit, tx, :ok}
end)
#=> :ok
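A transaction does not have to commit. As a rough sketch, the function passed to CubDB.transaction can return {:cancel, result} instead, in which case nothing is written. The transfer helper, the :alice and :bob keys, and the scratch directory below are hypothetical, chosen only to show both outcomes.

```elixir
Mix.install([{:cubdb, "~> 2.0"}])

# Hypothetical scratch directory, wiped so the example starts from an empty database
data_dir = Path.join(System.tmp_dir!(), "cubdb_transaction_example")
File.rm_rf!(data_dir)
{:ok, db} = CubDB.start_link(data_dir: data_dir)
:ok = CubDB.put_multi(db, alice: 10, bob: 0)

# Hypothetical helper: move `amount` between two keys, or change nothing at all
transfer = fn from, to, amount ->
  CubDB.transaction(db, fn tx ->
    from_balance = CubDB.Tx.get(tx, from, 0)
    to_balance = CubDB.Tx.get(tx, to, 0)

    if from_balance >= amount do
      tx = CubDB.Tx.put(tx, from, from_balance - amount)
      tx = CubDB.Tx.put(tx, to, to_balance + amount)
      {:commit, tx, :ok}
    else
      # Returning :cancel discards the transaction without writing anything
      {:cancel, {:error, :insufficient_funds}}
    end
  end)
end

ok_result = transfer.(:alice, :bob, 7)
#=> :ok
failed_result = transfer.(:alice, :bob, 7)
#=> {:error, :insufficient_funds}
balances = CubDB.get_multi(db, [:alice, :bob])
#=> %{alice: 3, bob: 7}
```

Because the check and the two writes happen inside one transaction, no other writer can change either balance between the read and the commit.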
Alternatively, all the ..._multi functions perform their operations atomically.
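For simple batches, these functions are often all that is needed. Here is a small sketch using put_multi, get_multi and put_and_delete_multi, with made-up keys and a scratch directory:

```elixir
Mix.install([{:cubdb, "~> 2.0"}])

# Hypothetical scratch directory, wiped so the example starts from an empty database
data_dir = Path.join(System.tmp_dir!(), "cubdb_multi_example")
File.rm_rf!(data_dir)
{:ok, db} = CubDB.start_link(data_dir: data_dir)

# Write three entries in one atomic step
:ok = CubDB.put_multi(db, a: 1, b: 2, c: 3)

# Read several keys at once; keys that are not in the database are left out of the map
found = CubDB.get_multi(db, [:a, :b, :missing])
#=> %{a: 1, b: 2}

# Write :d and delete :a in one atomic step
:ok = CubDB.put_and_delete_multi(db, [d: 4], [:a])

remaining = db |> CubDB.select() |> Enum.to_list()
#=> [b: 2, c: 3, d: 4]
```

When a write depends on a value read earlier, as in the swap above, a transaction is the safer choice.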
Zero-cost immutable snapshots
If you need to ensure consistency when reading multiple values, but do not need to perform any writes, there is a better alternative to transactions that won’t block writes: zero-cost immutable snapshots. With CubDB.with_snapshot you can perform several read/select operations isolated from concurrent writes, but without blocking them. Think of this as the immutability of Elixir data structures, but in a database:
# the key of y depends on the value of x, so we ensure consistency by getting
# both entries from the same snapshot, isolating from the effects of concurrent
# writes
{x, y} = CubDB.with_snapshot(db, fn snap ->
  x = CubDB.Snapshot.get(snap, :x)
  y = CubDB.Snapshot.get(snap, x)
  {x, y}
end)
Head to the API documentation for more information.
I hope you enjoy CubDB as much as I do.