andzdroid
Question about memory copy
I saw this bit of text on the cubdb repo’s readme (GitHub - lucaong/cubdb: Elixir embedded key/value database · GitHub):
Snapshots come at no cost: nothing is actually copied or written on disk or in memory, apart from some small internal bookkeeping.
In relation to this code:
{x, y} = CubDB.with_snapshot(db, fn snap ->
x = CubDB.Snapshot.get(snap, :x)
y = CubDB.Snapshot.get(snap, x)
{x, y}
end)
I looked at the with_snapshot function and all it does is make a genserver call. Can somebody explain how this doesn’t copy anything in memory?
def snapshot(db, timeout \\ 5000) do
GenServer.call(db, {:snapshot, timeout}, :infinity)
end
My understanding of elixir was that messages across processes are always copied. Is my understanding wrong? Am I missing something?
Marked As Solved
LostKobrakai
That expects that “the snapshot” is “the data you store in the db”. It probably isn’t. It’s likely just a pointer or reference to the actual data stored elsewhere.
Also Liked
lucaong
@LostKobrakai is essentially right.
CubDB uses an append-only Btree: changes are appended to the database file, like in a log file. Basically, every new change causes a new Btree root to be appended to the database file. The root points at the old nodes as well as to some new ones containing the data that changed. Existing Btree nodes are never updated in place. It works very similarly to an immutable data structure, like the ones used by Elixir. This means that if one has a reference to an old Btree root, they can traverse the tree and see the database exactly as it was when that root was the current one.
Obtaining a snapshot then is as simple as holding a reference to a Btree root: new changes can be written to the database, but they won’t change what is pointed by that old root. Therefore, there is no database copying, as in CubDB does not have to make a copy of the database to give you an immutable snapshot. The Btree root is a small data structure, only holding a reference to other nodes down the tree: it doesn’t contain any data. Taking a snapshot is a very fast O(1) operation. In fact, it is internally used on every read operation, to ensure consistency in presence of concurrent writes.
The bookkeeping mentioned is simply because occasionally, in order to avoid growing the database file indefinitely as more changes accumulate, CubDB performs a compaction, which discards the data which is old and not referenced anymore. CubDB has to keep track of what’s referenced by the existing snapshots to avoid removing it, a bit like a garbage collector.
The details of compaction are a bit more involved, as compaction can proceed in parallel with reads and writes, but in essence when you take a snapshot CubDB does not copy any data: it merely remembers which Btree root your snapshot points to.
When you read from the snapshot, of course data will be copied from disk into memory. But that’s just the data you read, not the whole database. And, again, that is when you read from the snapshots, not when you take a snapshot.
I hope this clarifies it ![]()
LostKobrakai
I’d imagine that in the grand scheme of things the tree is probably tiny compared to what would need to be moved around if values would be affected – especially when the db reaches sizes where you actually start caring for it not being copied.
lucaong
Also, more clarifications regarding @joey_the_snake replies:
If it was a pointer or a reference then it wouldn’t really be a snapshot because then you could see other people’s modifications. Otherwise it would have to be a pointer or a reference to a copy.
No: it is a reference to the original data, a simple pointer, but it is isolated from other updates. The trick is that the original data is immutable, it cannot be changed by other updates. Instead, updates create a new Btree root that points to all the data that did not change, plus the part that changed. It is the same principle underlying immutable data structures with structural sharing (like those in Elixir, Clojure, etc.)
Looking at the source, it seems like a tree structure is copied to the caller. And this tree structure is walked when trying to read particular keys.
Not the whole tree, just the root, which is a small data structure. And yes, the tree is walked when accessed, but it is not a copy: it is the original (immutable) tree, but accessed from the snapshot root.
Yeah true. I do see the OP’s point though. I wouldn’t necessarily expect the full tree structure, even if it doesn’t have all the data, based on the description.
It’s not the full tree. You can think about it with an analogy with Elixir immutable data structures:
# Say we have a map, which in our analogy is the database
m = %{ foo: 1, bar: 2, baz: [1, 2, 3] }
# I take a "snapshot" by saving a reference to the current map
s = m
# Note that nothing was copied above. Simply, a new reference to the
# same (immutable) data structure is created
# This "snapshot" is isolated from changes:
m = Map.put(m, :bar, 3)
m # => %{ foo: 1, bar: 3, baz: [1, 2, 3] }
s # => %{ foo: 1, bar: 2, baz: [1, 2, 3] }
Note the part that is in common between s and m was not copied: both s and m point to the same values (for example the key :baz points to the same list in m and s, not to two identical copies). Yet, because maps are immutable in Elixir, s does not see changes. Simply, m tracks the latest updates, while s keeps pointing to a certain version.
As soon as s is not in scope anymore, the garbage collector can then clean up the old data that is not reachable anymore from m.
CubDB works in much the same way, just on disk instead than in memory, and therefore with some different implementation details to account for the difference performance characteristics of disk vs. memory.
I just found this nice explanation of how an append-only Btree works, which corresponds to how CubDB works internally. On top of that, CubDB uses a compaction process to “garbage collect” disk space and avoid growing indefinitely with each update.
I hope this clarifies it further ![]()








