Mnesia vs Cassandra (vs CouchDB vs ...) - your thoughts?

distributed-systems
mnesia
cassandra
couchdb
distributed-database

#1

Looking at the stacks that existing large companies have used, WhatsApp internally uses Mnesia to store the messages, while Discord uses Cassandra.

Also, in the past we’ve had quite a long discussion about ‘When to use Mnesia’ (vs. a traditional relational database), but this was about a year ago, and maybe there have been changes to the Mnesia Ecto bindings or similar?


What I already know:

  1. Cassandra runs on Java, Mnesia runs inside your BEAM.
  2. Both are distributed databases.
  3. Mnesia does not do conflict resolution (although you can write your own, and the packages unsplit and [reunion](https://github.com/snar/reunion exist to do this for you). Cassandra always uses last-write-wins for conflict resolution.

Ecto-less migrations?
#2

I think it’s moving to “ForgETS” or something like that which is their reimplementations of mnesia on top of ets.

I’d use scylladb for storing unimportant data like messages.

I particularly liked this talk about scylla https://www.youtube.com/watch?v=4BVdZw9uUMA

But I haven’t used it in production yet though.


#3

I don’t think anything in particular has changed over the year in regards to mnesia.

Things you need to know:

  • Although WhatsApp use mnesia they don’t use it in the “normal” way. They do async dirty transactions and they have their own replication between nodes (they have two nodes per partition). It is also used for temporary type of data.
  • Keep in mind, that unless you go with one of the new mnesia backends (such as leveldb) you must hold all your data in memory.
  • mnesia supports transactions (cassandra has light-weight transactions)
  • They are distributed but they do differ how this is done. Cassandra is more (and easier) scalable than mnesia.

Overall, if your data is more ephemeral in nature mnesia is a good choice, you need transactions. If you have large amount of data and the need to scale a lot cassandra would be better.


#4

You have big company behind Cassandra https://www.datastax.com/ with theirs tools, Apache Sorl integration (search engine)
Also Casandra has good integration with big data eco system like Apache Spark, Apache Kafka.


#5

No, mnesia has always had the ability to store data on disk. You just specify on which nodes you want disc copies. The only earlier limitation has been that the disk files must not be larger than 2 Gb but you got around this by partitioning your tables.


#6

Yes, you are right. For small datasets it is doable. On the other hand if they are comparing it with cassandra I just assumed they had “a lot” of data.

In my experience the dets backend is not feasible. The small amounts of data it can keep and the overhead of managing table partitioning is not really worth it if you have large amounts of data. Just assume you have a dataset of 1TB, that in itself means 500 partitions and to cater for growth you want more than that. And because it performs best when you add partitions in the power of two the next step is 1024 partitions which you need to manage. The time it takes to change the number of partitions is also not very optimized. The process uses temporary ETS tables of bag or duplicated bag with an O(N) behaviour. If you have more than 10-20K keys it will take forever to add a single fragment. (actually I have not tested this with dets, only the disc_copies backend so if it differs I may be wrong here)


#7

Yes, it very much depends on how much data you want manage.


#8

A third option I have came across is to use CouchDB, which also some large parties use. It’s also built on the BEAM, but unfortunately currently does not support running in the same BEAM as your app. The main difference between CouchDB, Mnesia and Cassandra are:

Cassandra

  • runs on the JVM (There is also a drop-in replacement called ‘Scylla’ that runs on C++).
  • It stores data in a relational format (tables with specified columns with specified types)
  • uses SQL(-like; some features are not supported because of the distributed nature) statements to query data.
  • Updates are per-column, so conflicts are also per-column (although lightweight transactions to ensure all-or-nothing updates are also supported).
  • For cassandra to work well, your server’s clocks need to be synchronized.

CouchDB

  • runs on the BEAM. (but including it inside your BEAM instance is not supported)
  • It stores data in a document format,
  • javascript map/reduce functions to query data.
  • conflicts are resolved by you yourself periodically fetching multiple revisions and storing a new conflict-free version; until you run this, last-write-wins.

Mnesia

  • Runs inside your BEAM.
  • It stores data in a set/ordered_set/bag of Elixir/Erlang datatypes.
  • Erlang QLC queries to perform queries.
  • Does not handle conflicts; although libraries such as unsplit and reunion allow you to do this by writing your own conflict-resolution function.

According to the Mnesia documentation, the 2GB limit only applies to disc_only_copies; ram_copies and disc_copies are only affected by the size of available RAM. For practical purposes this still means that >16GB (or maybe >32GB or >64GB, but then we definitely are at the limit) tables need to be partitioned.


To be honest, I still have no clue which database engine I’d like to pick. I might end up going for the one that restricts my system the least for the future because I see no clear disadvantages from picking one over the other.


#9

I start with Mnesia because its just there. No extra installs needed, no mucking with another language (i.e. SQL) to interact with the dbase. Most of the systems I work on don’t do Updates on a row, only Creates so the whole conflict resolution typically don’t come into play.

For longer term storage, past a week in my cases, I would move the data from Mnesia to a RDMS or another type of database. Typically which ever flavor the Sales/Accounts/Finance people are keen on using.


#10

Going with the last few posts, put it all behind a module interface, start with the most basic, which might just be ETS or so in RAM, maybe mnesia later, maybe move to eleveldb or something later, maybe even migrate to PostgreSQL later, all without changing the interface.


#11

I saw your request on the couchdb mailing list, and while it’s “not supported” it’s not very hard to do it either, as this is actually how CouchDB is built internally anyway.

These notes are for 1.7.x which is what I’ve got to hand:

  • create your elixir application as usual and compile it
  • unpack the release into couchdb plugins dir (or /usr/local/lib/couchdb/erlang/lib/... ) for whatever your OS uses
  • add a [daemons] section to your local.ini config with myapp={'Elixir.MyApp', start_link, []} which will ensure it gets started up as part of the main supervision tree of couch
  • set ERL_FLAGS environment variable to use -config /usr/local/etc/couchdb/sys.config to get your custom application settings loaded up
  • look through https://github.com/apache/couchdb/tree/master/src/couch_plugins which has some useful comments and source snippets

I’m reasonably sure this is complete, and while the complexities of 2.x series with cluster support make this moire complicated, the guts of that should remain the same.

At some point, somebody will decide to put a nice tidy shim in between the chttpd “clustered httpd” layer that exposes the soft erlang underbelly of couchdb, and then you’d have native erlang terms accessible to Elixir as well. There are a few not-so-handwavey details to deal with - no maps yet as couch uses proplists (no maps at the time in Erlang, and to support some quirks of JSON), things like structs have no way of being converted back to JSON at the http layer, and a bunch of valid BEAM types that would need to be cleaned when turning them into JSON again etc etc, once the shim is available.

The big issue is that releases for a clustered db are a serious thing, and we tend to update and restart our apps far more often than our DBs. If you’re already co-locating the DB and the app on the same server, only the JSON encoding & recoding is the bottleneck.


#13

So… we’re a lot further in development right now.

Currently, we’re using Mnesia, mostly because it allows us to postpone the choice for another datastore for later.
At first we thought Mnesia might be useful to use as a distributed datastore that would scale to millions of users all on its own, but:

  • The number of nodes in a database cluser is not 1:1 with the number of user-serving application nodes. Separating these into two layers would make the reason to let your database live ‘in’ your application no longer be worth it.

  • Messages in our chat application are broadcasted to all users connected to a chat channel (== a phoenix channel) in parallel to storing something to the datastore: In effect, the datastore is only used to read back in history when (re-)connecting to a channel at a later time. So: Read speed (the extra bit you get from having the database live inside your appiication) is not thát important here.

  • Mnesia does not do many of the things that are important for a large distributed datastore system on its own: It would mean building a lot of custom logic to handle the following features that you’d like in a distributed, fault-tolerant datastore:

  • network partitions resolving. (Mnesia suffers from split-brain, and performing read-time fixes is non-trivial)

  • proper replication/fault tolerance,

  • dataset sharding/consistent hashing etc. (Mnesia has limits to table sizes; table fragmentation is supported, but a very leaky abstraction when used directly; and you have to manually manage the fragments (and e.g. resizing them when new nodes are added)).

  • run-time network topology reshaping (mnesia is completely configurable at runtime, but does not have any strategies on its own).

So, together, these reasons are plentiful enough to keep our eyes open. There’s a high possibility that we’ll switch to either Riak or Cassandra or CouchDB. However:

  • Riak is cool, has great features and interacting with the Erlang-based system feels relatively natural from Elixir. However, Basho (that made Riak) is no longer functioning. What does/will this mean for the longevity of the database, and support etc?
  • Cassandra is supported by a large group of companies, and is e.g. what Discord currently uses in their Chat system (blog post). However, it’s based in Java, and its last-write-wins methodology of handling conflicts (which is non-optional, there is no way to manually resolve conflicts or e.g. build CRDTs) is frequently critiqued (including in Discord’s blog post).

I’ll look into CouchDB more before comparing it with these two again. Anyhow, in the meantime I’d like to hear if anyone has used these systems recently and has some more feedback: Almost all critiques you can find on the internet are multiple years old, and especially now that e.g. Basho is no longer a functioning company, the landscape might have changed somewhat.


#14

Riak. Where support is concern: https://www.erlang-solutions.com/blog/riak-commercial-support-now-available-post-basho.html


#15

Hi there,

Great post, thanks for that. Just to say we here at Erlang Solutions have, together with the other prominent Riak users, taken on the effort of furthering Riak post-Basho. We also provide full Riak support to anyone needing it, and many Riak users have since signed up for the same. As you say, Riak is a great fit, being a natural part of the Beam ecosystem so worth considering in this situation. We will work to keep Riak moving forward and support should not be a concern as we can provide any help you might need in your adoption and use of the datastore.


#16

Hello @Mladen, thank you for your reply! That sounds great.

One thing we are of course still a bit scared about, is that Riak is still running on Erlang R16B2, whereas we’re now at Erlang 21. Is there still someone taking care of e.g. security fixes being applied to the Riak codebase?


#17

BTW you’re welcome to ask about CouchDB directly - contact me, or the couchdb user mailing list. There’s commercial support available, hosted solutions too, and a BEAM-friendly community as well on IRC. See the https://couchdb.org/ website for more info.