Nebulex – a fast, flexible and powerful caching library for Elixir (not only local but also distributed)

Introducing Nebulex

Source
Hex Package
Getting Started
Examples

I like this.

Have been waiting for someone to write a distributed Elixir cache so we don’t have to.

It will be nice to see benchmarking results so we can judge whether it’s feasible to drop the Redis dependency.

The usual challenges in going from a local to a distributed cache come from the fact that you move from a local to a distributed computing problem, with all the challenges of CAP that brings.

@cabol Can you elaborate on how a netsplit is handled? Suppose you have two nodes and a netsplit occurs, and then each node writes a different value to the same key. How is this resolved?

@benwilson512 good point, thanks for your comment/question, and I’ll take advantage of it to clarify some things about Nebulex.

  1. The distributed cache that Nebulex provides is not a key-value store/database like Riak. It is more like Memcached: a local in-memory backend (hash table) runs on each node of your cluster, and on top of that the client/consumer side runs a hash algorithm (preferably a consistent one) to reach any of those in-memory backends, i.e. a sharding topology. Given that, caches like Memcached (AFAIK) have the same issue you describe here, and so does Redis if it is deployed as a cache in the same way as Memcached (using e.g. Twemproxy for the distributed part).

  2. Bringing in the CAP theorem: like any other distributed system, we cannot sacrifice “Partition Tolerance”, so the choice left is between “Availability” and “Consistency”. In Nebulex we went for “Availability”; we could say it is eventually consistent. Let’s try to illustrate what would happen in Nebulex based on your example:

T0: set(1, 1) --> |     |
                  |     |
T1:  1 <-- get(1) |     | get(1) --> 1
                  |     |
                  |--X--| T2: Network Partition
T3: set(1, 3) --> |     |
                  |     |<-- set(1, 4) :T4
                  |     |
T5:  3 <-- get(1) |     | get(1) --> 4
                  |     |
                  |-----| T6: cluster re-established
                  |     |
T7:  3 <-- get(1) |     | get(1) --> 3
                  |     | # it might be either 3 or 4, depending on the hash function,
                  |     | # data can be in one node or another
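
The timeline above can be replayed with a small language-neutral sketch. This is not Nebulex code: the node names `a` and `b`, the `owner` helper, and the plain-dict stores are all hypothetical, and simple modulo hashing stands in for a consistent hash. The only point it tries to show is that after the partition heals, both sides route the key to the same owner again, so they agree on one of the two values.

```python
import hashlib


def owner(key, visible_nodes):
    """Return the node that owns `key` among the nodes the caller can see.

    Modulo hashing is used here for brevity; it is NOT a consistent hash.
    """
    digest = hashlib.sha256(repr(key).encode()).digest()
    return visible_nodes[int.from_bytes(digest[:8], "big") % len(visible_nodes)]


def replay_timeline():
    # One local in-memory store (a plain dict) per node.
    stores = {"a": {}, "b": {}}

    def put(view, key, value):
        stores[owner(key, view)][key] = value

    def get(view, key):
        return stores[owner(key, view)].get(key)

    both = ["a", "b"]
    put(both, 1, 1)                          # T0: cluster healthy
    t1 = (get(both, 1), get(both, 1))        # T1: both sides read 1

    # T2: network partition, each side only sees itself.
    put(["a"], 1, 3)                         # T3: write on node a's side
    put(["b"], 1, 4)                         # T4: write on node b's side
    t5 = (get(["a"], 1), get(["b"], 1))      # T5: the two sides disagree

    # T6: cluster re-established, both sides see both nodes again.
    t7 = (get(both, 1), get(both, 1))        # T7: same owner, same answer
    return t1, t5, t7
```

Which of the two values survives at T7 depends only on which node the hash picks for the key, matching the comment in the diagram.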

Eventually the key will expire, or it will be updated again, or the generation where the key resides will be dropped, forcing the key to be retrieved from the database (main storage) again, so eventually the data will be “consistent” again. Now, if you really need to be CP, or to be able to move between consistency and availability, you probably need something like Riak.
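
To make the "eventually consistent through expiry" idea concrete, here is a minimal sketch, again not Nebulex code. It assumes a per-entry TTL rather than Nebulex's generations, and `TTLCache`, `read_through`, and the fake clock are hypothetical names introduced only for illustration. Two nodes hold divergent values; once the entries expire, both reload from main storage and agree.

```python
class TTLCache:
    """Local cache whose entries expire `ttl` seconds after being written."""

    def __init__(self, ttl, clock):
        self.ttl = ttl
        self.clock = clock      # callable returning the current time
        self.entries = {}       # key -> (value, expires_at)

    def put(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]   # expired: treat as a miss
            return None
        return value


def read_through(cache, key, load):
    """Return the cached value, reloading from main storage on a miss."""
    value = cache.get(key)
    if value is None:               # note: a stored None also counts as a miss
        value = load(key)
        cache.put(key, value)
    return value


def demo():
    now = [0.0]                     # fake clock so the example needs no sleeping
    database = {1: 4}               # main storage holds the authoritative value
    node_a = TTLCache(ttl=60, clock=lambda: now[0])
    node_b = TTLCache(ttl=60, clock=lambda: now[0])

    node_a.put(1, 3)                # divergent writes made during the partition
    node_b.put(1, 4)
    before = (read_through(node_a, 1, database.get),
              read_through(node_b, 1, database.get))

    now[0] += 61                    # let both entries expire
    after = (read_through(node_a, 1, database.get),
             read_through(node_b, 1, database.get))
    return before, after
```

Calling `demo()` returns the divergent pair first and the converged pair second; how long the stale window lasts is bounded by the TTL chosen.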

Now, the trick lies in how we design our app: this kind of issue shouldn’t affect our business logic, and for that reason we have to be careful when picking the data to be cached.

Finally, one of the benefits of Nebulex is that you can provide your own implementation/adapter, and improve/modify the current behaviour according to your needs.

I know this is a whole topic of discussion, but I really hope I have answered your inquiry. Stay tuned 🙂!

What is your recommendation for a configuration to use this as a session store?