Nebulex – a fast, flexible and powerful caching library for Elixir (not only local but also distributed)

Introducing Nebulex

Source
Hex Package
Getting Started
Examples

I like this.

Have been waiting for someone to write a distributed Elixir cache so we don’t have to.

It will be nice to see the benchmarking results so we can see if it’s feasible to drop the Redis dependency.

The usual challenges with going from a local to a distributed cache have to do with the fact that you go from a local to a distributed computing problem, and with it come all the challenges of CAP.

@cabol Can you elaborate on how netsplit is handled? Suppose you have two nodes and a netsplit occurs, and then each node writes a different value to the same key, how is this resolved?

@benwilson512 good point, thanks for your comment/question, and I’ll take advantage of it to clarify some things about Nebulex.

  1. The distributed cache that Nebulex provides is not a key-value store/database like Riak. It is more like Memcached, where you have a local in-memory backend (hash table) running on each node of your cluster, and on top of that, from the client/consumer side, you run a (preferably consistent) hashing algorithm to access any of those in-memory backends, i.e. a sharding topology. Based on that, caches like Memcached (AFAIK) will have the same issue you are describing here, and so will Redis if it is deployed as a cache in the same way as Memcached (using e.g. Twemproxy for the distributed part).

  2. Bringing in the CAP theorem: like any other distributed system, we cannot sacrifice “Partition Tolerance”, so the only choice left is between “Availability” and “Consistency”. In Nebulex we went for “Availability”; we could say it is eventually consistent. Let’s try to illustrate what would happen in Nebulex based on your example:

T0: set(1, 1) --> |     |
                  |     |
T1:  1 <-- get(1) |     | get(1) --> 1
                  |     |
                  |--X--| T2: Network Partition
T3: set(1, 3) --> |     |
                  |     |<-- set(1, 4) :T4
                  |     |
T5:  3 <-- get(1) |     | get(1) --> 4
                  |     |
                  |-----| T6: cluster re-established
                  |     |
T7:  3 <-- get(1) |     | get(1) --> 3
                  |     | # it might be either 3 or 4, depending on the hash function,
                  |     | # data can be in one node or another
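
To make the timeline above concrete, here is a minimal Python sketch (not Nebulex code). It assumes two nodes, each with a plain dict as its local backend, and a simple hash-modulo routing function standing in for consistent hashing. The names `Node`, `pick_node`, `cache_set` and `cache_get` are hypothetical and exist only for this illustration.

```python
import hashlib


class Node:
    """One cache node holding a plain in-memory dict (the local backend)."""

    def __init__(self, name):
        self.name = name
        self.store = {}


def pick_node(key, visible_nodes):
    """Route a key to one of the nodes the caller can currently see.

    Assumption: plain hash-modulo routing, used here as a simple
    stand-in for a real consistent hashing algorithm.
    """
    ordered = sorted(visible_nodes, key=lambda node: node.name)
    digest = hashlib.sha256(repr(key).encode()).digest()
    return ordered[int.from_bytes(digest[:8], "big") % len(ordered)]


def cache_set(key, value, visible_nodes):
    pick_node(key, visible_nodes).store[key] = value


def cache_get(key, visible_nodes):
    return pick_node(key, visible_nodes).store.get(key)


a, b = Node("a"), Node("b")
cluster = [a, b]

# T0/T1: healthy cluster, both clients route key 1 to the same node.
cache_set(1, 1, cluster)
t1 = (cache_get(1, cluster), cache_get(1, cluster))

# T2: network partition, each side can only see itself.
cache_set(1, 3, [a])  # T3: write on the left side
cache_set(1, 4, [b])  # T4: write on the right side
t5 = (cache_get(1, [a]), cache_get(1, [b]))

# T6/T7: cluster re-established, routing goes back to a single owner.
t7 = (cache_get(1, cluster), cache_get(1, cluster))

print("T1:", t1)
print("T5:", t5)
print("T7:", t7)
```

During the partition the two sides diverge (3 on one side, 4 on the other). Once the cluster is re-established, both clients agree again on whichever value the owning node holds, while the other node keeps a stale entry that nobody is routed to until it is evicted.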

Eventually, the key will expire, or it will be updated again, or the generation where the key resides will be dropped, forcing the key to be retrieved from the database (main storage) again, so eventually the data will be “consistent” again. Now, if you really need to be CP or be able to move between consistency and availability, you probably need something like Riak.
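
As a rough illustration of the expiry case, here is a small Python sketch (again, not Nebulex code). It assumes a per-entry TTL and a read-through helper that falls back to the main storage on a miss. `TTLCache`, `read_through` and the injected `clock` are hypothetical names used only for this example.

```python
class TTLCache:
    """Tiny in-memory cache where every entry expires after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock):
        self.ttl = ttl_seconds
        self.clock = clock  # injected so the example does not need to sleep
        self.entries = {}   # key -> (value, expires_at)

    def put(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Expired: drop the entry so the caller goes back to main storage.
            del self.entries[key]
            return None
        return value


def read_through(cache, key, load_from_db):
    """Return the cached value, or reload it from main storage on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.put(key, value)
    return value


now = [0.0]
database = {1: "fresh"}  # the source of truth
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])

# A stale value left behind in the cache, e.g. after a partition.
cache.put(1, "stale")
before = read_through(cache, 1, database.__getitem__)

now[0] = 61.0  # move the fake clock past the TTL
after = read_through(cache, 1, database.__getitem__)

print(before, after)
```

Until the TTL elapses the stale value is still served; after that, the next read misses and the value is reloaded from the main storage, which is the sense in which the cache becomes “consistent” again.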

Now, the trick lies in how we design our app: this kind of issue shouldn’t affect our business logic, so we have to be careful when picking the data to be cached.

Finally, one of the benefits of Nebulex is that you can provide your own implementation/adapter, and improve/modify the current behaviour according to your needs.

I know this is a whole topic of discussion, but I really hope I have answered your inquiry, stay tuned :slight_smile: !

What is your recommendation for a configuration to use this as a session store?

@cabol it was interesting listening to your chat on the Thinking Elixir podcast! If one were to switch from using Cachex directly to Nebulex, what would be the pros/cons of using the Cachex adapter vs one of the built-in ones?

That is a good/interesting question. I think the question, in that case, would be whether to use the Cachex adapter or the built-in local adapter, because the rest of the adapters (partitioned, replicated, and multilevel) can be used alongside the Cachex adapter: they rely on a pre-configured backend (or local adapter) to work, and that could be either the built-in local one or Cachex (you can see some examples here: GitHub - cabol/nebulex_adapters_cachex: A Nebulex adapter for Cachex).

Now, coming back to the initial question, the pros/cons of using the Cachex adapter vs the built-in local one: they are different cache implementations, especially in terms of the eviction policy, so it is difficult to answer. At first glance they look similar and do pretty much the same thing, but they are very different underneath. The decision whether to use one or the other will come down to the results you get: how efficient the cache is in terms of memory and/or eviction policy (you have to measure here, because this will also depend on your use case); performance (here I’d say they are very similar, both perform very well, but you can run some benchmarks); features (Cachex is much more advanced and provides more features, while the built-in local adapter is much simpler, so with Cachex you get more features/flexibility); and maturity (Cachex is more mature, is probably the most used cache in Elixir, and is well maintained, with an excellent implementation; overall it is great).

So, I’d say, if you are using Cachex already and you want to start using Nebulex, start with the Cachex adapter (the combination of Cachex and Nebulex is great). Then, if you have the chance, try the built-in local adapter out, see how it works, take some measurements, etc.

Thanks a lot for that explanation, appreciate it!