Cuckoo Cache - A probabilistic LRFU in-memory cache with one-hit-wonder detection

cuckoo_cache is a high-performance, probabilistic LRFU in-memory cache for Erlang and Elixir that detects one-hit wonders using a built-in cuckoo filter.

I call it a probabilistic cache because it relies on the probabilistic properties of its underlying cuckoo filter. There is a high chance that the most recently/frequently used items remain in the cache, while the least recently/frequently used items are either never cached or lazily removed to make room for recently/frequently accessed items.

CuckooCache uses a cuckoo filter to keep track of accessed/cached items, and an ets table to store the cached values.

Every time an item is accessed, its key is inserted into the cuckoo filter. An item is cached in the ets table only if its key already exists in the cuckoo filter. In other words, items that are accessed only once within a sliding window of requests (one-hit wonders) are never cached.
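A minimal Python sketch of that admission policy (all names here are illustrative, not the real cuckoo_cache API; a plain set and dict stand in for the cuckoo filter and the ets table):

```python
filter_keys = set()   # stand-in for the cuckoo filter (ignores duplicates)
cache = {}            # stand-in for the ets table

def fetch(key, compute):
    """Return a value, admitting it to the cache only on a repeat access."""
    if key in cache:
        return cache[key]
    value = compute()
    if key in filter_keys:
        cache[key] = value      # seen before in the window: admit to the cache
    filter_keys.add(key)        # every access is recorded in the filter
    return value

fetch("a", lambda: 1)   # first access: tracked in the filter, not cached
fetch("a", lambda: 1)   # repeat access: now admitted to the cache
```

A one-hit wonder ends up only in the filter, never in the `cache` dict.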

When an item is added to the cuckoo filter and the filter is at capacity, a random item is evicted from the filter; the evicted item is also removed from the ets cache if it is cached and no longer present in the filter.
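A toy sketch of that eviction path, again with Python stand-ins (a bounded list plays the role of the filter; the capacity and victim-selection details here are illustrative, not the library's internals):

```python
import random

CAPACITY = 4          # illustrative filter capacity
filter_items = []     # stand-in for the cuckoo filter (may hold duplicates)
cache = {}            # stand-in for the ets table

def filter_insert(key):
    """Insert a key, evicting a random victim when the filter is full."""
    if len(filter_items) >= CAPACITY:
        victim = filter_items.pop(random.randrange(len(filter_items)))
        if victim not in filter_items:
            # last copy left the filter: lazily drop it from the cache too
            cache.pop(victim, None)
    filter_items.append(key)

for k in ["a", "b", "c", "d", "e"]:
    filter_insert(k)    # inserting "e" evicts one random earlier key
```

Note that a cached item survives eviction as long as at least one copy of its key is still in the filter.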

One interesting feature of a cuckoo filter is that it can contain duplicate elements (up to bucket_size * 2 copies of the same key). This gives frequently accessed elements a better chance of staying in the filter, as long as they keep being accessed often enough.

Feedback is welcome!

14 Likes

Nice. Couple of questions:

  • Is the memory size bounded, and configurable?
  • If the above is true, can I shrink/grow the bound at run time, without affecting operation?

Cache capacity is bounded by item count, not by memory size, and it cannot be reconfigured at runtime.

More precisely, you can specify the capacity of the underlying cuckoo filter, and the capacity of the cache is always less than the capacity of the filter.

Think of the filter capacity as a sliding window of requests: within that window, only requests seen more than once are stored in the cache. For example, a cuckoo cache with a capacity of 1024 probabilistically keeps track of the latest 1024 requests (not exactly the latest ones, but some recent 1024 requests), and among those, repeated requests are cached in ets. So at its highest utilization, the cache holds as many items as there are unique repeated requests among those recent 1024 requests.
And with the default configuration, duplicate requests are counted at most 8 times. That means if a single request is repeated, say, 500 times within the last 1024 requests of our example, it is only counted 8 times, leaving room for more requests to be tracked.
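That 8-copy cap can be illustrated with a small counting sketch (the constants mirror a bucket_size of 4, so a key occupying at most two buckets yields 2 * 4 = 8 copies; the code itself is a stand-in, not the filter's real bookkeeping):

```python
from collections import Counter

BUCKET_SIZE = 4
MAX_COPIES = 2 * BUCKET_SIZE   # a key fits in at most two buckets

window = Counter()  # stand-in for the copies of each key held by the filter

def track(key):
    """Record an access, but never hold more than MAX_COPIES per key."""
    if window[key] < MAX_COPIES:
        window[key] += 1

for _ in range(500):
    track("hot")        # 500 requests, but only 8 copies are kept
```

The hot key saturates at 8 slots instead of crowding out the rest of the window.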

1 Like

I like this approach. I thought about building something similar for years with the idea of using a cost benefit analysis to achieve the same type of thing.

By cost-benefit I mean: the total time to retrieve the item to be cached, the amount of cache space needed to store it, and the frequency of access.

The general idea was that the benefit is frequency of access * time to retrieve the item, while the cost is the space used to store it.

The approach should naturally bias towards storing the most frequently accessed, slowest-to-retrieve items that are small, so that more of them can be stored.
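That idea reduces to a simple score; here is an illustrative sketch (the function name and the numbers are made up, not from any existing library):

```python
def cache_score(access_freq, retrieval_time_s, size_bytes):
    """Benefit (frequency x retrieval time) divided by cost (space used)."""
    return access_freq * retrieval_time_s / size_bytes

# A small, slow-to-fetch, hot item outranks a large, fast, cold one.
hot_small = cache_score(access_freq=100, retrieval_time_s=0.5, size_bytes=1_000)
cold_big = cache_score(access_freq=2, retrieval_time_s=0.01, size_bytes=1_000_000)
```

Items would be admitted or evicted by comparing their scores against the rest of the cache.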

Maybe one day I’ll get around to building it.

2 Likes

I see. If this cache can occupy significant memory, some runtime knobs that can reclaim memory without bringing everything down would be very useful.

Knowing the average size of the items you add to the cache gives you a rough estimate of how large it could grow. And if you need to, you can create a gen_server that periodically checks the memory usage of the table and removes elements according to your own strategy.

1 Like

@farhadi beat me to it – you can just periodically check and clean your caches.

Even better, you can put your own code in front of it, something like a MyCache.maybe_put that checks the size and TTL of various items and simply does not insert the item if your code deems the cache full.
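A hypothetical sketch of such a guard (MyCache.maybe_put is not a real function, and only the size check is shown here in Python; a TTL check could be bolted on the same way):

```python
MAX_ENTRIES = 3   # made-up bound for the example

cache = {}

def maybe_put(key, value):
    """Insert only if the key is already cached or the cache has room."""
    if key in cache or len(cache) < MAX_ENTRIES:
        cache[key] = value
        return True
    return False
```

Callers that get back `False` simply skip caching rather than growing the table.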

2 Likes

There is already a mechanism to lazily remove the least recently/frequently used items, and I think it is enough for most use cases.
And for real-time use cases, lazily removing one item at a time gives more predictable performance than periodically cleaning up the table.

Of course, it would be nice if this functionality were included in the library, so my gen_server could do the check and tell the lib "memory is tight, increase the reclaim aggressiveness" (or the other way around), and the cache would just shrink or grow gradually.

1 Like