Bringing mem to the caching zoo

There are already plenty of caching packages in the Elixir world, so I just created one more:

I ran a performance benchmark by inserting 1_000_000 keys with a 5s TTL and got the results below. If you get different ones, we can discuss :smiley:
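
If anyone wants to reproduce this, here is a rough timing harness; `put_fun` is a placeholder you wrap around whichever cache you are measuring (Cachex, ConCache, mem, ...), since each library's insert API differs and is not assumed here.

```elixir
defmodule BenchSketch do
  # Generic timing harness; `put_fun` is a hypothetical callback wrapping the
  # cache under test. No library-specific API is assumed.
  def run(put_fun, n \\ 1_000_000) do
    {micros, :ok} =
      :timer.tc(fn ->
        Enum.each(1..n, fn i -> put_fun.("key#{i}", i) end)
      end)

    IO.puts("#{n} inserts took #{div(micros, 1000)} ms")
  end
end
```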

cachex

Inserting data into cachex is very fast (about 2.2x faster than mem).
However, beam.smp stays at 100% CPU for a long time while cleaning expired keys.
cachex cleans TTLs with a standalone process and queries expired keys with QLC.

con_cache

Inserting data into con_cache is about 1.2x slower than mem.
TTL fails: I can still read a lot of the data after 5 seconds.
con_cache uses an async :timer.send_after to clean expired keys.

mnemonix (ETS backend)

Inserting data into mnemonix is about 7x slower than mem.
mnemonix uses :timer.apply_after to clean expired keys.

mem (without replacement and persistence)

mem uses an additional ETS table to save TTLs and cleans expired keys by traversing this table (just like Redis does).
So mem uses additional memory.

Just choose whichever you like.

6 Likes

Interesting! I’d say that caching is one of those things where more alternatives are better :smiley:.

Besides showing these benchmark results, it would be awesome if you could elaborate on the design choices you made when designing mem.
Are there any situations in which we should or should not choose mem? (besides the speed considerations of inserting/removing elements that you’ve already talked about).
What other properties does it have? How does mem's Time To Live work? How does mem persist data?

3 Likes

Here are some details about mem:

Mem Uses These ETS Tables

For Original Data Storage

Backend: ETS
Table Type: set
Data Format: {key, value}
Index: key

For TTL Storage

One additional ETS table is used:

Backend: ETS
Table Type: set
Data Format: {key, expire_time}
Index: key

For Replacement Storage

Two additional ETS tables are used:

Backend: ETS
Table Type: set
Data Format: {key, {out_timestamp, unique_integer}}
Index: key

Backend: ETS
Table Type: ordered_set
Data Format: {{out_timestamp, unique_integer}, key}
Index: {out_timestamp, unique_integer}

out_timestamp is decided by the maxmemory strategy: :fifo, :lru, or :ttl.
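
To make the layout concrete, here is a minimal sketch of creating these four tables with :ets.new/2; the table names are made up for illustration and are not mem's internal names.

```elixir
# Hypothetical table names, just to illustrate the layout described above.

# Original data: {key, value}
:ets.new(:mem_data, [:set, :named_table, :public])

# TTL: {key, expire_time}
:ets.new(:mem_ttl, [:set, :named_table, :public])

# Replacement, indexed by key: {key, {out_timestamp, unique_integer}}
:ets.new(:mem_rank_by_key, [:set, :named_table, :public])

# Replacement, indexed by rank: {{out_timestamp, unique_integer}, key}
# ordered_set keeps entries sorted, so :ets.first/1 yields the next eviction candidate.
:ets.new(:mem_key_by_rank, [:ordered_set, :named_table, :public])
```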

For Persistence

Just replace all ETS backends with Mnesia.

The reason I don’t use DETS is its 2GB limit.
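
For comparison, this is a minimal sketch of what a disk-backed Mnesia table looks like; the table name and attributes are assumptions, not mem's real schema. `disc_copies` keeps the data both in RAM and on disk, without the 2GB DETS limit.

```elixir
# Create an on-disk schema for this node, start Mnesia, then create a table
# whose records live in RAM with a disk copy.
:mnesia.create_schema([node()])
:ok = :mnesia.start()

:mnesia.create_table(:mem_data, [
  attributes: [:key, :value],
  type: :set,
  disc_copies: [node()]
])
```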

Operating Procedures

SET

  1. store the data
  2. set the TTL
  3. set the replacement flag

(a combined sketch of SET and GET follows the GET steps below)

GET

  1. get the TTL
  2. delete the data, TTL, and replacement information if expired
  3. return the data
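
A minimal sketch covering both the SET and GET steps above, against the hypothetical :mem_data and :mem_ttl tables from earlier (replacement bookkeeping omitted; this illustrates the procedure, not mem's actual code):

```elixir
defmodule MemSketch do
  @moduledoc "Illustration of the SET / GET steps; not mem's real implementation."

  def setup do
    :ets.new(:mem_data, [:set, :named_table, :public])
    :ets.new(:mem_ttl, [:set, :named_table, :public])
    :ok
  end

  # SET: store the data, then record the absolute expire time.
  def set(key, value, ttl_ms) do
    :ets.insert(:mem_data, {key, value})
    :ets.insert(:mem_ttl, {key, System.monotonic_time(:millisecond) + ttl_ms})
    :ok
  end

  # GET: check the TTL first, lazily delete if expired, otherwise return the value.
  def get(key) do
    now = System.monotonic_time(:millisecond)

    case :ets.lookup(:mem_ttl, key) do
      [{^key, expire_at}] when expire_at <= now ->
        :ets.delete(:mem_data, key)
        :ets.delete(:mem_ttl, key)
        nil

      _ ->
        case :ets.lookup(:mem_data, key) do
          [{^key, value}] -> value
          [] -> nil
        end
    end
  end
end
```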

TTL Cleaner

  1. traverse 200 keys of the additional TTL table
  2. delete all expired keys
  3. if the number of deleted keys > 100, do the next clean after 100ms
  4. otherwise, do the next clean after 20s

The 200 keys, 100ms, and 20s are hardcoded for now; I’ll make them configurable if necessary.
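
Here is one way such a sweep loop could look as a GenServer driven by a single Process.send_after timer; the 200/100/100ms/20s constants mirror the description above, and everything else (including rescanning from the start of the table) is my assumption, not mem's code.

```elixir
defmodule TTLCleanerSketch do
  @moduledoc "Illustration of the sweep loop described above; not mem's real implementation."
  use GenServer

  @batch 200
  @busy_threshold 100
  @busy_interval 100
  @idle_interval 20_000

  def start_link(ttl_table), do: GenServer.start_link(__MODULE__, ttl_table)

  @impl true
  def init(ttl_table) do
    schedule(@idle_interval)
    {:ok, ttl_table}
  end

  @impl true
  def handle_info(:clean, ttl_table) do
    deleted = sweep(ttl_table, @batch)
    # Sweep again soon if this pass was busy, otherwise back off to the idle interval.
    schedule(if deleted > @busy_threshold, do: @busy_interval, else: @idle_interval)
    {:noreply, ttl_table}
  end

  defp schedule(interval), do: Process.send_after(self(), :clean, interval)

  defp sweep(ttl_table, limit) do
    now = System.monotonic_time(:millisecond)

    # Take up to `limit` {key, expire_time} entries and delete the expired ones.
    # A real implementation would likely keep the continuation to resume the scan.
    case :ets.match_object(ttl_table, {:_, :_}, limit) do
      :"$end_of_table" ->
        0

      {entries, _continuation} ->
        entries
        |> Enum.filter(fn {_key, expire_at} -> expire_at <= now end)
        |> Enum.map(fn {key, _expire_at} -> :ets.delete(ttl_table, key) end)
        |> length()
    end
  end
end
```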

Replacement Cleaner

  1. check memory usage
  2. get the smallest entry from the additional ordered_set table
  3. delete the replacement information, TTL, and data
  4. clean again immediately or after 20s, based on memory usage

The 20s is hardcoded for now.
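
A sketch of one eviction pass against the hypothetical tables from earlier; the memory check via :ets.info(tab, :memory) (in words) and the limit are assumptions, not mem's real maxmemory accounting.

```elixir
defmodule EvictionSketch do
  @moduledoc "Illustration of one eviction pass; not mem's real implementation."
  @max_words 10_000_000

  def maybe_evict(data_tab, ttl_tab, rank_by_key_tab, key_by_rank_tab) do
    if :ets.info(data_tab, :memory) > @max_words do
      # ordered_set: the first key holds the smallest {out_timestamp, unique_integer}.
      case :ets.first(key_by_rank_tab) do
        :"$end_of_table" ->
          :ok

        rank ->
          [{^rank, key}] = :ets.lookup(key_by_rank_tab, rank)
          :ets.delete(key_by_rank_tab, rank)
          :ets.delete(rank_by_key_tab, key)
          :ets.delete(ttl_tab, key)
          :ets.delete(data_tab, key)
          # Over the limit: evict and check again immediately. The "after 20s"
          # reschedule once under the limit would live in a timer loop, as with
          # the TTL cleaner above.
          maybe_evict(data_tab, ttl_tab, rank_by_key_tab, key_by_rank_tab)
      end
    else
      :ok
    end
  end
end
```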

When You Should NOT Choose Mem

Mem eats 2x the memory with TTL and 4x with both TTL and Replacement enabled, so you shouldn’t use mem if you don’t need high performance and only store small terms.
Mem doesn’t have transactions and I don’t plan to add them; I think cachex is the best choice if you need them.
Don’t use mem if you don’t like the code style, the DSL, or anything else that makes you unhappy.

4 Likes

Hey, I just discovered this! (I’m the guy working on Mnemonix.)

It doesn’t surprise me at all that mem is much faster – map-compatibility, flexibility, and ease of adding new backends is more of the focus of that project. I’ve also deferred looking into any optimization until the fully-featured API is stable. Even then, my wildest ambitions are to bring it maybe within 3x slower than mem. :slight_smile:

Your deconstruction of ttl implementations was really insightful. You mention that ConCache uses :timer.send_after and Mnemonix uses :timer.apply_after, both as cons. I see your ttl uses Process.send_after, and that led me to discover the common caveat with :timer, thanks! That goes to the top of my notes for when I get around to improving Mnemonix’s efficiency.

I’m curious about the sweep timing you use in your implementation. As far as I can tell, you sweep expired keys every 20 seconds or so? I can’t quite follow what the state.number means in your algorithm.

I really like how close mem runs to the metal, and gives you optimized no-nonsense default configuration. I know that, for instance, if I was playing around with using Mnemonix just to experiment with Plug.Session backends, and after some benchmarking decided that the ETS or Mnesia stores made the most sense for my application, I’d probably reach for mem.

Have you considered adding a Plug Session adapter to it? It’s a trivial behaviour to implement, and would let anyone fill a common caching need by swapping mem in without any code modification.
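
For reference, this is roughly what a Plug.Session.Store adapter looks like; the Mem.get/Mem.set/Mem.del calls are assumptions about mem's API, not its documented interface.

```elixir
defmodule MemSessionStore do
  @moduledoc "Sketch of a Plug session store; Mem.get/set/del are assumed, not mem's documented API."
  @behaviour Plug.Session.Store

  @impl true
  def init(opts), do: opts

  @impl true
  def get(_conn, sid, _opts) do
    case Mem.get(sid) do
      nil -> {nil, %{}}
      data -> {sid, data}
    end
  end

  @impl true
  def put(conn, nil, data, opts), do: put(conn, generate_sid(), data, opts)

  def put(_conn, sid, data, _opts) do
    Mem.set(sid, data)
    sid
  end

  @impl true
  def delete(_conn, sid, _opts) do
    Mem.del(sid)
    :ok
  end

  defp generate_sid, do: Base.encode64(:crypto.strong_rand_bytes(32))
end
```

It would then be wired up like any other store, e.g. `plug Plug.Session, store: MemSessionStore, key: "_my_app_session"` (cookie key here is just an example).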

4 Likes

Really glad to hear from you and to discuss this.
:erlang.send_after is better than :timer.send_after, but that’s not the point. The essential difference is that mem only has one timer for the cleaner process, not one per key-value pair with an expire flag.
The expire times are saved in another ETS table; the 20 seconds and state.number mean that the cleaner scans state.number items in the expire ETS table every 20 seconds and cleans the expired ones.
The reason I created mem is that I needed a database cache with TTL and LRU, so I only built basic functionality into mem. It seems easy to build a Plug session adapter; I’ll try to make one.

2 Likes