Caching strategies (limit by memory usage, not number of entries)

I’m coming to Elixir & Phoenix from a Rails background. In Rails it’s pretty common to use Redis for caching, where you just set a memory limit and allow old entries to be expired.

I was looking at caching in Elixir and it seems like Cachex limits are set by the number of entries you allow. This is fine if the data you’re caching is all about the same size, but not great (as in our case) if entries can differ wildly in size.

What are people doing here? Can you create multiple Cachex instances and keep similarly sized data in each of them (so that an entry-count limit works correctly), or do tools like Redis and friends still get used?

You can still use Redis for caching here; people usually use Cachex to avoid adding an external dependency and the network latency that comes with it (which often defeats the purpose of having a cache in the first place).

As for the limit: since the cache is part of the same runtime the app is running in (and shares its memory), I don’t think a feature that limits memory usage would make much sense; moreover, I don’t think memory usage can be reliably measured from within the application code (which is where Cachex operates).
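
For what it’s worth, since Cachex is backed by ETS you can ask for a table’s size, but even that is only an approximation. A rough sketch, where :my_cache_table stands in for whatever table the cache actually uses:

words = :ets.info(:my_cache_table, :memory)   # table size in machine words
bytes = words * :erlang.system_info(:wordsize)
#⇒ on-heap data only: large (“refc”) binaries live on a shared
#  binary heap and are not attributed to any single table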

It cannot, specifically because the Erlang VM is smart enough not to duplicate memory for identical objects:

t1 = {1, 2}
m1 = %{a: t1}
m2 = %{a: t1}
# both maps reference the very same tuple, not a copy of it
:erts_debug.same(m1.a, m2.a)
#⇒ true

Also, :erts_debug, being a debug tool, doesn’t include data from the shared binary heap when asked for the size of a term.
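
A quick illustration of that caveat (a sketch; exact word counts vary by VM, and these calls are debug-only):

bin = :crypto.strong_rand_bytes(1024)   # a “refc” binary, payload stored off-heap
:erts_debug.size(bin)
#⇒ a few words for the on-heap header, not the 1024 payload bytes
:erts_debug.flat_size([bin, bin])
#⇒ still small: flat_size ignores sharing, and neither call counts
#  the shared binary heap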

Answering the question as stated: partitioning the data by expected size and starting several Cachex.start/2 instances with different limits, one per “size range”, would do.
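
A minimal sketch of that idea, assuming Cachex v3’s API (an integer limit: option, and Cachex.get/2 returning {:ok, nil} on a miss); the cache names and the 1_000-word threshold are made up, and :erts_debug.flat_size/1 is only a rough, debug-only size estimate:

# Two caches: many small entries allowed, but only a few large ones.
{:ok, _} = Cachex.start_link(:small_cache, limit: 10_000)
{:ok, _} = Cachex.start_link(:large_cache, limit: 100)

defmodule SizedCache do
  @threshold_words 1_000

  # Route each value to the cache whose entry limit suits its rough size.
  def put(key, value), do: Cachex.put(cache_for(value), key, value)

  # Check the small cache first, then fall back to the large one.
  def get(key) do
    with {:ok, nil} <- Cachex.get(:small_cache, key) do
      Cachex.get(:large_cache, key)
    end
  end

  defp cache_for(value) do
    if :erts_debug.flat_size(value) > @threshold_words,
      do: :large_cache,
      else: :small_cache
  end
end

With entries of roughly uniform size inside each cache, a per-entry limit becomes a workable stand-in for a memory budget again. (In real use you’d want to route deterministically per key, since a value whose size changes between writes could otherwise end up in both caches.)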

Thanks for all the replies - and great point, of course, about it all being part of the one runtime.

I think @mudasobwa’s suggestion of starting several caches via Cachex.start/2 is probably the way to go :+1: