Caching strategies (limit by memory usage, not number of entries)

I’m coming to Elixir & Phoenix from a Rails background. In Rails it’s pretty common to use Redis for caching, where you just set a memory limit and allow old entries to be expired.

I was looking at caching in Elixir and it seems like Cachex limits are set by the number of entries you allow. This is fine if the data you’re caching is all about the same size, but not great (as in our case) if entries can differ wildly in size.

What are people doing here? Can you create multiple Cachex instances and keep similarly sized data in each of them (so that an entry-count limit works correctly), or do tools like Redis and friends still get used?

You can still use Redis for caching here; people usually use Cachex to avoid adding an external dependency and the network latency that comes with it (which often defeats the purpose of having a cache in the first place).

As for the limit: since the cache is part of the same runtime the app is running in (and shares its memory), I don’t think a feature that limits memory usage would make much sense; moreover, I don’t think memory usage can be reliably measured from within the application code (which is where Cachex operates).
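
For what it’s worth, since Cachex is backed by ETS you can ask for a table’s size, but even that is only an approximation. A rough sketch, where :my_cache_table stands in for whatever table the cache actually uses:

words = :ets.info(:my_cache_table, :memory)   # table size in machine words
bytes = words * :erlang.system_info(:wordsize)
#⇒ on-heap data only: large (“refc”) binaries live on a shared
#  binary heap and are not attributed to any single table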

It cannot, specifically because the Erlang VM is smart enough not to duplicate memory for identical objects:

t1 = {1, 2}
m1 = %{a: t1}
m2 = %{a: t1}
# both maps reference the very same tuple, not a copy of it
:erts_debug.same(m1.a, m2.a)
#⇒ true

Also, :erts_debug, being a debug tool, doesn’t include data from the shared binary heap when asked for the size of a term.
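
A quick illustration of that caveat (a sketch; exact word counts vary by VM, and these calls are debug-only):

bin = :crypto.strong_rand_bytes(1024)   # a “refc” binary, payload stored off-heap
:erts_debug.size(bin)
#⇒ a few words for the on-heap header, not the 1024 payload bytes
:erts_debug.flat_size([bin, bin])
#⇒ still small: flat_size ignores sharing, and neither call counts
#  the shared binary heap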

Answering the question as stated: partitioning the data by expected size and starting several Cachex.start/2 instances with different limits, one per “size range”, would do.
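
A minimal sketch of that idea, assuming Cachex v3’s API (an integer limit: option, and Cachex.get/2 returning {:ok, nil} on a miss); the cache names and the 1_000-word threshold are made up, and :erts_debug.flat_size/1 is only a rough, debug-only size estimate:

# Two caches: many small entries allowed, but only a few large ones.
{:ok, _} = Cachex.start_link(:small_cache, limit: 10_000)
{:ok, _} = Cachex.start_link(:large_cache, limit: 100)

defmodule SizedCache do
  @threshold_words 1_000

  # Route each value to the cache whose entry limit suits its rough size.
  def put(key, value), do: Cachex.put(cache_for(value), key, value)

  # Check the small cache first, then fall back to the large one.
  def get(key) do
    with {:ok, nil} <- Cachex.get(:small_cache, key) do
      Cachex.get(:large_cache, key)
    end
  end

  defp cache_for(value) do
    if :erts_debug.flat_size(value) > @threshold_words,
      do: :large_cache,
      else: :small_cache
  end
end

With entries of roughly uniform size inside each cache, a per-entry limit becomes a workable stand-in for a memory budget again. (In real use you’d want to route deterministically per key, since a value whose size changes between writes could otherwise end up in both caches.)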

Thanks for all the replies - and great point, of course, about it all being part of the one runtime.

I think @mudasobwa’s suggestion of starting several caches via Cachex.start/2 is probably the way to go :+1: