How do I limit the size of an ETS table to avoid a memory crash!?

james-bowers · May 15, 2019, 2:11pm

hello!

Context
We are planning on using ETS to cache web responses for up to 6 hours.

Question
Is there a neat way of limiting the number of records stored in an ETS table? I’m worried that if we do not have a way of limiting this, our instances will crash if we suddenly have a burst of variations that need to be cached.

Possible solutions ?

1. Store the time when each record was inserted with the cached content, and have a GenServer periodically query and delete the oldest records. However, this could still crash if the table isn’t “maintained” frequently enough.
2. Upon each insert, we could query the size or number of items in the ETS table, but I imagine this would be slow unless an ETS table itself stores the count of items?
3. Maintain a second ETS table that keeps count of the cached items. However, I’m unsure of how to keep that in sync with the cache table when records expire.
4. Use DETS, as there is far more disk than memory space, but that just postpones the issue.

At some point, we will consider using mnesia to share these cached responses with the other instances we have running.

Any feedback on those options, or an alternative solution entirely would be much appreciated!

outlog · May 15, 2019, 2:21pm

look at (and use?) cachex: https://github.com/whitfin/cachex

limits are here https://github.com/whitfin/cachex/blob/master/docs/features/cache-limits.md
though currently it only supports limiting the number of keys, not memory usage… but limiting keys could be good enough…

tty · May 15, 2019, 2:35pm

There aren’t any options in ets that limit the number of records in a table. ets is bounded by the amount of memory you have on the server. How large do you expect ets to grow, and how much memory do you have? We have routinely run ets tables several GB in size without issues.

If your records expire based on time you might want a round-robin ets data structure to make it easier to drop an entire set of records.
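One way to read the round-robin suggestion is as a ring of ETS tables ("generations"): writes go to the newest table, reads check every table, and expiry drops the oldest table in a single call. The sketch below is only an illustration of that idea; the module name `RotatingCache` and its functions are hypothetical and not part of ETS.

```elixir
defmodule RotatingCache do
  @moduledoc "Hypothetical sketch: a ring of ETS tables, newest first."

  # Create `count` empty generations; the head of the list receives writes.
  def new(count) when count > 0 do
    for _ <- 1..count, do: new_table()
  end

  # Writes always go to the newest generation.
  def put([newest | _] = tables, key, value) do
    :ets.insert(newest, {key, value})
    tables
  end

  # Reads check generations from newest to oldest.
  def get(tables, key) do
    Enum.find_value(tables, :miss, fn table ->
      case :ets.lookup(table, key) do
        [{^key, value}] -> {:ok, value}
        [] -> nil
      end
    end)
  end

  # Drop the oldest generation in one call and open a fresh one.
  def rotate(tables) do
    {oldest, rest} = List.pop_at(tables, -1)
    :ets.delete(oldest)
    [new_table() | rest]
  end

  # Unnamed tables, so several generations can coexist.
  defp new_table, do: :ets.new(:cache_generation, [:set, :public])
end
```

With six generations rotated once an hour, for example, an entry would live between five and six hours. In a real system the table list would probably be held by a long-lived process that also owns the tables and calls `rotate/1` on a timer.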

mnesia is based on ets and if you have concerns with the size of your tables exceeding your memory moving to mnesia would not resolve this concern.

Fl4m3Ph03n1x · May 15, 2019, 2:54pm

Welcome to the forum! You pose an interesting question where many solutions are viable. You already thought about some, so let’s dissect them a little.

You could have a GenServer check your ETS table every minute. Afraid that the GenServer that clears the table may die? No worries, put it under a Supervisor! Simplicity at its finest.
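A minimal sketch of such a periodic sweep, assuming each row is stored as `{key, value, inserted_at_ms}` and that the table is `:public` (or owned by the sweeper) so the sweeper is allowed to delete from it. The module name `CacheSweeper`, the option names, and the defaults are all hypothetical.

```elixir
defmodule CacheSweeper do
  use GenServer

  # All names and defaults here are illustrative.
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      table: Keyword.fetch!(opts, :table),
      ttl_ms: Keyword.get(opts, :ttl_ms, :timer.hours(6)),
      interval_ms: Keyword.get(opts, :interval_ms, :timer.minutes(1))
    }

    schedule(state.interval_ms)
    {:ok, state}
  end

  @impl true
  def handle_info(:sweep, state) do
    sweep(state.table, state.ttl_ms)
    schedule(state.interval_ms)
    {:noreply, state}
  end

  # Deletes every row older than `ttl_ms` and returns how many were removed.
  # Writers are assumed to stamp rows with System.monotonic_time(:millisecond).
  def sweep(table, ttl_ms, now \\ System.monotonic_time(:millisecond)) do
    cutoff = now - ttl_ms
    match_spec = [{{:_, :_, :"$1"}, [{:<, :"$1", cutoff}], [true]}]
    :ets.select_delete(table, match_spec)
  end

  defp schedule(interval_ms), do: Process.send_after(self(), :sweep, interval_ms)
end
```

Because `use GenServer` defines a child spec, something like `Supervisor.start_link([{CacheSweeper, table: :web_cache}], strategy: :one_for_one)` would restart the sweeper if it crashes; `:web_cache` is a made-up table name.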

Worried one minute may be too long? You can use select_count and an inverse exponential backoff algorithm for the polling interval (the bigger the table, the more often you check).
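The post does not spell out the backoff formula, so the following is only one possible interpretation: count the rows with `:ets.select_count/2` and halve the polling interval for every `step` rows in the table, down to a floor. The module name `AdaptivePoll` and all the numbers are assumptions made for illustration.

```elixir
defmodule AdaptivePoll do
  # Match spec that matches every row, so select_count returns the table size.
  @count_all [{:_, [], [true]}]

  def row_count(table), do: :ets.select_count(table, @count_all)

  # Halve the interval for every `step` rows, never going below `min_ms`.
  # The number of halvings is capped to keep the power of two small.
  def next_interval(size, base_ms \\ 60_000, min_ms \\ 1_000, step \\ 10_000) do
    halvings = min(div(size, step), 20)
    max(min_ms, div(base_ms, Integer.pow(2, halvings)))
  end
end
```

A sweeping process could then pass `AdaptivePoll.next_interval(AdaptivePoll.row_count(table))` to `Process.send_after/3` when scheduling its next check.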

You can use the select_count I mentioned before and it will be reasonably fast, while still being atomic. Alternatively, you can have each operation do an insert and then an update_counter, but then you won’t have atomicity.
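The insert-then-update_counter variant could look roughly like the sketch below, which keeps the counter in the same table under a reserved key. `CountedCache` and the `:__count__` key are hypothetical names. As the post says, the insert and the counter update are two separate operations, so a concurrent reader may briefly see a count that is off by a little.

```elixir
defmodule CountedCache do
  # Hypothetical: the counter lives in the same table under a reserved key.
  @counter :__count__

  def new do
    table = :ets.new(:counted_cache, [:set, :public])
    :ets.insert(table, {@counter, 0})
    table
  end

  # Returns the row count after the write.
  def put(table, key, value) do
    if :ets.insert_new(table, {key, value}) do
      # New key: bump the counter (element 2 of the counter row) by 1.
      :ets.update_counter(table, @counter, {2, 1})
    else
      # Existing key: overwrite the value, count unchanged.
      :ets.insert(table, {key, value})
      count(table)
    end
  end

  # Returns the row count after the delete.
  def delete(table, key) do
    case :ets.take(table, key) do
      [] -> count(table)
      [_row] -> :ets.update_counter(table, @counter, {2, -1})
    end
  end

  def count(table), do: :ets.lookup_element(table, @counter, 2)
end
```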

If using select_count is too slow, imagine having to write in two different tables! No way it’s gonna be faster and you still have to sync them! Is sync really important to you? Or is it OK to be a few values away from reality?

I don’t recommend DETS overall. Last year we had to remove DETS from all our systems because we were under such heavy load that our DETS tables were saving corrupt data. They just couldn’t keep up. We moved to ETS and we never had a problem again.

Furthermore, caches are supposed to be fast, and IO access to disk is by far one of the slowest things you can ask your machine to do, together with network requests. So I wouldn’t advise it.

@outlog cachex looks really cool!

Alternatively, a few weeks ago I posted about a similar issue. I don’t check the ETS size; instead I check the machine’s available RAM using memsup. You can see the original topic here:

Hope it helps!
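For the memsup approach mentioned in this post, a rough sketch might look like the following. It assumes the `:os_mon` application is available on the node; the module name `MemoryGuard` and the 10% threshold are made up, and the exact keys returned by `:memsup.get_system_memory_data/0` depend on the platform and OTP release, so treat the key handling as an assumption.

```elixir
defmodule MemoryGuard do
  # Hypothetical threshold: treat less than 10% free RAM as "low".
  @min_free_ratio 0.10

  # Asks memsup for current figures; requires the :os_mon application.
  def memory_low? do
    {:ok, _started} = Application.ensure_all_started(:os_mon)
    low?(:memsup.get_system_memory_data())
  end

  # Pure decision function over memsup-style data, easy to test on its own.
  def low?(mem_data, min_free_ratio \\ @min_free_ratio) do
    total = Keyword.fetch!(mem_data, :total_memory)

    # :available_memory is not reported everywhere, so fall back to :free_memory.
    free = Keyword.get(mem_data, :available_memory) || Keyword.fetch!(mem_data, :free_memory)

    free / total < min_free_ratio
  end
end
```

A cache could call `MemoryGuard.memory_low?/0` before inserting and skip caching (or evict entries) when it returns `true`.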

james-bowers · May 15, 2019, 2:57pm

thanks for your responses!
@tty:

How large do you expect ets to grow, and how much memory do you have?

Our usage of ets would occasionally grow beyond what one of our instances has in memory. If we were to constantly provision an instance with the amount of memory required to cope with occasional spikes, it would be too expensive for us. Therefore we need some way to manage the cache size.

round-robin ets data structure

this is a good idea, thanks

mnesia is based on ets

yes, sorry for the confusion, I meant this not as a solution, but as an additional bit of information that, in hindsight, could’ve been removed from my question.

james-bowers · May 15, 2019, 3:13pm

@Fl4m3Ph03n1x Thanks for your detailed and very helpful response!

Out of curiosity, would you recommend using select_count instead of :ets.info?

The library @outlog linked to looks good, and from a very quick look, it uses :ets.info.

zambal · May 16, 2019, 5:51am

I haven’t compared it with select_count, but :ets.info is very fast, so I have used it for similar use cases as yours.
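As a sketch of the "check the size on each insert" idea using `:ets.info/2`: the module name `BoundedInsert` and the `max_rows` parameter are hypothetical. Note that the check and the insert are separate calls, so with several concurrent writers the table can overshoot the limit slightly.

```elixir
defmodule BoundedInsert do
  # Insert only while the table is below `max_rows`;
  # overwriting an existing key is always allowed.
  def put(table, key, value, max_rows) do
    if :ets.info(table, :size) < max_rows or :ets.member(table, key) do
      :ets.insert(table, {key, value})
      :ok
    else
      {:error, :full}
    end
  end

  # :ets.info(table, :memory) is reported in machine words, so convert to bytes.
  def memory_bytes(table) do
    :ets.info(table, :memory) * :erlang.system_info(:wordsize)
  end
end
```

The same pattern works with `memory_bytes/1` in place of the row count if the limit should be expressed in bytes rather than keys.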

Fl4m3Ph03n1x · May 16, 2019, 7:42am

I didn’t benchmark select_count vs :ets.info, so I can’t give you guarantees. However, from the documentation I understand that :ets.info is O(1) while select_count is O(n), because it needs to check every element against the condition you provide.

Therefore, based on this premise, I would recommend :ets.info instead of select_count.

If you have the chance, you can confirm it via benchmarks (you can use benchee for that) and then post the results. It would be interesting, at least to me.
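For a quick sanity check before setting up benchee, a single timed run with `:timer.tc/1` from the standard library gives a rough idea. This is not a proper benchmark (one run, no warm-up), and the module name `SizeCheck` is hypothetical.

```elixir
defmodule SizeCheck do
  # Match spec that matches every row, so select_count returns the table size.
  @count_all [{:_, [], [true]}]

  # Returns {row_count, microseconds} for each way of counting rows.
  def compare(table) do
    {info_us, info_size} = :timer.tc(fn -> :ets.info(table, :size) end)
    {select_us, select_size} = :timer.tc(fn -> :ets.select_count(table, @count_all) end)

    %{info: {info_size, info_us}, select_count: {select_size, select_us}}
  end
end

table = :ets.new(:size_demo, [:set, :public])
:ets.insert(table, for(i <- 1..100_000, do: {i, i}))

# Prints the counts and timings; the timings vary from machine to machine.
IO.inspect(SizeCheck.compare(table))
```

Both calls should report the same row count; only the timings are expected to differ.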