ETS vs GenServer for caching

jswny · November 6, 2021, 3:23am

Hi all! I’m going to add some caching to an app of mine.

My question is: what are the advantages of ETS vs a GenServer (or an Agent for that matter, since the semantics are about the same) for caching?

In my scenario the app will be receiving lots of web requests. Requests to the same resource will have to read and update the cache for that resource. Not a ton of requests to update the same cache entry, but enough. I’ll need the flexibility to cache arbitrary maps of data for an entry.

dom · November 6, 2021, 3:34am

A GenServer handles one message at a time, so every operation that goes through it is serialized. ETS can be read and written concurrently by many processes.

Typically you’d use both: ETS for the fast path (reading keys), and a GenServer for things that need coordination. See, for instance, sasa1977/con_cache on GitHub: an ETS-based key/value cache with row-level isolated writes and TTL support.

jswny · November 6, 2021, 4:03am

So basically the question is whether my operations are atomic, then, right? If each request does something atomic, like adding to or subtracting from a single counter, then those operations can be done in any order and I could use ETS. However, if each request sets the counter to a specific value, I’d want to use a GenServer to ensure they are all ordered correctly?
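To make the counter case concrete, here is a minimal sketch, assuming a public table and the hypothetical names `:counter_demo` and `:hits`. It uses `:ets.update_counter/3`, which applies each increment atomically, so concurrent adds and subtracts can interleave in any order without losing updates.

```elixir
# Hypothetical table and key names, for illustration only.
table = :ets.new(:counter_demo, [:set, :public, write_concurrency: true])
:ets.insert(table, {:hits, 0})

# 100 concurrent processes each add 1, and 40 each subtract 1.
increments =
  for _ <- 1..100, do: Task.async(fn -> :ets.update_counter(table, :hits, 1) end)

decrements =
  for _ <- 1..40, do: Task.async(fn -> :ets.update_counter(table, :hits, -1) end)

Task.await_many(increments ++ decrements)

# The interleaving is unknown, but the final value does not depend on it.
[{:hits, final}] = :ets.lookup(table, :hits)
```

Since addition commutes, `final` ends up at 60 however the tasks are scheduled. Setting the counter to a specific value has no such property, which is where ordering starts to matter.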

I guess I’d have to use a registry if I wanted to store each resource’s cache value in a separate GenServer, so there wouldn’t be one single bottleneck for every single request, right? That way I could look up the right GenServer to call.

RudManusachi · November 6, 2021, 4:46am

Some process has to own the ETS table. Most of the time you would have a GenServer create a “protected” ETS table. By default, an ETS table is created with protected access, which allows any process to read but only the owner to write. (See https://elixir-lang.org/getting-started/mix-otp/ets.html#ets-as-a-cache)
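A minimal sketch of that arrangement, assuming a hypothetical module `CacheSketch` and table name `:cache_sketch` (neither comes from the linked guide): the GenServer owns a protected named table, reads go straight to ETS from the caller's process, and writes are funneled through the owner.

```elixir
defmodule CacheSketch do
  use GenServer

  # Hypothetical table name, for illustration only.
  @table :cache_sketch

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  # Reads run in the caller's process and never touch the GenServer.
  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # Writes go through the owner, because a :protected table only
  # accepts writes from the process that created it.
  def put(key, value), do: GenServer.call(__MODULE__, {:put, key, value})

  @impl true
  def init(_opts) do
    :ets.new(@table, [:set, :protected, :named_table, read_concurrency: true])
    {:ok, nil}
  end

  @impl true
  def handle_call({:put, key, value}, _from, state) do
    :ets.insert(@table, {key, value})
    {:reply, :ok, state}
  end
end

{:ok, _pid} = CacheSketch.start_link()
:ok = CacheSketch.put("user:1", %{name: "Ada"})
found = CacheSketch.get("user:1")
missing = CacheSketch.get("user:2")
```

With this layout reads scale with the number of callers, while all writes are still serialized through the one owning process.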

jswny wrote: “I guess I’d have to use a registry if I wanted to store each resource’s cache value in a separate GenServer so there wouldn’t be one single bottleneck for every single request right?”

I don’t think that’s the right way to do it, though it depends on the problem. Even if we store each resource’s cache in a separate GenServer, it’s still a single GenServer for that resource’s cache, so N requests for that resource would queue up in that GenServer’s mailbox…

dom · November 6, 2021, 3:08pm

jswny wrote: “However, if each request sets the counter to a specific value, I’d want to use GenServer to ensure they are all ordered correctly?”

You can do an atomic compare-and-swap with ETS: ets:select_replace/2.
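A rough sketch of how that could look from Elixir, assuming a public table and a hypothetical `cas` helper. The match spec only matches when the stored object still equals `{key, expected}`, and `:ets.select_replace/2` returns the number of objects it replaced.

```elixir
# Hypothetical table name, for illustration only.
table = :ets.new(:cas_demo, [:set, :public])
:ets.insert(table, {:counter, 5})

# Hypothetical helper: replace `expected` with `new` only if the stored
# value still equals `expected`. Returns true if the swap happened.
# Assumes key and expected are plain literals (not :_ or :"$1"-style atoms,
# which have special meaning in a match spec head).
cas = fn key, expected, new ->
  match_spec = [{{key, expected}, [], [{:const, {key, new}}]}]
  :ets.select_replace(table, match_spec) == 1
end

first = cas.(:counter, 5, 10)   # stored value is 5, so it becomes 10
second = cas.(:counter, 5, 99)  # stored value is now 10, so nothing changes

[{:counter, current}] = :ets.lookup(table, :counter)
```

A caller whose swap fails can re-read the entry and retry, or give up, depending on what the cache entry means.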

Note that Erlang only guarantees message ordering between pairs of processes. If processes A and B send a message to C, the messages can arrive in any order, even if one was sent “before” the other. Erlang only guarantees that if e.g. A sends two messages to C, they’ll arrive in the same order they were sent.

jswny · November 6, 2021, 3:28pm

Right, but wouldn’t having one GenServer per resource cache value be better than ETS with one process owning all writes? In the ETS scenario you’d have a bottleneck for writes to any cache value, while with one GenServer per resource you’d only have a bottleneck for writes to a single resource’s cache value, which is fine because you want those ordered anyway.

Or am I getting this wrong?

RudManusachi · November 6, 2021, 5:26pm

You are getting it right.
At first I thought you wanted to store the values in separate GenServer states, not in a separate ETS table for each resource.

kokolegorille · November 6, 2021, 5:59pm

You have an example of how to build a caching system with OTP in this book…

…if you don’t mind reading some Erlang code.

AFAIK they use one gen_server per key.
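For comparison, here is a sketch of the general one-process-per-key idea in Elixir. It is not the book's code: the module `EntrySketch` and the registry name are hypothetical, and a real setup would probably start the entries under a DynamicSupervisor rather than linking them to the caller.

```elixir
defmodule EntrySketch do
  use GenServer

  # Hypothetical registry name; the caller starts the Registry before use.
  @registry EntrySketch.Registry

  def start_link(key), do: GenServer.start_link(__MODULE__, %{}, name: via(key))

  def get(key), do: GenServer.call(via(key), :get)
  def put(key, value), do: GenServer.call(via(key), {:put, value})

  # Each cache key resolves to its own process through the Registry.
  defp via(key), do: {:via, Registry, {@registry, key}}

  @impl true
  def init(value), do: {:ok, value}

  @impl true
  def handle_call(:get, _from, value), do: {:reply, value, value}
  def handle_call({:put, new_value}, _from, _old), do: {:reply, :ok, new_value}
end

{:ok, _} = Registry.start_link(keys: :unique, name: EntrySketch.Registry)
{:ok, _} = EntrySketch.start_link("resource:1")
{:ok, _} = EntrySketch.start_link("resource:2")

:ok = EntrySketch.put("resource:1", %{views: 3})
entry_one = EntrySketch.get("resource:1")
entry_two = EntrySketch.get("resource:2")
```

Requests for different keys never wait on each other here, but reads and writes for the same key all queue in that key's mailbox.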