Hi all! I’m going to add some caching to an app of mine.
My question is: what are the advantages of ETS versus a GenServer (or an Agent, for that matter; the semantics are about the same) for caching?
In my scenario the app will be receiving lots of web requests. Requests to the same resource will have to read and update the cache for that resource. Not a ton of requests to update the same cache entry, but enough. I’ll need the flexibility to cache arbitrary maps of data for an entry.
So basically the question is whether my operations are atomic, right? If each request does something atomic and order-independent, like adding to or subtracting from a single counter, then the updates can be applied in any order and I could use ETS. However, if each request sets the counter to a specific value, I’d want to use a GenServer to ensure the writes are all ordered correctly?
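To make the counter case concrete, here is a minimal sketch of order-independent increments done directly on ETS with `:ets.update_counter/3`. The table name `:counter_sketch` and the key `:hits` are made up for illustration, and the table is `:public` only so that the spawned tasks can write to it.

```elixir
# Hypothetical table and key names, for illustration only.
table = :ets.new(:counter_sketch, [:set, :public, write_concurrency: true])
:ets.insert(table, {:hits, 0})

# 100 concurrent processes each add 1. update_counter/3 is atomic,
# so no increment is lost, whatever order the tasks run in.
1..100
|> Enum.map(fn _ -> Task.async(fn -> :ets.update_counter(table, :hits, 1) end) end)
|> Enum.each(&Task.await/1)

[{:hits, total}] = :ets.lookup(table, :hits)
IO.inspect(total, label: "hits")
# prints "hits: 100"
```

A plain `:ets.insert(table, {:hits, 42})` is also atomic per object, but with concurrent writers the last write wins, which is why "set to a specific value" is where ordering starts to matter.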
I guess I’d have to use a registry if I wanted to store each resource’s cache value in a separate GenServer, so there wouldn’t be one single bottleneck for every request, right? That way I could look up the right GenServer to call.
Some process has to own the ETS table. Most of the time you would have a GenServer that creates a “protected” ETS table. By default, an ETS table is created with protected access, which allows any process to read but only the owner to write (see https://elixir-lang.org/getting-started/mix-otp/ets.html#ets-as-a-cache).
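As a rough sketch of that pattern (not taken from the linked guide): a GenServer owns a protected, named table, readers hit ETS directly, and writers go through the owner. The module name `CacheSketch`, the table name, and the `get/1` and `put/2` functions are assumptions for illustration.

```elixir
defmodule CacheSketch do
  use GenServer

  # Hypothetical table name.
  @table :cache_sketch

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Reads run in the calling process and never touch the GenServer mailbox.
  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # Writes are serialized through the owning process.
  def put(key, value) when is_map(value) do
    GenServer.call(__MODULE__, {:put, key, value})
  end

  @impl true
  def init(_opts) do
    # :protected is the default; it is spelled out here for clarity.
    table = :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    {:ok, table}
  end

  @impl true
  def handle_call({:put, key, value}, _from, table) do
    true = :ets.insert(table, {key, value})
    {:reply, :ok, table}
  end
end

{:ok, _pid} = CacheSketch.start_link()
:ok = CacheSketch.put("resource:1", %{views: 1})
{:ok, %{views: 1}} = CacheSketch.get("resource:1")
:error = CacheSketch.get("missing")
```

With this layout only writes queue up behind the owner; reads scale with the number of callers.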
I guess I’d have to use a registry if I wanted to store each resource’s cache value in a separate GenServer so there wouldn’t be one single bottleneck for every single request right?
I don’t think that’s the right way to do it, though it depends on the problem. Even if we store each resource’s cache in a separate GenServer, it’s still a single GenServer per resource, so N requests for that resource would still queue up in that GenServer’s mailbox…
Note that Erlang only guarantees message ordering between pairs of processes. If processes A and B send a message to C, the messages can arrive in any order, even if one was sent “before” the other. Erlang only guarantees that if e.g. A sends two messages to C, they’ll arrive in the same order they were sent.
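A small sketch of what that guarantee looks like in practice: two senders (labelled `:a` and `:b` here, purely for illustration) each send 1..5 to the same receiver. How the two streams interleave is not defined, but each sender's own messages arrive in the order they were sent.

```elixir
parent = self()

# Two independent senders, each sending 1..5 to the same process.
for sender <- [:a, :b] do
  spawn(fn -> Enum.each(1..5, fn n -> send(parent, {sender, n}) end) end)
end

# Collect all ten messages in arrival order.
received =
  for _ <- 1..10 do
    receive do
      {sender, n} -> {sender, n}
    after
      1_000 -> raise "timed out waiting for messages"
    end
  end

# The interleaving of :a and :b may differ from run to run,
# but each sender's sequence is preserved.
from_a = for {:a, n} <- received, do: n
from_b = for {:b, n} <- received, do: n

IO.inspect(from_a, label: "from a")
IO.inspect(from_b, label: "from b")
```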
Right, but wouldn’t having one GenServer per resource cache value be better than ETS with one process owning all writes? In the ETS scenario you’d have a bottleneck for writes to any cache value, while with one GenServer per resource you’d have a bottleneck only for writes to a single resource’s cache value, which is fine because you want those ordered anyway.
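For comparison, a minimal sketch of the one-GenServer-per-resource idea, using `Registry` with `:via` tuples to find the right process. The module name `ResourceCacheSketch`, the registry name, and the `merge/2` function are hypothetical; a real app would probably start these processes under a `DynamicSupervisor` rather than by hand.

```elixir
defmodule ResourceCacheSketch do
  use GenServer

  # Hypothetical registry name; it is started separately below.
  @registry ResourceCacheSketch.Registry

  def start_link(resource_id) do
    GenServer.start_link(__MODULE__, %{}, name: via(resource_id))
  end

  def get(resource_id), do: GenServer.call(via(resource_id), :get)

  # Updates to one resource are serialized by that resource's process only.
  def merge(resource_id, changes) when is_map(changes) do
    GenServer.call(via(resource_id), {:merge, changes})
  end

  defp via(resource_id), do: {:via, Registry, {@registry, resource_id}}

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state, state}

  def handle_call({:merge, changes}, _from, state) do
    new_state = Map.merge(state, changes)
    {:reply, new_state, new_state}
  end
end

{:ok, _} = Registry.start_link(keys: :unique, name: ResourceCacheSketch.Registry)
{:ok, _} = ResourceCacheSketch.start_link("resource:1")
{:ok, _} = ResourceCacheSketch.start_link("resource:2")

ResourceCacheSketch.merge("resource:1", %{views: 1})
ResourceCacheSketch.merge("resource:2", %{views: 7})
IO.inspect(ResourceCacheSketch.get("resource:1"))
# prints %{views: 1}
```

Note that reads also go through each resource's mailbox here, unlike the ETS approach where reads bypass the owning process.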