ETS or GenServer for caching messages

askasp · September 1, 2021, 11:22am

Hi!
I have a real-time chat application that stores its messages in Postgres. I want to keep the latest 15 messages of each chat room in memory. Let's say we have 100k users with an average of 7 rooms each.

I’m unsure whether I should use:

one GenServer for each room
one ETS table for each room
one ETS table with a different key for each room

What should I keep in mind when making this decision? ETS allows concurrent reads, while a GenServer per room naturally requires a Registry and a DynamicSupervisor, which adds some overhead. On the other hand, with a GenServer I could use a circular buffer such as Cbuf.Queue (cbuf v0.7.1).

Any tips or insights would be greatly appreciated!

MrDoops · September 1, 2021, 8:50pm

Both! You typically want a GenServer that owns an ETS table: the GenServer handles writes and expirations, while reads go directly to ETS for that sweet read concurrency.
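As a rough illustration of that split, here is a minimal sketch that assumes a single named table keyed by room id (the third option in the question). The module name `RoomCache`, the table name `:room_cache`, and the function names are made up for the example; it is not the code from my project.

```elixir
defmodule RoomCache do
  # Hypothetical module: one process owns one named ETS table that holds
  # the latest messages of every room, keyed by room id.
  use GenServer

  @table :room_cache  # assumed table name
  @limit 15           # "latest 15 messages" from the question

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Writes are serialized through the owning process.
  def put(room_id, message) do
    GenServer.call(__MODULE__, {:put, room_id, message})
  end

  # Reads go straight to ETS and never touch the GenServer.
  # Messages are returned newest first.
  def latest(room_id) do
    case :ets.lookup(@table, room_id) do
      [{^room_id, messages}] -> messages
      [] -> []
    end
  end

  @impl true
  def init(_opts) do
    # :protected means only the owner writes, but any process may read.
    :ets.new(@table, [:set, :named_table, :protected, read_concurrency: true])
    {:ok, nil}
  end

  @impl true
  def handle_call({:put, room_id, message}, _from, state) do
    # Prepend the new message and drop everything beyond the limit.
    messages = Enum.take([message | latest(room_id)], @limit)
    :ets.insert(@table, {room_id, messages})
    {:reply, :ok, state}
  end
end
```

Callers would use `RoomCache.put(room_id, message)` when a message is persisted and `RoomCache.latest(room_id)` when a user opens a room. Because the table is `:protected`, a crash of the owner drops the cache, so it would need to be refilled from Postgres on restart.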

I implemented a similar feature, but the goal was a sliding-window cache with a TTL per message. Scrolling up the chat history would fetch more messages, which would be loaded into the cache but expire if not viewed for a while. I’d have finished the feature, but it got into hooks and JavaScript and I didn’t want to do that for a side project.

It was implemented by dynamically spawning (DynamicSupervisor + Registry + GenServer) a chat room cache process, a GenServer that owns an ordered_set ETS table holding the cached messages. To find the right ETS table name for a cache at runtime, a normal ETS table owned by another GenServer, this one started at application startup, stored the key-value pairs mapping each room to its cache name. I suspect the Registry could also be used for this. I used :ets.select_delete/2 and Process.send_after to find expired messages and remove them.
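The expiry part can be sketched roughly as follows. This is a simplified reconstruction rather than the original code: the module name `ExpiringRoomCache`, the `{id, expires_at, message}` row layout, and the `:table` and `:sweep_ms` options are all assumptions made for the example, and the supervisor and Registry wiring is left out.

```elixir
defmodule ExpiringRoomCache do
  # Hypothetical per-room cache: owns an ordered_set table and sweeps
  # expired rows on a timer. Rows are {id, expires_at_ms, message}.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Writes go through the owner; ttl_ms is the time to live of this message.
  def put(pid, id, message, ttl_ms) do
    GenServer.call(pid, {:put, id, message, ttl_ms})
  end

  # Reads go directly to the table, ordered by message id.
  def all(table) do
    table |> :ets.tab2list() |> Enum.map(fn {id, _exp, msg} -> {id, msg} end)
  end

  @impl true
  def init(opts) do
    table = Keyword.fetch!(opts, :table)
    sweep_ms = Keyword.get(opts, :sweep_ms, 1_000)
    :ets.new(table, [:ordered_set, :named_table, :protected, read_concurrency: true])
    schedule_sweep(sweep_ms)
    {:ok, %{table: table, sweep_ms: sweep_ms}}
  end

  @impl true
  def handle_call({:put, id, message, ttl_ms}, _from, state) do
    expires_at = System.monotonic_time(:millisecond) + ttl_ms
    :ets.insert(state.table, {id, expires_at, message})
    {:reply, :ok, state}
  end

  @impl true
  def handle_info(:sweep, state) do
    now = System.monotonic_time(:millisecond)
    # Match spec: delete every row whose expires_at (second element) is =< now.
    match_spec = [{{:_, :"$1", :_}, [{:"=<", :"$1", now}], [true]}]
    :ets.select_delete(state.table, match_spec)
    schedule_sweep(state.sweep_ms)
    {:noreply, state}
  end

  defp schedule_sweep(ms), do: Process.send_after(self(), :sweep, ms)
end
```

A read that should extend a message's lifetime ("expire if not viewed for a while") would additionally need to tell the owner to bump `expires_at`, which this sketch does not do.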

For caches you typically want concurrent reads (direct to ETS), but for writes you want a process, usually a GenServer, to manage things like scheduling expirations, listening to PubSub, or handling a callback that updates the cache state.

Qqwy · September 1, 2021, 9:25pm

I would recommend starting with a plain GenServer and only moving to ETS once you run into scalability issues (such as too many connections trying to read from the GenServer at the same time), because adding ETS makes the system significantly more complex.
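To show how little is needed for the plain version, here is a minimal sketch of one GenServer per room that keeps the newest messages in its state. The module name `RoomBuffer` and its functions are invented for the example, and it uses a plain list instead of a circular buffer library.

```elixir
defmodule RoomBuffer do
  # Hypothetical per-room process: the state is simply a list of the
  # latest messages, newest first, capped at @limit entries.
  use GenServer

  @limit 15

  def start_link(room_id), do: GenServer.start_link(__MODULE__, room_id)

  # Fire-and-forget write.
  def push(pid, message), do: GenServer.cast(pid, {:push, message})

  # Synchronous read; every reader goes through the room's process.
  def latest(pid), do: GenServer.call(pid, :latest)

  @impl true
  def init(_room_id), do: {:ok, []}

  @impl true
  def handle_cast({:push, message}, messages) do
    {:noreply, Enum.take([message | messages], @limit)}
  end

  @impl true
  def handle_call(:latest, _from, messages) do
    {:reply, messages, messages}
  end
end
```

In a real application each room process would be started under a DynamicSupervisor and looked up through a Registry; that wiring is omitted here. All reads are serialized through the room's process, which is exactly the bottleneck to watch for before reaching for ETS.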