"Best" solution to cache database queries?

I have a LiveView that lets the user filter a list of things as they type (each change runs a database query). I’m using phx-debounce to limit how often events are fired while still providing a nice, real-time search experience.
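
For reference, the relevant bits look roughly like this (simplified; the module name, the "search" event, and Catalog.search/1 are illustrative, not my actual code):

```elixir
defmodule MyAppWeb.ThingSearchLive do
  use Phoenix.LiveView

  def mount(_params, _session, socket) do
    {:ok, assign(socket, query: "", results: [])}
  end

  def render(assigns) do
    ~H"""
    <form phx-change="search">
      <input type="text" name="q" value={@query} phx-debounce="300" autocomplete="off" />
    </form>
    <ul>
      <%= for thing <- @results do %>
        <li><%= thing.name %></li>
      <% end %>
    </ul>
    """
  end

  def handle_event("search", %{"q" => q}, socket) do
    # This is the database query I'd like to cache.
    {:noreply, assign(socket, query: q, results: MyApp.Catalog.search(q))}
  end
end
```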

I was thinking that I could take it a little bit further and do some caching of the search queries as well. I can think of several options to do this off the top of my head, but I am interested in knowing if there’s a way that would be considered most pragmatic or common.

  • Use an Agent / GenServer to keep a mapping of queries to their results. Seems doable, but would I be reinventing the wheel? It would also probably be bad for concurrency, since a GenServer handles messages one at a time.
  • Create a module that wraps an ETS table (it looks like you can auto-expire elements, which seems handy?).
  • Use a library like ConCache. This looks like the best solution to me since it’s ETS-based and probably takes care of details I might otherwise miss – e.g. it provides a nice API (see the sketch below).
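
If I went the ConCache route, I’m picturing something roughly like this (the cache name, TTLs, and the search function are placeholders):

```elixir
# In the application's supervision tree.
children = [
  {ConCache,
   name: :search_cache,
   ttl_check_interval: :timer.seconds(5),
   global_ttl: :timer.minutes(1)}
]

# At the call site: run the query on a miss, reuse the cached result on a hit.
def search_cached(query) do
  ConCache.get_or_store(:search_cache, query, fn ->
    MyApp.Catalog.search(query)
  end)
end
```
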
1 Like

How about not caching at all? Have you done any profiling that indicates you actually need a cache?

I have used cache_tab with great success.

7 Likes

I’ve had good experiences with ConCache but you should definitely heed @evadne’s advice: Don’t implement caching before you know it’s absolutely necessary.

3 Likes

Another library option: Cachex. I used it for permission checks that go through complex authorisation policies and get hit multiple times in a request cycle (or when a LiveView page mounts). I haven’t really put it under load, but it seems well behaved so far and has quite a few niceties (e.g. cache expiry, max size, pre-warming, etc.).
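
A rough sketch of that pattern (the cache name and the policy check are made up for illustration):

```elixir
# Started under the supervision tree; the name is illustrative.
{:ok, _pid} = Cachex.start_link(:authz_cache)

def allowed?(user, action) do
  # Cachex.fetch/3 runs the fallback on a miss and stores the {:commit, ...} result.
  {_hit_or_commit, allowed} =
    Cachex.fetch(:authz_cache, {user.id, action}, fn _key ->
      {:commit, MyApp.Policy.allowed?(user, action)}
    end)

  allowed
end
```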

4 Likes

Yeah, I agree that I have no real need to cache, as my app has one user right now: me. I was mostly curious what the common approach is, assuming there were a need. Most people have suggested libraries that rely on ETS, so that seems like the most common way to handle it.

I’d say there is no one “best” caching solution. It very much depends on your specific use case. As you’ve seen, there are a lot of tools available, and they have different strengths and weaknesses.

Before anything else, I couldn’t agree more: wait to implement caching until you actually have a specific need.

I have a production system that leans very heavily on caching, but the approaches differ based on the need. Some cases are trivial bits of data that happen to be used a lot and are only looked up by a single well-known key. Those all use Cachex, which caches locally in ETS.

I’m using Nebulex where I’m caching data that’s heavier for my backend systems to generate. It’s still all looked up by known keys, but it may take a while to generate and consume significant resources while doing so. In this case I’m using a layered cache going from local (ETS) to distributed (Redis) and finally actually generating the data. This way node/instance failures or deploys don’t trigger a thundering herd.
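
For anyone curious, the Nebulex side is roughly along these lines (module names, options, and the config shape are illustrative and from memory, so check the docs for your Nebulex version):

```elixir
# The multilevel cache delegates to a local (ETS) level and a Redis-backed level.
defmodule MyApp.Cache do
  use Nebulex.Cache, otp_app: :my_app, adapter: Nebulex.Adapters.Multilevel
end

defmodule MyApp.Cache.Local do
  use Nebulex.Cache, otp_app: :my_app, adapter: Nebulex.Adapters.Local
end

defmodule MyApp.Cache.Redis do
  use Nebulex.Cache, otp_app: :my_app, adapter: NebulexRedisAdapter
end

# config/config.exs – levels are checked in order, local first (options illustrative):
#
#   config :my_app, MyApp.Cache,
#     levels: [
#       {MyApp.Cache.Local, gc_interval: :timer.hours(12)},
#       {MyApp.Cache.Redis, conn_opts: [host: "localhost", port: 6379]}
#     ]

# Read-through lookup: cache first, then actually generate the data.
# generate_report/1 stands in for the expensive backend call.
def report(key) do
  case MyApp.Cache.get(key) do
    nil ->
      value = generate_report(key)
      MyApp.Cache.put(key, value, ttl: :timer.hours(1))
      value

    value ->
      value
  end
end
```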

Finally, there’s some purpose-built caching that leverages ETS directly, where the cost to implement and maintain it is worth it compared to the cost of not caching the data. This is largely used in cases where the query and expiration patterns don’t lend themselves cleanly to a single key mapping to a single record.
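
To make that concrete, the purpose-built version is essentially a thin module over ETS along these lines (names and the tagging scheme are illustrative):

```elixir
defmodule MyApp.ResultCache do
  use GenServer

  @table :result_cache

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  # Reads hit ETS from the calling process, so they don't serialize through the GenServer.
  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, _tag, value}] -> {:ok, value}
      [] -> :miss
    end
  end

  def put(key, tag, value), do: GenServer.cast(__MODULE__, {:put, key, tag, value})

  # Drop every entry sharing a tag, e.g. all cached results touching one record.
  def invalidate(tag), do: GenServer.cast(__MODULE__, {:invalidate, tag})

  @impl true
  def init(nil) do
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    {:ok, nil}
  end

  @impl true
  def handle_cast({:put, key, tag, value}, state) do
    :ets.insert(@table, {key, tag, value})
    {:noreply, state}
  end

  def handle_cast({:invalidate, tag}, state) do
    :ets.match_delete(@table, {:_, tag, :_})
    {:noreply, state}
  end
end
```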

One additional concern that gets overlooked too often: think long and hard about cache expiration and the impact of stale data. Craft consistent APIs that push querying and cache management through a single module. You really don’t want to end up in a situation where a change in caching strategy requires a hunt through the codebase.
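
Concretely, that can be as simple as making the cache an implementation detail of one context module, so callers never touch it directly (module and function names are illustrative; ConCache is just an example backend):

```elixir
defmodule MyApp.Catalog do
  @cache :search_cache

  # The only public entry points; changing the caching strategy later
  # only touches this module.
  def search(query) do
    ConCache.get_or_store(@cache, {:search, query}, fn ->
      MyApp.Catalog.Queries.search(query) # the uncached Ecto query (illustrative)
    end)
  end

  def invalidate_search(query) do
    ConCache.delete(@cache, {:search, query})
  end
end
```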

6 Likes