This is a general water-cooler type of question for all the smart folks in this forum – thanks for any feedback, and apologies if this is too wordy.
The problem: requests to
/api/expensive respond too slowly.
The solution: caching. (Unless there’s some miracle alternative?)
One common approach implements a callback: if a cached result exists, return it; if not, run the callback, i.e. perform the expensive operation, cache the result, and then return it. That’s fairly clean to implement, but I’ve seen that solution fail when lots of requests hit the callback before the first result has been cached. I’ve heard this condition referred to as a “cache slam” (also called a cache stampede), and in PHP (and presumably with other languages too?), a semaphore was required to lock and serialize requests in order to avoid it. Frequently, the database was the unavoidable “expensive” thing in this scenario.
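To make that concrete, here’s a rough sketch of the lock-protected read-through pattern (in Python for brevity, since I don’t know enough Elixir yet). All the names, the TTL value, and the fake `expensive_query` are just illustrative, not a real API:

```python
import threading
import time

cache = {}          # key -> (value, stored_at)
locks = {}          # one lock per key, so unrelated keys don't block each other
locks_guard = threading.Lock()
TTL = 60            # seconds; illustrative value

def expensive_query():
    time.sleep(0.1)                 # stand-in for the slow database work
    return "result"

def get_cached(key, compute):
    entry = cache.get(key)
    if entry and time.time() - entry[1] < TTL:
        return entry[0]             # fast path: fresh cached value
    with locks_guard:
        lock = locks.setdefault(key, threading.Lock())
    with lock:                      # serialize the cache miss per key
        entry = cache.get(key)      # re-check: another request may have
        if entry and time.time() - entry[1] < TTL:  # filled the cache while
            return entry[0]                         # we waited on the lock
        value = compute()
        cache[key] = (value, time.time())
        return value
```

The double-check inside the lock is what prevents the slam: every request that queued up behind the first one sees the freshly cached value instead of recomputing it.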
One of the drawbacks of the above is that the code isn’t as clean… you can’t tell from looking at the route whether the results accurately reflect what’s in the data model. You don’t know if you’re looking at cached data or results straight from the database.
So, the next tweak is often to add an optional “refresh” parameter that triggers a fresh lookup. That works, but you sacrifice even more clarity in your code.
In a super-clean/transparent API world of resources, I thought it might be a cleaner implementation if you implemented a cache service. In other words,
/api/expensive would ALWAYS perform that expensive operation (just as its name suggests), and a cache service could expose a route like
/api/cache/expensive that would store the result of that operation. You could make POST/PUT requests to the cache service to add or update its contents, and there would never be any guessing – the cache endpoints would contain cached data, just as their names suggest.
Behind the scenes, we’d have to implement some message queue or callbacks to ensure that any change to the data behind one of the expensive endpoints causes the new result to be stored in the cache service. Thoughts?
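Here’s the shape I have in mind, as a toy in-process sketch (Python again; the queue, worker, and function names are all invented stand-ins for a real message queue and the two services):

```python
import queue
import threading

cache_store = {}                 # what /api/cache/expensive would serve
events = queue.Queue()           # stand-in for the message queue

def expensive_operation():
    return "computed result"     # what /api/expensive always computes

def writer():
    # Background worker: on every data-change event, recompute the
    # expensive result and write it through to the cache service.
    while True:
        event = events.get()
        if event is None:        # shutdown sentinel
            break
        cache_store["expensive"] = expensive_operation()
        events.task_done()

def on_data_changed():
    # Publishers announce changes instead of invalidating the cache.
    events.put("data_changed")

def get_cached_expensive():
    # GET /api/cache/expensive just reads whatever was last stored.
    return cache_store.get("expensive")
```

The nice property is that reads of the cache route never trigger computation at all; the expensive work only happens in response to change events.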
I’ve seen some discussion like this:
That’s really over my head, but is that a recommended solution to this problem?
I’m new to Elixir/Erlang/Phoenix, so I admit that my notions of caching are probably out of whack with what’s idiomatic here. I’d love some guidance and thoughts on how others have dealt with this problem. Many thanks!