Optimization by storing database table in memory?

I’m hoping for some advice on optimizing a Phoenix/Elixir application with respect to a couple of different kinds of database tables. The regular CRUD stuff is simple: lookups based on an id that can be cached at the database layer or in Elixir (e.g. using Cachex) as needed. Those use cases are under control. There are two other use cases that are a bit trickier.

First use case: small, commonly used tables like “countries” or “currencies”. Our application has dozens of small(ish) tables with maybe 100 rows each whose values do not change often. Is it possible in Elixir to load these tables into memory or a cache when the application starts? We could save ourselves many database lookups and still maintain a modest memory footprint. Are there recommended approaches to this tactic? Has anyone seen a tutorial for this?

Second use case: large but critical tables that do not change often. This one is trickier, and I may not be able to describe the exact need it fills without ending up in the weeds. But for every request into the application, we need to do a couple of lookups against this table to “translate” certain values related to the route and do some “reverse lookups” of other values. Currently, in our legacy PHP application, these values are stored in Redis, and that works reasonably well. The one feature of Redis that makes this approach possible is MGET: with a single Redis call, we can retrieve multiple records. If this were happening in a SQL database, it would be analogous to a query like “SELECT value FROM table WHERE column IN (a, b, c, …)” where we specify ~50 values to look up.
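For concreteness, here is roughly what that batch lookup looks like on each side. The key names and the `translations` table below are illustrative, not our actual schema:

```elixir
# Redis side, assuming the Redix client ("conn" is a Redix connection
# started elsewhere): one round trip retrieves many keys at once.
{:ok, values} = Redix.command(conn, ["MGET", "trans:a", "trans:b", "trans:c"])

# The rough SQL equivalent, expressed as a schemaless Ecto query.
import Ecto.Query

keys = ["a", "b", "c"]

values =
  MyApp.Repo.all(
    from t in "translations", where: t.key in ^keys, select: t.value
  )
```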

This second use case is trickier because of the “mget” lookup: I’m not sure how we might cache that, because the result returned depends on a fairly long set of conditions.

To clarify the difficulties: we have not had success with any solution that attempted to “lazy load” results. The query simply takes a bit too long, and the results are unique enough that there isn’t much payoff afterwards, because they are rarely re-used. So in this case we have needed to preload the results into Redis, and that has worked reasonably well.

If anyone has some advice / links / thoughts on the matter, I’m thankful for any guidance!

As long as the data fits into RAM (which seems to be the case), ETS would fit your needs perfectly: load your mostly static table data into an ETS table on application startup and add a bit of logic to update the ETS table along with the persistent one.
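A minimal sketch of that approach, assuming a hypothetical `MyApp.Repo` and `MyApp.Country` schema. The GenServer owns the table; readers hit ETS directly without going through the process:

```elixir
defmodule MyApp.CountryCache do
  use GenServer

  @table :countries

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Reads bypass this process entirely and go straight to ETS.
  def get(code), do: :ets.lookup(@table, code)

  @impl true
  def init(_opts) do
    :ets.new(@table, [:set, :named_table, :protected, read_concurrency: true])

    # Load the whole (small) table into ETS at startup.
    for country <- MyApp.Repo.all(MyApp.Country) do
      :ets.insert(@table, {country.code, country})
    end

    {:ok, %{}}
  end
end
```

Put it in your supervision tree so the table is (re)built whenever the application starts.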

It’s very tough to know when to invalidate a cache. However, using Postgres notifications might help: put everything in ETS, then invalidate and reload the cache when Postgres sends you a notification.
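A hedged sketch of that wiring with Postgrex.Notifications. The channel name is an assumption, the Postgres side needs a trigger that calls `pg_notify('countries_changed', ...)`, and `MyApp.CountryCache.reload/0` stands in for whatever rebuild logic you have:

```elixir
defmodule MyApp.CacheInvalidator do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(_opts) do
    # Reuse the Repo's connection config for the LISTEN connection.
    {:ok, pid} = Postgrex.Notifications.start_link(MyApp.Repo.config())
    {:ok, _ref} = Postgrex.Notifications.listen(pid, "countries_changed")
    {:ok, %{}}
  end

  @impl true
  def handle_info({:notification, _pid, _ref, "countries_changed", _payload}, state) do
    # Drop and rebuild the ETS table (or update just the changed rows).
    MyApp.CountryCache.reload()
    {:noreply, state}
  end
end
```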

Thanks. ETS seems like it would be a good solution for the smaller tables. I’m not clear whether it would be suitable for the larger tables with complex lookup criteria.

It depends on how complex the criteria are. ETS doesn’t provide SQL-like queries, but it is much more powerful than a simple key lookup. Search for :ets.match/2 match patterns, or look through the original (Erlang) ETS docs here: http://erlang.org/doc/man/ets.html
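For example, a small taste of match patterns beyond plain key lookup (the table name and row shape are made up):

```elixir
t = :ets.new(:trans, [:set])
:ets.insert(t, [{"a", :route, 1}, {"b", :route, 2}, {"c", :other, 3}])

# Match on the middle field; $1/$2 capture the key and value.
# (Result order is not guaranteed for :set tables.)
:ets.match(t, {:"$1", :route, :"$2"})
#=> [["a", 1], ["b", 2]]

# Full match specs add guards and custom result shapes, SELECT-style:
:ets.select(t, [{{:"$1", :route, :"$2"}, [], [{{:"$1", :"$2"}}]}])
#=> [{"a", 1}, {"b", 2}]
```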

:ets.match/2 performs a linear search internally, though.

But yeah, overall ETS (or a wrapper like Cachex, my favorite so far) is exactly suited to being a cache; just listen for Postgres notifications to know when to invalidate and/or update values. With Cachex you can also limit the number of entries (to keep memory from blowing up), automatically reclaim unused values, and so on.
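A quick sketch of what that can look like with Cachex. The cache name, limit, and fallback are illustrative, and `expensive_db_lookup/1` is a stand-in for your real query; double-check the options against the Cachex docs for your version:

```elixir
# Cap the cache at ~500 entries; Cachex evicts entries past the limit.
{:ok, _pid} = Cachex.start_link(:lookup_cache, limit: 500)

# fetch/3 lazy-loads on a miss: a {:commit, value} result is cached.
Cachex.fetch(:lookup_cache, "some_key", fn key ->
  {:commit, expensive_db_lookup(key)}
end)
#=> {:ok, value} on a hit, {:commit, value} right after the fallback runs

# On a Postgres notification, drop one entry or clear everything:
Cachex.del(:lookup_cache, "some_key")
Cachex.clear(:lookup_cache)
```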

For the second case (and even the first) you might find FastGlobal an alternative approach to encapsulating data that changes infrequently and requires fast, efficient access.
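For reference, FastGlobal’s API is tiny. The trade-off is that writes are slow (the data is compiled into a module) while reads are extremely fast and copy-free, which suits exactly this read-heavy, rarely-written data. The key and data below are illustrative:

```elixir
# Expensive write: compiles the data into a module. Do this rarely,
# e.g. at startup or when a change notification arrives.
countries = MyApp.Repo.all(MyApp.Country)
FastGlobal.put(:countries, countries)

# Cheap read from any process, with an optional default:
FastGlobal.get(:countries)
FastGlobal.get(:missing_key, [])
```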

If the key is part of the match pattern, it doesn’t. In ordered_set tables, even a partially bound key (a key prefix) is enough to avoid a full table scan.
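A small illustration of that, assuming tuple keys in an :ordered_set table:

```elixir
t = :ets.new(:routes, [:ordered_set])
:ets.insert(t, {{"users", 1}, :a})
:ets.insert(t, {{"users", 2}, :b})
:ets.insert(t, {{"posts", 1}, :c})

# The "users" key prefix is bound, so ETS only traverses that key range
# instead of scanning the whole table:
:ets.match_object(t, {{"users", :_}, :_})
#=> [{{"users", 1}, :a}, {{"users", 2}, :b}]
```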
