Is there an elegant way of using Cachex and Ash Framework?
I have a read query whose results I need to cache because of the many relationships loaded with it. I wonder if there’s a better way of caching read results in Ash Framework.
Here’s how I am doing it from the domain module.
@doc """
Get the cached person if it exists, otherwise fetch from the DB and cache it for future fast reads.
"""
def get_cached_person_first(person_id, context) do
  case Cachex.get(:person, person_id) do
    {:ok, %{} = person} ->
      person

    {:ok, nil} ->
      Hr.People.get_loaded_person!(person_id, context)
      |> cache_person()
  end
end
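For completeness, `cache_person/1` isn’t shown in the post. A minimal sketch, assuming the same `:person` cache, a struct with an `id` field, and an illustrative TTL, might be:

```elixir
# Hypothetical helper, not from the original post: writes the loaded person
# into the :person cache under its id and returns the person unchanged.
defp cache_person(person) do
  # The TTL is an illustrative choice; tune it (or drop it) for your workload.
  Cachex.put(:person, person.id, person, ttl: :timer.minutes(5))
  person
end
```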
There is no Cachex extension or anything like that at the moment.
A more Ash-centric approach would be to check the cache in a preparation on the read action; on a hit, you can set the result on the query and Ash won’t do the roundtrip to the database.
https://hexdocs.pm/ash/preparations.html
https://hexdocs.pm/ash/Ash.Query.html#set_result/2
You can add an after_action hook to the read action to populate the cache.
I would create a specific read action for this.
You can also use a global change that runs on updates and destroys to invalidate the cache.
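Putting those pieces together, a rough sketch might look like the following. The module names, the `:my_cache` cache, and the key function are all made up for illustration, and `set_result/2` is used the same way as later in this thread:

```elixir
# Dedicated read action that consults the cache (names are illustrative).
read :cached_read do
  prepare MyApp.Preparations.CheckCache
end

defmodule MyApp.Preparations.CheckCache do
  use Ash.Resource.Preparation

  @impl true
  def prepare(query, _opts, _context) do
    key = cache_key(query)

    case Cachex.get(:my_cache, key) do
      {:ok, nil} ->
        # Cache miss: run the query, then populate the cache from the results.
        Ash.Query.after_action(query, fn _query, records ->
          Cachex.put(:my_cache, key, records)
          {:ok, records}
        end)

      {:ok, records} ->
        # Cache hit: set the result so Ash skips the database roundtrip.
        Ash.Query.set_result(query, records)

      _ ->
        query
    end
  end

  # Naive key; a real one should also include arguments and the actor.
  defp cache_key(query), do: inspect(query.filter)
end

# Global change to invalidate on updates and destroys (also a sketch).
defmodule MyApp.Changes.InvalidateCache do
  use Ash.Resource.Change

  @impl true
  def change(changeset, _opts, _context) do
    Ash.Changeset.after_action(changeset, fn _changeset, record ->
      Cachex.del(:my_cache, cache_key_for(record))
      {:ok, record}
    end)
  end

  # How you derive the key from a record depends on your key scheme.
  defp cache_key_for(record), do: record.id
end
```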
Caching involves a lot of complexity, so make sure you really need it.
Every extra layer adds complexity, and caching is one of those layers that tends to become complex. Most databases can be tweaked to cache more (they already do, but the default settings are generally quite low), and if tweaking is enough, the complexity is handled entirely by the database. Let the database do the heavy lifting for you (also research ‘materialized views’).
That being said: let’s continue with the original question below.
I am attaching some code for reference, pieced together from different resources, to sketch a possible answer.
The read action:

read :read do
  primary? true
  pagination offset?: true, default_limit: 10

  argument :check_cache, :boolean do
    default true
  end

  prepare Preparations.CheckCache
end
defmodule Preparations.CheckCache do
  @moduledoc """
  Checks a simple agent cache for libraries
  """
  use Ash.Resource.Preparation

  @impl true
  def prepare(query, _opts, _context) when query.arguments.check_cache == true do
    CacheHelper.attach_cache(query)
  end

  def prepare(query, _opts, _context), do: query
end
def attach_cache(%Ash.Query{} = query) do
  query
  |> before_action()
  |> Ash.Query.after_action(fn query, records -> after_action(query, records) end)
end

def before_action(%Ash.Query{} = query) do
  cache_name = ReserveCache.name()
  # TODO: take arguments into the key as well
  key = query_filter_to_key(query)

  case Cachex.exists?(cache_name, key) do
    {:ok, true} ->
      {:ok, results} = Cachex.get(cache_name, key)
      Ash.Query.set_result(query, results)

    _ ->
      query
  end
end
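The snippet stops before `after_action/2`. A minimal version that populates the cache, assuming the same `ReserveCache` and `query_filter_to_key/1` helpers from above, could be:

```elixir
def after_action(%Ash.Query{} = query, records) do
  cache_name = ReserveCache.name()
  key = query_filter_to_key(query)

  # Only write on a miss, so a cache hit isn't re-written on every read.
  case Cachex.exists?(cache_name, key) do
    {:ok, false} -> Cachex.put(cache_name, key, records)
    _ -> :ok
  end

  {:ok, records}
end
```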
This is the approach I’d take, but I want to put in a pretty strong warning here that you need to be really sure of any authorisation implications. Using Ash.Query.set_result/2 is going to essentially bypass any policies and field policies of the resource in question, and those of any loaded values. The other issue is that it’s possible to accidentally leak one user’s data to another user. Including the actor in the cache key will help mitigate the latter problem, but I don’t think there’s anything that can be done about the former.
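Scoping the key to the actor, as suggested, could look like this. It’s a sketch: it assumes the `query_filter_to_key/1` helper from the snippet above, an actor with an `id`, and that the actor is available in the query’s private context:

```elixir
# Mitigates cross-user leakage, but not the policy bypass described above.
defp cache_key(%Ash.Query{} = query) do
  actor_id =
    case query.context[:private][:actor] do
      %{id: id} -> id
      _ -> :anonymous
    end

  {actor_id, query_filter_to_key(query)}
end
```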
@jimsynz Could you suggest a better approach to caching?
Well I guess the first question is to ask whether you have a measured performance issue or are just assuming that there will be one?
Postgres is pretty damn good at planning and executing queries efficiently, so I’d probably start by analysing the queries that Ash is generating to make sure that the appropriate indexes are in place and being utilised. Second, if Ash is generating a pathological query, I’d open an issue on the ash_postgres repo to see if it can be made more efficient. Only after I had exhausted the “just use the database” options would I seriously consider the complexity of caching or denormalising the data.
If the data is very fast moving then I might start considering my options re keeping some of it in RAM; caching is one method, the other is an ETS backed resource that works like a read through cache or is populated by changes to the underlying resource.
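Ash ships a built-in ETS data layer, so the read-through idea could be sketched as a separate in-memory resource. The module names and attributes here are illustrative, not from the original thread:

```elixir
# Illustrative resource kept in RAM via Ash's built-in ETS data layer.
defmodule MyApp.Hr.CachedPerson do
  use Ash.Resource,
    domain: MyApp.Hr,
    data_layer: Ash.DataLayer.Ets

  attributes do
    uuid_primary_key :id
    # Denormalised snapshot of the Postgres-backed person, populated by
    # after_action hooks or a background job on the source resource.
    attribute :snapshot, :map, public?: true
  end

  actions do
    defaults [:read, :destroy, create: :*, update: :*]
  end
end
```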
Otherwise, given that storage is usually cheaper and more abundant than memory I would consider denormalising the data into a JSON attribute somewhere and using that for reads. Probably populated by after action hooks or background jobs.
2 Likes