Ash aggregates with sqlite

I’m in the process of migrating an application over to using ash. I do need to support both postgres and sqlite, as the app can be either deployed on a server or shipped as an android application. (Side note: If I could somehow use postgres in an embedded way like sqlite then I wouldn’t be in this mess. Maybe someone knows how we could run PGlite embedded in Elixir?)

Now I’ve run into a big obstacle: Ash aggregates are not supported in ash_sqlite.
This wouldn’t be such a big issue if I could keep using the old ecto code that does the aggregation; however, those queries also include a lot of preloads, and preloads do not work with ash resources.

So now I have a working ash implementation, but it only works with postgres.
And I have a working ecto implementation, but that only works with ecto modules, not ash resources.

How do I get out of this? I do not want to keep around both ecto models and ash resources.
Can I somehow implement those aggregates manually, such that they work with sqlite?
Can I somehow convert the ash resources into ecto models that support preloads?
Can I contribute to ash_sqlite to implement aggregates? Is that even feasible?

I greatly appreciate any help here, thanks :folded_hands:

What kind of aggregates are you using?

What kinds are there?
It’s not actually an aggregate, but a calculation that does the aggregate. But the error is quite clear that aggregates aren’t supported with ash_sqlite (it works with ash_postgres):

** (exit) an exception was raised:
    ** (Ash.Error.Invalid)
Bread Crumbs:
  > Building expression for calculation: time_played_during([earliest_date: ~N[2025-12-10 00:00:00], latest_date: nil])
  > Error returned from: DB.Music.Track.read_with_meta

Invalid Error

* Data layer for DB.Music.Track does not support using aggregates
  (ash 3.11.3) lib/ash/error/query/aggregates_not_supported.ex:8: Ash.Error.Query.AggregatesNotSupported.exception/1

The calculation is a bit involved; I have this on tracks and web_tracks:

calculate :time_played_during,
          :float,
          expr(
            duration *
              sum(play_stats,
                field: :completion_ratio,
                query: [
                  filter:
                    expr(
                      (is_nil(^arg(:earliest_date)) or inserted_at >= ^arg(:earliest_date)) and
                        (is_nil(^arg(:latest_date)) or inserted_at <= ^arg(:latest_date))
                    )
                ]
              )
          ) do
  argument :earliest_date, :naive_datetime
  argument :latest_date, :naive_datetime
end

And on another resource that uses this calculation on its relationships:

calculate :time_played_during,
          :float,
          expr(
            (sum(tracks,
                field: :time_played_during,
                arguments: %{
                  earliest_date: ^arg(:earliest_date),
                  latest_date: ^arg(:latest_date)
                }
              ) || 0) +
              (sum(web_tracks,
                  field: :time_played_during,
                  arguments: %{
                    earliest_date: ^arg(:earliest_date),
                    latest_date: ^arg(:latest_date)
                  }
                ) || 0)
          ) do
  argument :earliest_date, :naive_datetime
  argument :latest_date, :naive_datetime
end

(Aside: I’m still a beginner ash user, so if there is a better way to write that, please let me know.)

Got it, yeah so it is a bit unfortunate. The primary issue is that aggregates use lateral joins in postgres and I haven’t had a chance to figure out a good way to support them in ash_sqlite yet. However, you can do something like:

calculate :time_played_during, :float, expr(fragment("""
  (SELECT * FROM ...)
"""))

You lose a lot of the goodness, unfortunately. You can also write a module-based calculation that implements calculate/3 and does the work in memory when it’s ash_sqlite, for example:

defmodule MyCalculation do
  use Ash.Resource.Calculation

  if <using_sqlite> do
    # declare the data the in-memory computation needs loaded
    def load(_, _, _), do: [:dependency, of: [:calc]]

    # compute the value per record in Elixir
    def calculate(records, _, _) do
    end
  else
    # push the work into the database when the data layer supports it
    def expression(_, _) do
      ....
    end
  end
end

Hmm, ok, the fragment approach doesn’t seem too bad. I used a bit of AI help to translate it, seems to work:

    calculate :time_played_during,
              :float,
              expr(
                duration *
                  fragment(
                    """
                    (SELECT SUM(completion_ratio)
                     FROM play_stats
                     WHERE play_stats.track_id = ?
                       AND (? IS NULL OR play_stats.inserted_at >= ?)
                       AND (? IS NULL OR play_stats.inserted_at <= ?))
                    """,
                    id,
                    ^arg(:earliest_date),
                    ^arg(:earliest_date),
                    ^arg(:latest_date),
                    ^arg(:latest_date)
                  )
              ) do
      argument :earliest_date, :naive_datetime
      argument :latest_date, :naive_datetime
    end

Would be nice if I could pass in an expr() as an argument to a fragment to simplify that last condition to: fragment(".. AND ?", id, expr(...)). However that does not seem to work.

Also, I’m wondering: if this works in sqlite by using a subquery instead of a lateral join, could that be the way to implement aggregate support in ash_sqlite?
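For anyone wanting to sanity-check the correlated-subquery pattern outside of Ash: here is a quick standalone run of essentially the same SQL against an in-memory SQLite database, using Python’s stdlib sqlite3 just because it’s easy to execute. The schema and values are invented to mirror the tracks/play_stats tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tracks (id INTEGER PRIMARY KEY, duration REAL);
CREATE TABLE play_stats (
  track_id INTEGER, completion_ratio REAL, inserted_at TEXT
);
INSERT INTO tracks VALUES (1, 200.0);
INSERT INTO play_stats VALUES
  (1, 1.0,  '2025-12-11 10:00:00'),
  (1, 0.5,  '2025-12-12 10:00:00'),
  (1, 0.25, '2025-12-01 10:00:00');  -- before the cutoff, should be filtered out
""")

earliest, latest = '2025-12-10 00:00:00', None

# Same shape as the fragment: a correlated subquery per track,
# with NULL-able date bounds passed positionally.
row = conn.execute("""
SELECT duration * (
  SELECT SUM(completion_ratio)
  FROM play_stats
  WHERE play_stats.track_id = tracks.id
    AND (? IS NULL OR play_stats.inserted_at >= ?)
    AND (? IS NULL OR play_stats.inserted_at <= ?)
)
FROM tracks
WHERE id = 1
""", (earliest, earliest, latest, latest)).fetchone()

print(row[0])  # 200.0 * (1.0 + 0.5) = 300.0
```

So SQLite handles the correlated subquery (and the `? IS NULL OR …` trick for the open-ended bound) without needing LATERAL.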

Using the fragment approach, would there be any way to reuse that calculation on the related resource?
Probably not, right?

Because the calculation that aggregates over those fragment-based calculations now becomes very ugly:

      calculate :time_played_during,
                :float,
                expr(
                  fragment(
                    """
                    COALESCE((
                      SELECT SUM(t.duration * (
                        SELECT SUM(ps.completion_ratio)
                        FROM play_stats ps
                        WHERE ps.track_id = t.id
                          AND (? IS NULL OR ps.inserted_at >= ?)
                          AND (? IS NULL OR ps.inserted_at <= ?)
                      ))
                      FROM tracks t
                      WHERE t.album_id = ?
                    ), 0) +
                    COALESCE((
                      SELECT SUM(wt.duration * (
                        SELECT SUM(ps.completion_ratio)
                        FROM play_stats ps
                        WHERE ps.web_track_id = wt.id
                          AND (? IS NULL OR ps.inserted_at >= ?)
                          AND (? IS NULL OR ps.inserted_at <= ?)
                      ))
                      FROM web_tracks wt
                      WHERE wt.album_id = ?
                    ), 0)
                    """,
                    ^arg(:earliest_date),
                    ^arg(:earliest_date),
                    ^arg(:latest_date),
                    ^arg(:latest_date),
                    id,
                    ^arg(:earliest_date),
                    ^arg(:earliest_date),
                    ^arg(:latest_date),
                    ^arg(:latest_date),
                    id
                  )
                ) do
        argument :earliest_date, :naive_datetime
        argument :latest_date, :naive_datetime
      end

I’d really love to solve this at the ash_sqlite level, maybe you can give me some pointers on how I could approach contributing this feature?

Hmm…unfortunately no, probably not. I do think the big win would be figuring out how to model at least simple aggregates in ash_sqlite. I also do recall hearing something about sqlite supporting lateral joins at some point?


I took some time to dig into the source code of ash_sqlite and ash_sql, which would have to change significantly.
Unfortunately this is definitely over my head.

There are many lateral joins in ash_sql, and replacing them with correlated subqueries seems really difficult.
For one, ecto doesn’t support correlated subqueries, so fragments would have to be used for them.
Plus, finding the correct join clause for each lateral join will probably differ per usage, requiring a proper understanding of each one.

It really seems like lateral joins are just the right fit for this kind of problem and sqlite not supporting them really makes it painfully difficult.

I also do recall hearing something about sqlite supporting lateral joins at some point?

There is a branch that implements lateral, but Richard is not convinced it is needed.

Maybe you could be the one to convince him?

If you think having LATERAL is important, then try to convince me otherwise. Your best approach would be to demonstrate a real-world problem that LATERAL solves that would be difficult to resolve without LATERAL.
– Richard Hipp

I mean, ash definitely solves real-world problems, so you might actually be able to sway him?

I also do recall hearing something about sqlite supporting lateral joins at some point?

I just saw that turso might be getting lateral joins soon: Implement LATERAL joins by LeMikaelF · Pull Request #4462 · tursodatabase/turso · GitHub

Maybe in the future sqlite will be replaced by turso, as it might just become a better sqlite. “Extended ALTER support” and “improved write throughput using MVCC” sounds awesome to me.