Failing to get aggregation to work with `parent()` and 3-layer nesting

I want to calculate the number of unread Posts for each User in a list of User IDs. Read (verb, past tense) Posts are tracked with PostView, by recording the ID of the last read Post for a given User ID.

Here’s the SQL of what I need:

select
  u.id user_id,
  case
    when pv.user_id is not null then pv.count
    else p.count
  end unread_posts_count
from users u
left join (
  select pv.user_id, count(p2.id) "count"
  from post_views pv
  left join posts p1 on p1.id = pv.last_post_id
  left join posts p2 on p2.inserted_at > p1.inserted_at and p2.archived_at is null
  group by pv.member_id
) tp on tp.user_id = u.id
left join (
  select count(p.id), t.id tenant_id
  from tenants t
  left join posts p on p.tenant_id = t.id
  group by t.id
) p on p.tenant_id = u.tenant_id
where u.id in $1

It seems Ash wants you to use the aggregate DSL combined with relationships to do this. I modified test/actions/has_many_test.exs from the Ash code repo to approximate what my app is doing. Also, I added an attribute to the respective resources to represent a tenant ID, because my app is multitenant.

  defmodule Post do
  ...
      attribute :tenant_id, :string, public?: true
  ...
  end

  defmodule PostView do
    @moduledoc false
    use Ash.Resource,
      domain: Ash.Test.Actions.HasManyTest.Domain,
      data_layer: Ash.DataLayer.Ets

    ets do
      private?(true)
    end

    actions do
      default_accept :*
      defaults [:read, :destroy, create: :*, update: :*]
    end

    attributes do
      uuid_primary_key :id

      attribute :last_post_id, :uuid, public?: true
      attribute :user_id, :uuid, public?: true
    end

    relationships do
      belongs_to :user, Ash.Test.Actions.HasManyTest.User do
        source_attribute :user_id
        destination_attribute :id
        public?(true)
      end

      belongs_to :last_post, Post do
        source_attribute :last_post_id
        destination_attribute :id
        public?(true)
      end
    end
  end

  defmodule User do
    @moduledoc false
    use Ash.Resource,
      domain: Ash.Test.Actions.HasManyTest.Domain,
      data_layer: Ash.DataLayer.Ets

    ets do
      private?(true)
    end

    actions do
      default_accept :*
      defaults [:read, :destroy, create: :*, update: :*]
    end

    attributes do
      uuid_primary_key :id

      attribute :name, :string, public?: true
      attribute :tenant_id, :string, public?: true
    end

    relationships do
      has_one :post_view, PostView do
        source_attribute :id
        destination_attribute :user_id
        public?(true)
      end

      has_many :unread_posts, Post do
        source_attribute :tenant_id
        destination_attribute :tenant_id
        public?(true)

        # This is eventually what I want to do, but it always produces no matches
        # filter expr(inserted_at > parent(post_view.last_post.inserted_at))

        # After some experimentation, I found that this filters as expected
        filter expr(id != parent(post_view.last_post_id))
        # But this never matches, even though it seems equivalent
        filter expr(id != parent(post_view.last_post.id))
      end
    end
  end

So it seems that when the parent(...) expression reaches more than two layers deep, nothing matches, but if it reaches only two layers, it does. I also tried parent(...).id, but that gave an error.

Hm…this does seem like a bug. Can you reproduce it in a repo I can test against? I’ll look tomorrow morning.

Something to try would be setting authorize? false on the aggregate, as it by default filters records according to authorization policies. Can users view post_views?

Done

Well, in the example reproduction, I haven’t added the aggregate definition yet. The expression in the relationship needs to work correctly first.

Perfect thank you. I will investigate tomorrow morning and get back to you :bowing_man:

1 Like

Alright, so, the issue in the test reproduction you provided is a bit insidious and I’m not sure if it relates to the actual issue you are seeing because it seems isolated to the way that the ETS data layer computes aggregates.

the id type of Post is :ci_string in that test.

      uuid_primary_key :id, type: :ci_string

But the attribute type of the relationship was :uuid.

      attribute :last_post_id, :uuid, public?: true

Changing that to :ci_string resolves the issue.

The reason this causes a bug is because we check the :source_attribute’s type to see if it supports simple_equality?/0, which means "can we compare values of this with ==. If we can, we use Elixir’s builtin maps for grouping and connecting related entities. This is actually a critical optimization, all Ash apps would be dramatically slower if we didn’t do this.

Due to this misconfiguration, we tried somewhere to compare a CiString struct with a string, and it was not equivalent.

There is supposed to be a compiler warning about types not matching here, not sure why there wasn’t, that can be investigated separately.

So the question is whether or not this has anything to do with your own issue as well, or if its just a red herring in the tests.

1 Like

If this doesn’t relate to your root issue, I’ll need a Postgres-specific reproduction. I would do it myself but my stack is Yuge at the moment.

Here’s an AshPostgres solution:

The current issues are:

  1. When loading the aggregate that delegates to the previously-mentioned relationship, it raises an error. See the test for details.
  2. I can’t figure out how to add the clause to the filter expression to cover one of the required use cases. The details are in the test file.

Feel free to checkout that repo and run the tests.

Will look tomorrow :+1:, thank you for the detailed reproduction.

1 Like

Did some basic investigation, and it’s just a “good old fashioned bug”. The issue is pretty much what you described at the outset, and it stems from essentially a missing set of logic to add the required joins for parent expressions that are contained in relationships used by aggregate queries to the parent query itself.

I’ve pushed a fix to ash_sql main which you can try. It’s not complete but should cover your current case. There is still an issue specifically around if that parent expr itself contains another aggregate that will take me a bit longer to nail down.

Tomorrow or early next week I will transplant your tests into ash_postgres as regression tests and fix the other bug that I mentioned around aggregates.

2 Likes

Two additional issues need to be resolved. It’s using an inner join on those parent expressions where it needs to use a left join, in addition to fixing the aggregate references. Working on it :smile:

And it looks like potentially a bug in Ecto that needs to be fixed around inner_lateral_join vs left_lateral_join :cry:

Nevermind :laughing: hours down the drain not realizing it was just a logic issue. As you had alluded to in the comments in your test, you need a filter like this:

filter expr(
  is_nil(parent(post_view.viewed_post.inserted_at)) or
    inserted_at > parent(post_view.viewed_post.inserted_at)
)

I still need to investigate the aggregates issue, but that shouldn’t affect your use case.

Alright, I’m on the trail of the aggregate bug. I’ve got to step away for now, but should have it fixed this weekend or early next week. The fix for your original case if you use the snippet I showed above should be functional for now, but I won’t release until next week when I have the full fix and have verified w/ tests.

Alright, this should be fully fixed in the latest release of ash_sql, with one small fix in ash_postgres which you should not need immediately. ash_postgres will be released next week.

1 Like

That appears to have worked. Thank you @zachdaniel!

1 Like