No attribute left join relantionship not giving correct results

Hi, I want to translate the following query to a query inside Ash with my resource:

select buy_record.transaction_id as buy_transaction,
       sell_record.transaction_id as sell_transaction,
       buy_record.external_id,
       buy_record.first_grantor_full_name as original_seller,
       buy_record.first_grantee_full_name as buyer,
       sell_record.first_grantor_full_name as seller,
       sell_record.first_grantee_full_name as next_buyer,
       coalesce(buy_record.recording_date, buy_record.instrument_date) as buy_date,
       coalesce(sell_record.recording_date, sell_record.instrument_date) as sell_date

from raw.records as buy_record

left join raw.records as sell_record
          on sell_record.external_id = buy_record.external_id and
             sell_record.first_grantor_full_name = buy_record.first_grantee_full_name and
             coalesce(sell_record.recording_date, sell_record.instrument_date) is not null and
             coalesce(sell_record.recording_date, sell_record.instrument_date) >= coalesce(buy_record.recording_date, buy_record.instrument_date) and
             not sell_record."deleted?"

where buy_record.first_grantee_full_name %> 'vinh truong' and
      coalesce(buy_record.recording_date, buy_record.instrument_date) is not null and
      not buy_record."deleted?" and
      coalesce(buy_record.recording_date, buy_record.instrument_date) >= now() - interval '2' year

order by buy_record.recording_date

The important part is the left join one.

To do something similar to this, I added the following to my resource:

  relationships do
    alias Pacman.Raw.Record

    has_one :sell_record, Record do
      no_attributes? true

      filter expr(
               external_id == parent(external_id)
               and first_grantor_full_name == parent(first_grantee_full_name)
               and not is_nil(recording_date || instrument_date)
               and (recording_date || instrument_date) >= (parent(recording_date) || parent(instrument_date))
               and not deleted?
             )
    end

    ...

    read :by_entity do
      argument :entity_full_name, :string, allow_nil?: false

      filter expr(fragment("(? %> ?)", first_grantee_full_name, ^arg(:entity_full_name))
                  and not is_nil(recording_date || instrument_date)
                  and (recording_date || instrument_date) >= ago(2, :year)
                  and not deleted?)

      prepare build(sort: [:recording_date], load: [:sell_record])
    end

Ash will split these into two queries, one loading the resource with the filter and then another one to load the relationship. That query is pretty big and get truncated in my terminal, so I can’t easily see what is doing, but what is returning is not correct.

My guess is that the way relationships in Ash works is somewhat different from what I expected and so the above code is not equivalent to the left join from the SQL query.

Do someone knows what I’m doing wrong here and how to make this work as expected?

I don’t get where the problem is here. I was able to get the query Ash is using to do the join:

SELECT DISTINCT s1."id", s1."external_id", s1."transaction_id", s1."first_grantee_full_name", s1."first_grantor_full_name", s1."instrument_date", s1."recording_date", s1."transaction_time", s1."deleted?", s1."inserted_at", s1."updated_at", r0."id"
FROM "raw"."records" AS r0
INNER JOIN LATERAL (
  SELECT sr0."id" AS "id", sr0."external_id" AS "external_id", sr0."transaction_id" AS "transaction_id", sr0."first_grantee_full_name" AS "first_grantee_full_name", sr0."first_grantor_full_name" AS "first_grantor_full_name", sr0."instrument_date" AS "instrument_date", sr0."recording_date" AS "recording_date", sr0."transaction_time" AS "transaction_time", sr0."deleted?" AS "deleted?", sr0."inserted_at" AS "inserted_at", sr0."updated_at" AS "updated_at"
  FROM "raw"."records" AS sr0
  WHERE (sr0."external_id"::text = r0."external_id"::text)
    AND (sr0."first_grantor_full_name"::text = r0."first_grantee_full_name"::text)
    AND (NOT ((ash_elixir_or(sr0."recording_date"::date, sr0."instrument_date"::date) IS NULL) = true::boolean))
    AND (ash_elixir_or(sr0."recording_date"::date, sr0."instrument_date"::date) >= ash_elixir_or(r0."recording_date"::date, r0."instrument_date"::date))
    AND (NOT (sr0."deleted?"::boolean))
  LIMIT 1) AS s1 ON TRUE
WHERE (r0."id" = ANY('{c6659a06-ebec-4a24-a58d-446a0f3352fd, 479660ff-d2a6-4593-abc6-b93aa9451270, e71c1fab-47ff-423a-b757-ee033138a424, 1b3a5c1d-dccb-43c0-90b1-fc8a67c8e8e8, 0f35f010-905e-417e-a3f3-4c29c6768702, 6bd76455-f89d-4b2e-ba85-768c456ddc87, 4ac6a6e6-7a0a-4151-be49-98aeb2e0b521, 083c47a0-bf2c-44d4-8893-351ab7b50587}'::uuid[]))

This query will result in one row for id f71c4e47-86d7-403b-9886-7300db077ebb, which is correct.

But, Ash will return a join for each one of the 8 lines with some random data, instead of just having the sell_record field as nil.

Also, I noticed that I can only trigger this issue when the join returns some data.

For example, from my original query, I will get 8 lines, the first four and last 3 lines should all contain sell_record as nil and the fifth line should contain data in sell_record.

If I limit the query to only fetch the first 4 elements for example, it will correctly put sell_record as nil, but if I allow it to fetch the fifth line (which has data in sell_record, meaning that the join query above will return 1 line as result), then all the lines will have sell_record populated with some data.

Ah, I just noticed that actually is not random data that is being added to sell_record, but actually that the single line the join query will return will populate every single one of the 8 result lines with it instead of only the fifth line.

Unless I’m missing something, this seems like a bug to me.

Can you try setting the from_many? true option?

    has_one :sell_record, Record do
      no_attributes? true
      from_many? true

      filter expr(
               external_id == parent(external_id)
               and first_grantor_full_name == parent(first_grantee_full_name)
               and not is_nil(recording_date || instrument_date)
               and (recording_date || instrument_date) >= (parent(recording_date) || parent(instrument_date))
               and not deleted?
             )
    end

nvm, you’re right, its a bug. With the no_attributes? option and doing lateral joins. :thinking: need to figure that one out…

The way lateral joins work is that they run once per source row. It shouldn’t be populating all rows on the left side of the join with the result of the right side unless there is an error somewhere in the filters on the right hand side.

Same result using from_many? true

Is there any other information that I can give to you to help solve this issue?

I think the best thing to start with is to find two related items and explain why they shouldn’t have been paired together.

I just noticed that the query Ash is generating for the join (the one I posted above), has two fields called id on the select part (the first item and the last), do you think that can be the problem?

I don’t think that would be an issue, no.

I don’t know if I got what you mean here, but here is the result of the above join query:

                  id                  | external_id | transaction_id | first_grantee_full_name | first_grantor_full_name | instrument_date | recording_date | transaction_time | deleted? |        inserted_at         |         updated_at         |                  id                  
--------------------------------------+-------------+----------------+-------------------------+-------------------------+-----------------+----------------+------------------+----------+----------------------------+----------------------------+--------------------------------------
 f71c4e47-86d7-403b-9886-7300db077ebb | 166842475   |     1015448543 | <redacted>              | <redacted>              | 2023-07-21      | 2023-07-24     |    1699902508200 | f        | 2023-11-13 21:48:51.043316 | 2023-11-13 22:39:59.134758 | 0f35f010-905e-417e-a3f3-4c29c6768702
(1 row)

That id 0f35f010-905e-417e-a3f3-4c29c6768702 is the ID from one of the rows I’m doing the join (the id inside the array[4] in the above query).

From, that, I would expect that only that record would have that join row inside sell_record, but all of them have it.

What I’m interested in is those other rows, and pointing to specific values that shouldn’t match the filter you’ve provided in the relationship. Ideally we can look at the structs that come back from the request to Ash, for example:

%Foo{a: 2, bar: %Bar{a: 1}} - “this is wrong because foo.a is > bar.a”

Something like that :slight_smile:

The data that I have there is pretty critical, so instead of that I created a very small project tht reproduces the issue:

Basically you just need to clone it, run mix setup to create and populate the database with the seeds and then, inside of it you can run:

JoinBug.MyApi.MyResource.by_entity(%{entity_full_name: "B"})

Which will return 3 rows, but only one of them should’ve the sell_record field populated.

So, just for completeness, what I did in this small project was add 4 rows to the db with this data:

[
  %{
    external_id: "999999999",
    first_grantor_full_name: "A",
    first_grantee_full_name: "B"
  },
  %{
    external_id: "166842475",
    first_grantor_full_name: "A",
    first_grantee_full_name: "B"
  },
  %{
    external_id: "166842475",
    first_grantor_full_name: "B",
    first_grantee_full_name: "C"
  },
  %{
    external_id: "149628232",
    first_grantor_full_name: "A",
    first_grantee_full_name: "B"
  },
]

Then, my action by_entity want’s to search for each row where first_grantee_full_name is "B" and then join the sell_record relationship which should check for two things: if the external_id is the same and if the joined first_grantor_full_name is the same as the first_grantee_full_name from the parent.

As you can see, with this criteria, the second row would have a sell_record populated with the third row, but Ash will populate the third row in all the other ones giving the following result:

iex(1)> JoinBug.MyApi.MyResource.by_entity(%{entity_full_name: "B"})
[debug] QUERY OK source="records" db=10.0ms decode=2.1ms queue=0.7ms idle=1140.8ms
SELECT r0."id", r0."external_id", r0."first_grantee_full_name", r0."first_grantor_full_name", r0."inserted_at", r0."updated_at" FROM "records" AS r0 WHERE (r0."first_grantee_full_name"::text = $1::text) ["B"]
↳ AshPostgres.DataLayer.run_query/2, at: lib/data_layer.ex:646
[debug] QUERY OK source="records" db=0.7ms queue=1.2ms idle=1203.4ms
SELECT DISTINCT s1."id", s1."external_id", s1."first_grantee_full_name", s1."first_grantor_full_name", s1."inserted_at", s1."updated_at", r0."id" FROM "records" AS r0 INNER JOIN LATERAL (SELECT sr0."id" AS "id", sr0."external_id" AS "external_id", sr0."first_grantee_full_name" AS "first_grantee_full_name", sr0."first_grantor_full_name" AS "first_grantor_full_name", sr0."inserted_at" AS "inserted_at", sr0."updated_at" AS "updated_at" FROM "public"."records" AS sr0 WHERE (sr0."external_id"::text = r0."external_id"::text) AND (sr0."first_grantor_full_name"::text = r0."first_grantee_full_name"::text) LIMIT $1) AS s1 ON TRUE WHERE (r0."id" = ANY($2)) [1, ["f7122323-0fda-4047-ba99-1ecfb03dc065", "46aa04a7-cadd-4d9f-87b7-14ef1156f26b", "8a3c6e42-394e-4a75-b671-e729ecc27c86"]]
↳ AshPostgres.DataLayer.run_query_with_lateral_join/4, at: lib/data_layer.ex:890
{:ok,
 [
   #JoinBug.MyApi.MyResource<
     sell_record: #JoinBug.MyApi.MyResource<
       sell_record: #Ash.NotLoaded<:relationship>,
       __meta__: #Ecto.Schema.Metadata<:loaded, "public", "records">,
       id: "135245fa-99b2-41c7-b03f-db2bc2bbf6a6",
       external_id: "166842475",
       first_grantee_full_name: "C",
       first_grantor_full_name: "B",
       inserted_at: ~U[2023-11-15 22:30:35.145764Z],
       updated_at: ~U[2023-11-15 22:30:35.145764Z],
       aggregates: %{},
       calculations: %{},
       ...
     >,
     __meta__: #Ecto.Schema.Metadata<:loaded, "records">,
     id: "f7122323-0fda-4047-ba99-1ecfb03dc065",
     external_id: "999999999",
     first_grantee_full_name: "B",
     first_grantor_full_name: "A",
     inserted_at: ~U[2023-11-15 22:30:35.108437Z],
     updated_at: ~U[2023-11-15 22:30:35.108437Z],
     aggregates: %{},
     calculations: %{},
     ...
   >,
   #JoinBug.MyApi.MyResource<
     sell_record: #JoinBug.MyApi.MyResource<
       sell_record: #Ash.NotLoaded<:relationship>,
       __meta__: #Ecto.Schema.Metadata<:loaded, "public", "records">,
       id: "135245fa-99b2-41c7-b03f-db2bc2bbf6a6",
       external_id: "166842475",
       first_grantee_full_name: "C",
       first_grantor_full_name: "B",
       inserted_at: ~U[2023-11-15 22:30:35.145764Z],
       updated_at: ~U[2023-11-15 22:30:35.145764Z],
       aggregates: %{},
       calculations: %{},
       ...
     >,
     __meta__: #Ecto.Schema.Metadata<:loaded, "records">,
     id: "46aa04a7-cadd-4d9f-87b7-14ef1156f26b",
     external_id: "166842475",
     first_grantee_full_name: "B",
     first_grantor_full_name: "A",
     inserted_at: ~U[2023-11-15 22:30:35.139916Z],
     updated_at: ~U[2023-11-15 22:30:35.139916Z],
     aggregates: %{},
     calculations: %{},
     ...
   >,
   #JoinBug.MyApi.MyResource<
     sell_record: #JoinBug.MyApi.MyResource<
       sell_record: #Ash.NotLoaded<:relationship>,
       __meta__: #Ecto.Schema.Metadata<:loaded, "public", "records">,
       id: "135245fa-99b2-41c7-b03f-db2bc2bbf6a6",
       external_id: "166842475",
       first_grantee_full_name: "C",
       first_grantor_full_name: "B",
       inserted_at: ~U[2023-11-15 22:30:35.145764Z],
       updated_at: ~U[2023-11-15 22:30:35.145764Z],
       aggregates: %{},
       calculations: %{},
       ...
     >,
     __meta__: #Ecto.Schema.Metadata<:loaded, "records">,
     id: "8a3c6e42-394e-4a75-b671-e729ecc27c86",
     external_id: "149628232",
     first_grantee_full_name: "B",
     first_grantor_full_name: "A",
     inserted_at: ~U[2023-11-15 22:30:35.151634Z],
     updated_at: ~U[2023-11-15 22:30:35.151634Z],
     aggregates: %{},
     calculations: %{},
     ...
   >
 ]}

Thank you for the reproduction project and the detailed report! I’ve fixed the issue in Ash core, the fix is on main.

Thank you very much @zachdaniel now it is working great!

1 Like