Is it possible to pushdown a filter into ecto preload for partitioned table preload

mpope · September 8, 2024, 3:35pm

I have several partitioned tables in PostgreSQL, and an ecto query that adds a filter to limit the number of partitions scanned. The filter on the top-level query successfully uses partition pruning. I notice that the preload queries use parent_id = ANY($1) as the filter, which results in all partitions being scanned. For other reasons I cannot yet rely on using a JOIN to preload these records, so the Query level preloads will not be helpful here yet.

While I have read and understand that one can do this through Preload Queries I wanted to know if there was an easier way as I have to potentially apply this pattern to many dynamically assembled queries. An ‘easier way’ being some keyword arg that I could have missed in the docs.

D4no0 · September 8, 2024, 3:49pm

This is actually the easiest one, there are more involved ways, but if you can get away with this, then that is good of solutions as any.

mpope · September 8, 2024, 3:55pm

Okay, thanks.

An alternative question is: I know I want to query 4 tables that are all partitioned the same way and all have the consistent prefix of table_X, is there a way to flag to the Ecto engine that instead of using the Table schema to instead resolve the target table as the partitioned key? Not automagically ofcourse but through something like

resolve_tables: [
  Table: :"table_#{partition_key}",
  TableChild: :"table_child_#{partition_key}"
]

For the specific case of knowing which partition needs to be queried?

D4no0 · September 8, 2024, 4:02pm

There are a few ways to do this, the most reliable one would be to use reflection to get the actual table name and add the suffix you need for the correct partition, then you can simply query it as Ecto supports querying by the table name as string:

partition_table_name = MySchema.__schema__(:source) <> "my_partition_key"

query = partiton_table_name |> where([entry], entry == 1)

PS: This is a schemaless query, for more information read about schemaless queries: Schemaless queries — Ecto v3.12.3

mpope · September 8, 2024, 4:09pm

Thanks! I’ll read into that more to see if I can use it.

gregvaughn · September 8, 2024, 5:59pm

Another nice feature in Ecto is that you can supply both a table name and a schema module as the “queryable” in the first parameter of most Ecto.Query functions. That lets you do things like:

from({"table_#{partition_key}", Table}, ...)

The results will be deserialized into Table schemas/structs, if that’s valuable to your situation.

mpope · September 8, 2024, 6:07pm

While I very much like this solution and I think this is probably the most workable one so far, some of the libraries that this application uses are introspection heavy, any expect the atom of the schema. But its probably easier to remove those libraries in this case, or modify them to allow for this tuple-ized version.

Future readers can find this tuple queryable in this Queryable documentation.