Ecto Cross-schema join

So I got handed more tables in a different schema from mine that I need to cross-join on. I know Ecto did have this bug last year, does it have it fixed ‘now’ though? Right now the schema_prefix definition on ecto schema seems entirely ignored and it is instead using the wrong schema on the join, still. I just updated Ecto and it still has this bug. How can you work around this bug?

Right now doing something like:

Repo.one(from(s in RemoteDB.S, join: t in Tag, on: t.name==s.last_name, where: s.id==1))

Is giving a query like (abbreviated):

SELECT s0.* FROM "otherschema"."s" AS s0 INNER JOIN "otherschema"."tags" AS t1 ON t1."name" = s0."last_name" WHERE (s0."id" = 1)

Which is of course entirely a wtf and causes it to puke with:

** (Postgrex.Error) ERROR 42P01 (undefined_table): relation "otherschema.tags" does not exist

When the SQL should of course be:

SELECT s0.* FROM "otherschema"."s" AS s0 INNER JOIN "myschema"."tags" AS t1 ON t1."name" = s0."last_name" WHERE (s0."id" = 1)

Which of course works fine from the database directly.

This has suddenly put a right-on stop on my work because of this bug… >.<

2 Likes

How would ecto know that Tag is located within another (database) schema? Ecto Schemas are not in any way tied to the database they are stored in.

As far as I know this is not a supported feature. I would +1 this feature however, I’m not sure what that would look like… Perhaps something like join: t in {MySchema, Tag}?

1 Like

Please see: Ecto.Schema — Ecto v3.11.1

Specifically (bolding is my own emphasis):

@schema_prefix - configures the schema prefix. Defaults to nil, which generates structs and queries without prefix. When set, the prefix will be used by every built struct and on queries where the current schema is used in from (and only from exclusively).If a schema is used as a join or part of an assoc, @schema_prefix won’t be obeyed. In PostgreSQL, the prefix is called “SCHEMA” (typically set via Postgres’ search_path). In MySQL the prefix points to databases.

So yes, it does know, it has the information. ^.^

That is how it is populating the schema as it is, probably is that whatever schema the initial ‘from’ is becomes the schema for all the joins, even when they have their own, specifically the joins are ignoring the @schema_prefix on all schemas in the join and overriding it with the from’s @schema_prefix. Since I am using the from syntax, but it fails on joins, I’m curious if there’s been any movement towards fixing this long-standing bug or if there is any safe work-around that does not involve dropping into raw SQL (or fragments)? I’m entirely halted on integrating these new tables until I drop down to raw SQL statements, which is not as careful for obvious reasons. ^.^;

2 Likes

Feel free to jump to bottom (I found a fix!), this will mostly be a dump while I troubleshoot. ^.^

It seems the reason the join’s are ignored is that there is a query-wide prefix which is used for all tables referenced, which is definitely not right. ^.^

As can be seen at:


The prefix is set if there was a :prefix option passed in, which again further overrides the @schema_prefix attribute on a schema. Even ignoring if a :prefix option is given it still seems to hold the table atoms themselves, so it can still get the schema information, and in fact it does by getting it on the ‘from’ part, but the next thing it calls is execute as per:

Then to the Planner where it builds the query for the adapter:

So going into the Planner we see it calls prepare at:

Where it then calls prepare_sources at:

Which then calls prepare_joins at:

Which then hits at:

Which then goes to prepare_source via:

Which does grab the source from it but also leaves the module atom at:

Which then goes all back up and appends all these sources onto the main sources list. So at this point they all should be correct and have the source information grabbed out and all combined properly as can be seen at:

So at this point we go back up to the prepare and continue does the cache section at:

Which ends up calling finalize_cache at:

Which seems to cache based on the prefix if a user passed it in at:

Excepting this, we continue back up and down the path to query_with_cache at:

Which calls the function right below it that calls build_meta in every branch depending on if found in the cache or not at:

So continue to that function to see what it does and we see it grabs the ‘global query prefix’ as well as process the sources via build_sources_meta at:

to handle the subqueries and so forth. But it soon gets passed to adapter.prepare, which ends up calling:
https://github.com/elixir-ecto/ecto/blob/master/lib/ecto/adapters/sql.ex#L63
So @conn.all ends up calling into the postgresql adapter as that is what I’m using, where at the top of that function it calls create_names:
https://github.com/elixir-ecto/ecto/blob/master/lib/ecto/adapters/postgres/connection.ex#L120
Which ends up iterating over the list of sources (which is a tuple of {"table_name", SchemaModule} at:
https://github.com/elixir-ecto/ecto/blob/master/lib/ecto/adapters/postgres/connection.ex#L567-L579
And at:
https://github.com/elixir-ecto/ecto/blob/master/lib/ecto/adapters/postgres/connection.ex#L570-L572
You can clearly see it ignores the module entirely, thus never grabbing the prefix for it, thus completely ignoring their configuration. I changed those lines to be this instead (I.E. I just added this second line is all):

          {table, schema} ->
            prefix = schema.__schema__(:prefix)
            name = [String.first(table) | Integer.to_string(pos)]
            {quote_table(prefix, table), name, schema}

And lo-and-behold, it works! Honestly I’ve no clue why the :prefix key is on the query at all, if anything the schema should have the prefix and the prefix could be overridable on a per-schema basis, not globally…

But yeah, have to edit my own version of ecto, and it breaks the :prefix option on things like Repo.all (good riddance, schema’s prefix should be overridable, not globally…), but joins actually work now! ^.^

4 Likes

Sent an Issue request at:

1 Like

Well cool! Good to know, I didn’t think this feature existed!

1 Like

I am really really badly needing to join across schemas, like really badly right now, preferably without building up SQL manually as there is a lot of ecto query transformations happening…

Help? :frowning:
@josevalim @michalmuskala Anyone?

I still do not understand why this limitation exists, the use-cases I’d read make no sense for the generic case that could handle it all… :frowning:

1 Like

Nothing changed since the issue you opened. Ecto does not allow queries across schemas. And nothing will change until someone writes a proposal that is backwards compatible and puts the work into it.

1 Like

FWIW, you should be able to do:

join: c in fragment("some_other_schema.example")

or at least:

join: c in fragment("SELECT * FROM some_other_schema.example")

which you can easily encapsulate in a macro. But you do lose the Ecto schema information (although Ecto has many tools to help with that).

3 Likes

I’ve actually had to resolve into passing maps around everywhere and anywhere as it is due to extremely dynamic returns from the ancient database, so it works.

However, your example gave me an idea for a macro that I think can work well, hmm…

/me still has never seen anyone in real life use prefix’s for user separation and not as dividing of departments at a place of work or so, which I’ve seen at 4 places to date…

2 Likes