Use of custom sources in Ecto

mguimas · June 28, 2019, 5:29pm

Hello

In the section Polymorphic associations there are a few uses of custom sources, for example

has_many :comments, {"posts_comments", Comment}, ...

and

has_many :comments, {"tags_comments", Comment}, ...

which makes sense because Comment is defined without a fixed table,

defmodule Comment do
  use Ecto.Schema

  schema "abstract table: comments" do
    # This will be used by associations on each "concrete" table
    field :assoc_id, :integer
  end
end

Now my question is the following:

Are custom sources meant to be used only with schemas like Comment, which do not have a fixed table?

Or are there other uses for custom sources, like in this example, in which it is unclear if Category is defined like Comment or it is something else?

Thanks

LostKobrakai · June 29, 2019, 1:00pm

{table, Schema} is simply a way to override the source used to query data for a certain schema. You can also use it in Ecto queries, like from comment in {"posts_comments", Comment}. It doesn’t matter what your schema/2 macro lists as source, so you can do that with any Ecto schema you might have. It is useful in any case where you want to use the same schema with different tables as the source of data.
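As a minimal sketch of the query form mentioned above, assuming the Comment schema from the first post: the CommentQueries module, its function names, and MyApp.Repo are hypothetical and only serve the illustration.

```elixir
defmodule CommentQueries do
  import Ecto.Query

  # Same Comment schema, read from the "posts_comments" table.
  def for_post(post_id) do
    from c in {"posts_comments", Comment}, where: c.assoc_id == ^post_id
  end

  # Same Comment schema, read from the "tags_comments" table.
  def for_tag(tag_id) do
    from c in {"tags_comments", Comment}, where: c.assoc_id == ^tag_id
  end
end

# Usage, assuming a configured repo named MyApp.Repo (hypothetical):
# MyApp.Repo.all(CommentQueries.for_post(1))
```

The source string that Comment declares in schema/2 plays no role in these queries; the table named in the tuple is what gets queried.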

mguimas · June 29, 2019, 1:14pm

Thanks for your clear answer.

Have you found any practical uses, besides the case of polymorphic associations, that you might want to share?

LostKobrakai · June 29, 2019, 1:17pm

I’m currently using it for an entity, which is stored in different tables based on its current state. Like for example {"available_jobs", Job} and {"unavailable_jobs", Job}, but it only works really well if you don’t need additional data of the table, which is different between both.
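A rough sketch of what such a setup could look like. The title field, the Jobs module, and MyApp.Repo are assumptions made for illustration, not the actual code described above.

```elixir
defmodule Job do
  use Ecto.Schema

  # The declared source is only a default; it can be overridden per query
  # or per struct.
  schema "available_jobs" do
    field :title, :string # hypothetical field
  end
end

defmodule Jobs do
  import Ecto.Query

  # Read the same schema from either table.
  def list_available, do: MyApp.Repo.all(from j in {"available_jobs", Job})
  def list_unavailable, do: MyApp.Repo.all(from j in {"unavailable_jobs", Job})

  # Write to a non-default table by changing the struct's source metadata
  # before inserting.
  def create_unavailable(attrs) do
    %Job{}
    |> Ecto.put_meta(source: "unavailable_jobs")
    |> Ecto.Changeset.cast(attrs, [:title])
    |> MyApp.Repo.insert()
  end
end
```

Ecto.put_meta/2 with the :source option is what makes the insert target "unavailable_jobs" instead of the source declared in the schema.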

mguimas · June 29, 2019, 1:23pm

Don’t know if I clearly understood this

but it only works really well if you don’t need additional data of the table, which is different between both.

Does this mean that it only works in practice if the tables have exactly the same schema (i.e., columns)?

LostKobrakai · June 29, 2019, 1:32pm

It means the table needs to have the columns for the fields of the schema you’re selecting in a query. For associations, all fields will be loaded by default. It’s as simple as that. In practice it’s much simpler to handle two different schemas if you want to handle different data than trying to make a single schema work by only querying parts of its fields. So yeah, this is most useful if the tables have exactly the same columns and you’re selecting the same fields from each of them. It doesn’t mean the table cannot have extra columns, which you don’t select for the schema. All of that is also the case for any other schema though. Just imagine {"other_table", SomeTable} as if you had written:

defmodule SomeTable do
  use Ecto.Schema

  schema "other_table" do
    # …
  end
end

Just that you didn’t need to “copy the module” just to use another source table.
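To illustrate the point about only needing columns for the fields you select, here is a hedged sketch. It assumes a Job schema (as mentioned earlier in the thread) with a hypothetical title field; the PartialJobs module and MyApp.Repo are likewise made up for the example.

```elixir
defmodule PartialJobs do
  import Ecto.Query

  # Only :id and :title are selected, so "unavailable_jobs" needs columns
  # for those two fields only. The other fields of Job stay at their
  # defaults in the returned structs.
  def unavailable_titles do
    from j in {"unavailable_jobs", Job},
      select: struct(j, [:id, :title])
  end
end

# Usage, assuming a configured repo named MyApp.Repo (hypothetical):
# MyApp.Repo.all(PartialJobs.unavailable_titles())
```

This works, but as said above, once the tables diverge it is usually easier to define a second schema than to keep restricting the select.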

mguimas · June 29, 2019, 1:42pm

Ok, now it is even clearer

It seems useful:

- for polymorphic associations
- if you want to partition a huge amount of data, of the same schema, across several tables, for performance reasons (which seems like what you do for Job, no?)
- and probably other use cases I can’t remember right now

Thanks again @LostKobrakai

Dear readers: if you know of other interesting use cases, please post them here for future reference. Thanks.

LostKobrakai · June 29, 2019, 1:48pm

It’s not performance reasons, but rather querying ones. Sometimes it’s easier to move data elsewhere instead of handling differentiation only via where clauses in queries.

mguimas · June 29, 2019, 1:51pm

Interesting … why is that? … perhaps the codebase you are working on is too complex to change?

It would be interesting to know, because it seems another case to add to the previous two.

LostKobrakai · June 29, 2019, 1:54pm

Yeah, in my case it was because I didn’t want to touch the querying in existing code, but I can also see it being an intentional design decision. E.g. there might be quite a lot of metadata on those tables, which is different between states and queried through different schemas.