Dynamically defining schema in function with schema macro

NaN · March 25, 2022, 8:01pm

I have multiple tables with the same structure. They are all located in the same db with different names, storing different data. How can I set the schema in a fn to reduce boilerplate? Am I going about this the wrong way?

The code below results in the following error:
(ArgumentError) cannot set attribute @ecto_schema_defined inside function/macro

defmodule NsMetrics.Name.Api do
  use Ecto.Schema
  import Ecto.Changeset

  def set(db_name) do
    schema db_name do
      field(:date, :string)
      field(:t1, :integer)
      field(:t2, :integer)
      field(:t3, :integer)

      timestamps()
    end
  end

  def changeset(line, attrs) do
    line
    |> cast(attrs, [:date, :t1, :t2, :t3])
    |> validate_required([:date, :t1, :t2, :t3])
  end
end

LostKobrakai · March 25, 2022, 8:57pm

You can use a single schema with multiple tables: Ecto.Schema — Ecto v3.7.2

NaN · March 25, 2022, 9:41pm

Thank you for the reply. Im not sure this solves the problem for 2 reasons. Maybe Im missing it?

Assume the database names are not known at compile time but might be generated ad hoc (only partially true)
assume there are a lot of them where making a schema per table does not make sense.

al2o3cr · March 25, 2022, 9:53pm

Can you clarify on this? What specifically wouldn’t make sense?

A lot of what Ecto provides is at compile-time, so runtime-definition like you’re thinking of is going to be fighting with the tool quite hard.

NaN · March 25, 2022, 10:02pm

Thank you for the question! Even if I define a master schema, creating a schema module per table and defining the associations does not make sense. There could be hundreds of them and new ones will be added as time goes on… but they all share the same structure.

dimitarvp · March 25, 2022, 10:24pm

Then you should go for a custom Ecto.Type that is aware of the polymorphic pieces of data and properly branches on them to create the appropriate (different) Elixir structs.

If you can give us an example of a master and e.g. two “child” associations I believe we would be able to give you a head-start.

NaN · March 25, 2022, 10:51pm

It would look something like this, if Im understanding correctly. So Im clear about my understanding of the following code. This will create a shared instance between 2 tables with the same fields… If that understanding is misapprehended please let me know!

EDIT: Id like to add here there are neither nor will there ever be fields that deviate from the same structure.

defmodule FieldsInCommon do
  use Ecto.Schema
  schema "fields_in_common" do
    field(:date, :string)
      field(:t1, :integer)
      field(:t2, :integer)
      field(:t3, :integer)

      timestamps()
  end

 def changeset(line, attrs) do
    line
    |> cast(attrs, [:date, :t1, :t2, :t3])
    |> validate_required([:date, :t1, :t2, :t3])
  end
end

defmodule AdHocTable do
  use Ecto.Schema

  schema "ad_hoc_gen_table_name" do
    many_to_many :fields, FieldsInCommon, join_through: "ad_hoc_gen_table_name_fields"
  end
end

defmodule AdHocTable2 do
  use Ecto.Schema

  schema "ad_hoc_gen_table_name_2" do
     many_to_many :fields, FieldsInCommon, join_through: "ad_hoc_gen_table_name_2_fields"
  end
end

dimitarvp · March 25, 2022, 10:59pm

Still not quite clear. I don’t see the value add of the second table.

Can you also give a few examples of actual data inside the DB? Just do a few select queries in psql and paste us the result here (with anonymized / edited values if you like)?

NaN · March 25, 2022, 11:05pm

Its time series data from different sources. Putting them all in one table only stands to make handling the data arduous. They will all look the same as below.

id  |        inserted_at         |   data_timestamp    | length | count 
-----+----------------------------+---------------------+--------+-------
   1 | 2022-03-11 20:23:06.186993 | 2022-03-11 20:15:00 |  60000 |    31
   2 | 2022-03-11 20:23:06.191469 | 2022-03-11 20:16:00 |  60000 |    46
   3 | 2022-03-11 20:23:06.19321  | 2022-03-11 20:17:00 |  60000 |    42
   4 | 2022-03-11 20:23:06.194651 | 2022-03-11 20:18:00 |  60000 |    32
   5 | 2022-03-11 20:23:06.195871 | 2022-03-11 20:19:00 |  60000 |    30
   6 | 2022-03-11 20:23:06.196954 | 2022-03-11 20:20:00 |  60000 |    26
   7 | 2022-03-11 20:29:07.706952 | 2022-03-11 20:21:00 |  60000 |    32
   8 | 2022-03-11 20:29:07.717481 | 2022-03-11 20:22:00 |  60000 |    31
   9 | 2022-03-11 20:29:07.719433 | 2022-03-11 20:23:00 |  60000 |    32
  10 | 2022-03-11 20:29:07.721166 | 2022-03-11 20:24:00 |  60000 |    25
  11 | 2022-03-11 20:29:07.723034 | 2022-03-11 20:25:00 |  60000 |    20
  12 | 2022-03-11 20:29:07.724765 | 2022-03-11 20:26:00 |  60000 |    33

dimitarvp · March 25, 2022, 11:17pm

Are these the contents of the “fields_in_common” table, or “ad_hoc_*”?

NaN · March 25, 2022, 11:22pm

ad_hoc_* there is no table fields_in_common its just that all the tables share all the same fields in common… Your question leads me to believe Im using it wrong… ? Am I missing it?

Thank you for working with me!

Im looking over the Ecto.Type docs to try to anticipate your solution…

dimitarvp · March 26, 2022, 1:00am

So you basically have N tables that look exactly like “fields_in_common”?

NaN · March 26, 2022, 2:02am

That is correct. How do you see Ecto.Type solving the problem? Or is this taking another direction?

dimitarvp · March 26, 2022, 2:36am

Nope, now it doesn’t make sense because I misunderstood at the start.

Seems like you’ll have to use the schemaless Ecto API. Have you checked it out?

LostKobrakai · March 26, 2022, 7:32am

The table names don‘t need to be known at compile time. You can create the table/schema tuples at runtime.

NaN · March 26, 2022, 4:24pm

How would you do that?

LostKobrakai · March 26, 2022, 5:47pm

Querying using {table_name, MySchema} is the way to go. It‘ll use your schema, but load data from the table provided.