Is at the core of the problem:
Enum.at(Regex.run(~r{(.*)\.([^\.]*)}, p.name), 2)
isn’t something that Ecto can “translate” so that the database can do it (i.e. the database can’t run Elixir code).
PostgreSQL does have regex functionality
=# SELECT substring('2018.0044.2' from '\d+$');
substring
-----------
2
(1 row)
=# SELECT substring('2018.0044.2' from '^\d+\.\d+');
substring
-----------
2018.0044
(1 row)
set
expects a value. I don’t know if set
will accept a fragment
(example).
But the idea is to give Ecto the information it needs so that it can generate a single update
statement the database can understand that ends up being something like:
UPDATE plants SET
code = substring(name from '\d+$'),
accession_id = id;
Another option is to just execute
the raw SQL.
iex(1)> query =
...(1)> from(p in "plants",
...(1)> update: [set: [code: fragment("substring(? from \'\d+$\')", p.name)]],
...(1)> update: [set: [accession_id: p.id]]
...(1)> )
#Ecto.Query<from p0 in "plants",
update: [set: [code: fragment("substring(? from '\d+$')", p0.name)]],
update: [set: [accession_id: p0.id]]>
iex(2)> Ecto.Adapters.SQL.to_sql(:update_all, Repo, query)
{"UPDATE \"plants\" AS p0 SET \"code\" = substring(p0.\"name\" from '\d+$'), \"accession_id\" = p0.\"id\"",
[]}
iex(3)
Filling the accessions
table probably looks something like this
INSERT INTO accessions (id, bought_on, bought_from, code)
(SELECT p.id, p.bought_on, p.bought_from, substring(p.name from '^\d+\.\d+') FROM plants AS p
INNER JOIN (SELECT min(t.id) AS id FROM plants AS t GROUP BY substring(t.name from '^\d+\.\d+')) AS u ON p.id = u.id);
I’m not aware of a way to “build” this type of query from Ecto, so execute
and query/4
may be options.
Alternately you could use a query to create the structures you need to run in a separate insert:
iex(1)> sub = from(t in "plants", select: %{id: min(t.id)}, group_by: fragment("substring(? from \'^\d+\.\d+\')", t.name))
#Ecto.Query<from p0 in "plants",
group_by: [fragment("substring(? from '^\d+.\d+')", p0.name)],
select: %{id: min(p0.id)}>
iex(2)> query =
...(2)> from(p in "plants",
...(2)> select: %{id: p.id, bought_on: p.bought_on, bought_from: p.bought_from, code: fragment("substring(? from \'^\d+\.\d+\')", p.name)},
...(2)> join: u in subquery(sub), on: u.id == p.id
...(2)> )
#Ecto.Query<from p0 in "plants",
join: p1 in subquery(from p0 in "plants",
group_by: [fragment("substring(? from '^\d+.\d+')", p0.name)],
select: %{id: min(p0.id)}),
on: p1.id == p0.id,
select: %{id: p0.id, bought_on: p0.bought_on, bought_from: p0.bought_from, code: fragment("substring(? from '^\d+.\d+')", p0.name)}>
iex(3)> Ecto.Adapters.SQL.to_sql(:all, Repo, query)
{"SELECT p0.\"id\", p0.\"bought_on\", p0.\"bought_from\", substring(p0.\"name\" from '^\d+.\d+') FROM \"plants\" AS p0 INNER JOIN (SELECT min(p0.\"id\") AS \"id\" FROM \"plants\" AS p0 GROUP BY substring(p0.\"name\" from '^\d+.\d+')) AS s1 ON s1.\"id\" = p0.\"id\"",
[]}
iex(4)>
As such your migration plan still doesn’t quite work because of the accession_id
. You should probably
- First generate the
accessions
records from theplants
table, auto generating the ids in theaccessions
table - Populate the
plants.accession_id
by performing a join betweenaccessions.code
andsubstring(plants.name from '^\d+\.\d+'))
- Populate
plants.code
fromsubstring(plants.name from '\d+$'))