Ooccasionaly I need to run tasks that update data in a database. I may need to run them ever again, or not a new server. That is, I need to run them once and in a certain release only. And they should be outside of git index.
Some tutorials, for instance a one at dashbit.co, suggest that I run them with “custom migration”, in which a 2nd directory for migrations is created called “custom_migrations” and they’ll be run from there via Ecto.Migrator. But this will case a problem: I run all of the custom_migrations, then delete all of migration files (because I won’t need them anywhere else, not on a new server either, once I’ve run them), then create new ones when a need arises, and then Ecto.Migrator will complain about absense of the migrations that I’ve deleted.
I’m also aware of ./bin/my_app eval MyApp.Tasks.custom_task1 but it’s not convinient because I’ll have to call it manually and passing arguments to a function isn’t convinient via the command line.
What I want is: create a several files that I want to be run in this current release, once. Store them in a certain directory of an application. Deploy a application. They’ll get run automatically, probably on application boot and then I remove them. Then, after some time, I may want to create new ones and only those new ones will need to get run.
How to do this? What’s a recommended way in Ellixir/Phoenix?
We’ve followed that approach for our data migrations and it works good so far. Our only issue is that our data migrations need to run the application code. Since migrations are scripts, they rot over time without you knowing, so we need to remove the migration files once in a while.
Here’s the partial output of mix ecto.migrations:
$ mix ecto.migrations --migrations-path=priv/repo/data_migrations
Repo: App.Repo
Status Migration ID Migration Name
--------------------------------------------------
up 20210511064356 ** FILE NOT FOUND **
up 20210512072827 ** FILE NOT FOUND **
up 20210513063626 ** FILE NOT FOUND **
up 20210514093522 ** FILE NOT FOUND **
up 20210518091202 ** FILE NOT FOUND **
up 20210519094816 ** FILE NOT FOUND **
up 20210519095658 cleanup_user_references
up 20210519162228 ** FILE NOT FOUND **
up 20210519162229 ** FILE NOT FOUND **
up 20210531140549 ** FILE NOT FOUND **
up 20210604160944 backfill_contexts
up 20210616082543 ** FILE NOT FOUND **
up 20210629000000 ** FILE NOT FOUND **
up 20210629000001 ** FILE NOT FOUND **
We don’t see any errors during migrations. What’s the error that you’re getting?
To me data migration is part of the job of migrations, therefore I don’t use a separate mechanism for it unless outside constraints require it (e.g. time to completion being rather long).
Depending on the complexity one can get away with not needing application level code quite well on ecto. A bunch of ecto query or repo api can work with table names instead of schemas modules, so one can have a convenient api and still not depend on application level code, which can change over time.
(1) Technically no errors, but there’re “FILE NOT FOUND” ones. How will you know whether or not it’s really a migration file that’s missing for some reason and which is to be fixed, or is it one of the ones with tasks that you’ve deleted intentionally and which is not to be fixed?
What I mean is that, I may need to run such database data updates only once and only on a certain server and only in a certain release. Therefore, they can’t be migrations, because each particular data update may never get run ever again elsewhere.
What is the concern here? Why would the file get missing other than intentional deletion? Do you really even care that much given that you want to run it only once?
If you really need the extra safety, the deleted version is in VCS. You can move it to a different folder too (e.g. “completed_data_migrations”).
def migrate_data do
load_app()
for repo <- repos() do
path = Ecto.Migrator.migrations_path(repo, "data_migrations")
{:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, path, :up, all: true))
end
end
I don’t think I understand your use case. You want to run some code to migrate the data in the database once per new server? Do these servers share the same database?