Manipulate / Create data on deployment

ChristophK · May 31, 2021, 9:38am

We have a working Phoenix application.
Now I have some new data that needs to be created on the production / staging database. As we have no interface for it, we need to create it manually.

I’ve now created a migration that creates all the records, but it seems to create them in the test database as well, so some tests have started to fail because they expect a clean table with no records.

What’s the best practice for adding / manipulating data on deployment?

kokolegorille · May 31, 2021, 9:41am

Hello and welcome,

There is a seeds file for creating data, and this file is not run in test unless you explicitly do so…
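
For context, here is a sketch of the aliases a generated Phoenix project typically has in mix.exs. The exact list varies by Phoenix version, so treat it as an assumption about your project. It shows that seeds only run as part of ecto.setup, while the test alias runs migrations, which would explain why data inserted by a migration also ends up in the test database.

```elixir
# mix.exs (excerpt, assumed to match the generator defaults)
defp aliases do
  [
    setup: ["deps.get", "ecto.setup"],
    # Seeds run only as part of ecto.setup.
    "ecto.setup": ["ecto.create", "ecto.migrate", "run priv/repo/seeds.exs"],
    "ecto.reset": ["ecto.drop", "ecto.setup"],
    # The test alias migrates the test database but never runs the seeds.
    test: ["ecto.create --quiet", "ecto.migrate --quiet", "test"]
  ]
end
```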

csadewa · May 31, 2021, 9:52am

Hmmm, for me it usually depends on what kind of data is being created. If it’s:

1. Data that defines constants in the application (for example, that some kind of truck should have X height, Y length, Z weight capacity): I usually put it in an Ecto migration file, sometimes importing it from a CSV when the amount of data is big.
2. Data for specific application configuration / some initial seed data / special cases (for example, prefilling some data because application logic or circumstances changed between the old and new version): I usually create a private API call, which a developer then triggers manually at or just after deployment.

ChristophK · May 31, 2021, 9:53am

We used seeds for the initial setup, but now it’s just new data that’s being inserted, so I don’t want to run the seeds again.

ChristophK · May 31, 2021, 9:55am

I tried it like your 1.), but now it seems the tests are setting that data up as well, causing the existing tests to fail.
For example, I’m inserting new sensors, and a test expects no sensors in the table, so the test fails.

csadewa · May 31, 2021, 9:57am

Hmmm, how about adjusting the tests for the data? That is, changing them to expect that there are already some sensors in the table?

stefanchrobot · May 31, 2021, 10:26am

There is nothing special about the auto-generated seed file, so you can add more seed files; they won’t be automatically run for tests.

The other option to consider is to make your seeds idempotent, that is, they check if the data already exists and only add missing stuff. It’s a bit more difficult to write, but then you can keep on extending the seed file.
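
A minimal sketch of what an idempotent seed script could look like. The file name, `MyApp.Repo`, and a `MyApp.Sensors.Sensor` schema with a `:name` field are all assumptions for illustration, not something from this thread.

```elixir
# priv/repo/seeds_sensors.exs (hypothetical file name)
alias MyApp.Repo            # assumed repo module
alias MyApp.Sensors.Sensor  # assumed schema with a :name field

sensors = [
  %{name: "temperature-1"},
  %{name: "humidity-1"}
]

for attrs <- sensors do
  # Insert only when no row with this name exists yet,
  # so the script can be run again without creating duplicates.
  case Repo.get_by(Sensor, name: attrs.name) do
    nil -> Repo.insert!(struct(Sensor, attrs))
    _existing -> :ok
  end
end
```

It could be run explicitly with `mix run priv/repo/seeds_sensors.exs`. The check-then-insert is not safe against concurrent runs; a unique index on the column would be the stricter guard.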

ChristophK · May 31, 2021, 10:57am

I think I’ll keep it in the migration file and skip the inserts when it runs in the test environment. It’s the simplest way to do it.
Adding the data to the seed file is not useful, as the data isn’t really needed for setup and might even be outdated in the future.
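
One way this could look, as a sketch only: the module name, the `sensors` columns, the `:insert_deploy_data` config key, and PostgreSQL's `now()` are assumptions. A config flag is used instead of `Mix.env()` because Mix is not available when migrations run inside a release.

```elixir
defmodule MyApp.Repo.Migrations.InsertNewSensors do
  use Ecto.Migration

  def up do
    # Hypothetical flag: put `config :my_app, insert_deploy_data: false`
    # in config/test.exs so the test database stays empty.
    if Application.get_env(:my_app, :insert_deploy_data, true) do
      execute("""
      INSERT INTO sensors (name, inserted_at, updated_at)
      VALUES ('temperature-1', now(), now())
      """)
    end
  end

  def down do
    # Remove only the row this migration added.
    execute("DELETE FROM sensors WHERE name = 'temperature-1'")
  end
end
```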

dimitarvp · June 6, 2021, 10:14pm

A separate migration file that inserts the data works fine. Combine it with a unique constraint on the table and you will avoid duplicating data as well. I’ve done it, works just fine.
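
A rough sketch of that combination, assuming a `sensors` table with `name`, `inserted_at` and `updated_at` columns (the module, table and sensor names are made up for illustration):

```elixir
defmodule MyApp.Repo.Migrations.AddDefaultSensors do
  use Ecto.Migration

  def up do
    create_if_not_exists unique_index(:sensors, [:name])

    # Run the queued index creation now, before the insert below.
    flush()

    now = NaiveDateTime.truncate(NaiveDateTime.utc_now(), :second)

    rows =
      for name <- ["temperature-1", "humidity-1"] do
        %{name: name, inserted_at: now, updated_at: now}
      end

    # With the unique index in place, rows that already exist are skipped.
    repo().insert_all("sensors", rows, on_conflict: :nothing)
  end

  def down do
    # Removes the inserted rows; the unique index is left in place.
    execute("DELETE FROM sensors WHERE name IN ('temperature-1', 'humidity-1')")
  end
end
```

Note that this still runs in the test environment, since the test database is migrated as well, so tests that expect an empty table would need adjusting.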

Alternatively, have infrastructure in place that runs jobs on deployment and then just add that job to that chain of jobs. It requires much more fiddling, but some people find it useful.
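
For a deployment that uses Elixir releases, such a job could be a small release task, sketched below. `MyApp.Release`, `:my_app`, the function name and the `sensors` columns are assumptions; the duplicate protection relies on a unique index existing on `name`.

```elixir
defmodule MyApp.Release do
  # Hypothetical module and app names; adjust to the project.
  @app :my_app

  def insert_sensors do
    # Load the application so its config (including :ecto_repos) is available.
    Application.load(@app)

    for repo <- Application.fetch_env!(@app, :ecto_repos) do
      # with_repo starts the repo, runs the function, then stops it again.
      {:ok, _result, _apps} =
        Ecto.Migrator.with_repo(repo, fn repo ->
          now = NaiveDateTime.truncate(NaiveDateTime.utc_now(), :second)
          rows = [%{name: "temperature-1", inserted_at: now, updated_at: now}]
          repo.insert_all("sensors", rows, on_conflict: :nothing)
        end)
    end

    :ok
  end
end
```

The deployment pipeline could then call it with something like `bin/my_app eval "MyApp.Release.insert_sensors()"`, where `my_app` stands for the actual release name.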