How to backup data to create sample seeds?

I run a small open-source project (https://feriendaten.de with the code on https://github.com/Feriendaten/feriendaten.de). To do any meaningful development one needs a set of sample data to run it. I have all the data on my development system in a PostgreSQL database. I would like to offer potential new developers an easy way to populate their development database with something like priv/repo/seeds.exs

Is there a good to go tool to extract data from a database and print it in a seeds.exs code style? Of course I can do it manually but that would take quite some time and I am wondering if not anybody else already solved this problem.

2 Likes

A common solution to something like this would be to export an SQL dump of the data with pg_dump. Would that be a viable approach? Developers can then seed the data with just psql, and you could write a Mix task to perform the seeding.

3 Likes

That would be a plan b for me. I don’t want to force a specific database to anybody.

IMO this is a task that always has some annoying specifics so people just roll their own code every time – I know I did.

If I were you I’d make code that just reads stuff off of your DB via Ecto and then serializes it to .exs e.g. SampleHelper.insert(%{...}) 1000+ times. :smiley:

1 Like

I did something similar to this. I made a python script that exported tables as JSON and then seeds.exs will open those JSON files.

I don’t see the advantage of creating your own intermediate format (a generated seeds.exs file, or json files as suggested) over the pg_dump alternative, as proposed above. But yes, you’d be nudging your developers (only developers though, anyone installing your webapp can still choose whatever db they wish) towards postgres (or whatever tool you use to export).

Leaving your options open for the database technology is a huge constraint, and I don’t see any upside. Bite the bullet, given your specific needs, and go with it. I don’t see any reason why an webapp should keep that option open…

1 Like

I‘d probably just move everything to seeds.exs once and maintain the data needed there going forward. Or even add some installer, which runs when the db is detected to be empty.

My need was that I have to just grab data from specific tables because the prod schema went totally sideways but I still want to develop the next version with a pretty radical schema change and at the same time maintain the prod db.

Then you’ll need to do a data conversion anyway, in the end, no? If you solve that problem first (or in lockstep with developing your new app), you get the representative development data for free, and mitigate the risk of having to do the conversion in the end.

1 Like

Sounds like a plain old CSV file will do you just fine then.

1 Like