Is it possible to automatically create migrations?

I’m working through Programming Phoenix ≥ 1.4 and was surprised to see that mix ecto.migration doesn’t actually create migrations automatically. Coming from a Django background, this seems like a major additional step every time we make changes to the database schema (which happens very often at a startup!)

Is there a reason for this, or is it something that just hasn’t been implemented yet?

1 Like

Hey, I’m not sure what you mean, and I don’t remember how it works in Django, but a migration file is generated with a command like

mix ecto.gen.migration add_users

That would give you an empty migration file. But if you run a command like

mix phx.gen.live Users User users name:string email:string

It would create a full set of CRUD files, including a migration that creates a users table with name and email columns.
There are also several other generator commands.
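For reference, the migration that phx.gen.live generates for the command above looks roughly like this (a sketch from memory; the module name assumes an app called MyApp):

```elixir
defmodule MyApp.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    # `change/0` is reversible: Ecto derives the corresponding "down" steps.
    create table(:users) do
      add :name, :string
      add :email, :string

      timestamps()
    end
  end
end
```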

3 Likes

In frameworks like Symfony/Doctrine you can run a command to edit a schema — for instance, to add a new field. The command asks for the field’s name and type, then updates the schema file and generates a migration for the new field.

It would be very cool to have this in Ecto, but I guess it’s a lot of work.

I’d guess the real reason Ecto doesn’t do that is that Ecto schemas don’t need to map 1:1 to tables. You can have a single schema mapping to all the fields of a single table, but you can also have a schema mapping to more fields than a single table has, or to a subset of a table’s fields. You can have many schemas mapping to one table, and a single schema used with multiple tables.

Effectively, when you add a field to a schema in Ecto, there’s no way to know whether you also need a new column in the database or not.
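To make that concrete, here’s a sketch (module and field names are made up) of two schemas backed by the same table:

```elixir
# Both schemas map to the "users" table, each exposing a different
# subset of its columns.
defmodule MyApp.Accounts.User do
  use Ecto.Schema

  schema "users" do
    field :email, :string
    field :hashed_password, :string
  end
end

defmodule MyApp.Profiles.User do
  use Ecto.Schema

  schema "users" do
    field :email, :string
    field :display_name, :string
  end
end
```

Adding a new field to one of these schemas could mean a new column, or it could just expose a column the other schema already uses — a generator can’t tell which.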

8 Likes

Even when they do, there are use cases for Ecto schemas that don’t require migrations at all. An Elixir app that accesses a preexisting database, or a database otherwise built and managed elsewhere, wouldn’t have migrations to update.

I think if you’re working with an ORM, there is an expectation of tight coupling between the ORM and the database server. Ecto is not an ORM but rather a data mapper with ORM-like capabilities, so the effort was likely put elsewhere. I could see a case for schema-managed migrations, but I’m not surprised they didn’t exist out of the gate, nor that they haven’t been seen as a priority up to this point.

While that is true, I’m pretty sure you can do the same with any ORM, since you can override the table name and opt out of generating the migration.

In Django, you edit the Python file where your models are defined, make whatever changes you want (adding/removing fields, indexes, tables, constraints, etc.), and then run makemigrations, which inspects the changes, compares them to the existing schema, and generates the entire migration for you (as a Python file). You then apply the migration in a separate step with migrate, which updates the database. You can also edit the migration manually if you need to, but that’s rare.

This is very useful especially when a project is young and you’re changing the schema many times per day.

3 Likes

Sounds pretty interesting, but I doubt it is that useful. Say you have a table with data in it and you want to change the type of a column — I doubt the generator would be capable of doing that automatically. Usually, when we’re talking about many changes to the database structure, you just write seeds and migrations and drop the database each time a new version is deployed.
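For what it’s worth, Ecto does let you express that kind of type change by hand; the part a generator can’t infer is whether existing rows cast cleanly (table and column names here are hypothetical):

```elixir
defmodule MyApp.Repo.Migrations.ChangePriceType do
  use Ecto.Migration

  def change do
    alter table(:products) do
      # `from:` tells Ecto the old type so the migration stays reversible;
      # whether existing integer values convert sensibly is up to you to verify.
      modify :price, :decimal, from: :integer
    end
  end
end
```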

Moreover, as people mentioned above, there are plenty of cases where your schema differs from the table structure, so you’d lose that flexibility with automatic migrations.

Well, coming from a .NET Core background, I decided to check out Elixir after trying Haskell and not finding any cool project to build with it. Phoenix sounds cool, and so far I’ve mostly liked the experience. The thing is, I have a habit of developing piece by piece, even when it comes to databases, as I hate having half-written features or “leftover” data and functions. This means I typically start by adding the fields with a standard purpose, and only then add field after field for other characteristics (I typically even start without the password field), which means I will now have to do it manually, field by field.
Also, most operations would be creation and deletion. EF has annotations to ignore data structures you don’t want mapped, and I kind of like that idea, since from my two cents of experience I’ve noticed that around 80% of the data gets mapped. Then again, C# is completely different and I have no experience with Elixir.

Regardless, I’m still having a good experience, but not having migrations generated automatically from what I wrote is annoying, since unless I’m working for a company I typically don’t do any documentation or proper structuring before implementation, unless it gets really hard and I need to redraw some part.

While I’ve never known any other way, I do find writing migrations slightly annoying, but really not that bad. It’s pretty easily solved with an editor macro, which is essentially what I do, though currently I still write them on the fly. This would be a very nice package for someone to write and publish, but I don’t think it’s a maintenance burden that should fall on the Ecto or Phoenix teams. Given all the points above about Ecto not being an ORM, and bigger projects breaking schemas into smaller, sometimes overlapping slices of the same table(s), I could see there being a lot of requests for specific behaviour and options that would make it more than just a nice little convenient add-on feature.

That’s down to personal preference and productivity flow, IMO. I personally don’t trust my memory to recall whether I put my socks in the washing machine 3 minutes ago, so when it comes to DB migrations I am extremely diligent by necessity. I have found upsides to being so.

Though I’ll agree with the C# and Python folks that having migrations generated automatically helps with development speed. Whether that speed is sustainable and doesn’t come with its own negative tradeoffs (like making it easy to introduce N+1 queries, as Rails was famous for) is debatable, and sadly the answers to those two questions are often “no” and “yes”.

I also find Ecto’s ability to have multiple schema modules all referencing the same table (each using a different subset of its columns) to be oversold. I haven’t ever needed it. Even in pretty huge projects I’ve contracted on, we still opted for a middle layer of modules that massage data and remove fields the current security context is not authorized to see. Using different Ecto schema modules referring to the same table was raised and always dismissed.

My summary after so many years in the profession is:

The idea of migrations is sound. The way most migration systems are implemented leaves a lot to be desired, Ecto’s included.

But that’s a pretty huge topic by itself: it would involve language-agnostic, declarative schema descriptions, the ability to generate and modify code from those declarations, automated linting and checking for SQL gotchas in the resulting files, and a lot more. Maybe after I retire I’ll finally work on something like this, because I can’t see myself embarking on such a huge task without being paid a penny.

2 Likes

Fair enough!

I have no statistics on the matter, so I’m talking out my ass when I say that I think it’s just underused. Out of the gate I always create a separate Users context, which lets Accounts be purely about authentication. Users.User shares email with Accounts.User, but that’s it. I feel that one of the biggest problems in every Rails app I worked on was the multiple god objects that resulted from having 1 object == 1 table. I know there are other ways around it, of course.

I like how many apps made a DB distinction between a User and a Profile, by the way. Maybe that would be the right move when battling god objects.

1 Like

Ya, that’s the idea. It can be nice to store everything in one table, though, as there is often shared data between them and it saves you a join. The context I add is about authorization and settings. Profile would likely be something else altogether, especially if users can have multiple profiles.

Rereading the original message, I now see the intent of automatic migration. This is a good read explaining that there’s more to migrations than an automated tool might be able to provide:

3 Likes

The Ash Framework has fully automatic schema migrations. You define resources similarly to Ecto schemas, then run a mix command to generate an Ecto migration.

Here’s the kind of workflow it handles:

  • Define some resources with attributes and relationships
  • Run mix ash_postgres.generate_migrations and an ecto migration file is created with fully functioning up and down functions
  • Add some more fields to your models
  • Run the mix task and it generates a migration that handles adding the new fields to your database schema
  • Rename a field and add some new resources with relationships
  • Run the mix task and it generates a migration that handles renaming the field as well as creating the new tables
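From memory, and hedged accordingly, a resource that ash_postgres can diff against the database looks something like this (module, table, and attribute names are all made up, and the exact DSL may differ between Ash versions):

```elixir
defmodule MyApp.Blog.Post do
  use Ash.Resource, data_layer: AshPostgres.DataLayer

  postgres do
    table "posts"
    repo MyApp.Repo
  end

  attributes do
    uuid_primary_key :id
    attribute :title, :string
    # Adding another attribute here and re-running
    # `mix ash_postgres.generate_migrations` yields a migration
    # that adds the matching column.
    attribute :body, :string
  end
end
```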
3 Likes

There’s an interesting tension between what you might do to create a “normalized” database structure, what an application developer may want their database to look like relative to immediate application needs (I have to reference the original question of this thread, don’t I?), and what database performance and similar runtime considerations would have you do.

Consider this example: you have a database which keeps track of “products”. These are products you manufacture, stock, and ultimately sell… and assume a single ERP-like application for now, just so we focus on database architecture rather than application architecture. There is information about the product which we might call “configuration data” — data which tells our application how to treat specific items — but then we also have quantitative data related to the product: how many units we have on hand, what the financial value of those units is, etc. On the surface, this seems like one table. The application in many contexts will want all of this information at once, and, hey, avoiding joins is good, right?

But now consider how we use the data. The configuration data will change, but infrequently. The quantitative data, however, may be updated multiple times per second as product is moved. When configuration changes (writes) happen, I end up disrupting the product-movement transaction flow — probably not seriously, or even enough to worry much about, but it happens (I’ll have some sort of lock on the row, though in practice it should be short-lived). And those product movements force a different consideration.

I’m going to stick with PostgreSQL here, since it will be the most common for this audience. I described the quantitative data as rapidly changing; now consider what happens if I use a single product table containing both my configuration data and the quantitative data (because both are just “product data”). Postgres doesn’t update data in place — it copies the row as a “delete”/“insert” combo, including all of that configuration data if it lives alongside my quantitative data. Ouch. A recipe for table bloat as well as slower-than-necessary writes (never mind the locking considerations). There are other issues in this vein, but we’re getting away from the point.

Naturally, in simpler, lower-scale applications we can get away without worrying about these things too much and just worry about our experience as developers. But there does come a point where aligning database structure directly to application data structures, or directly to developer ergonomics, becomes too simplistic, and we need to consider other factors such as the performance characteristics of our data-architecture choices.
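One way the product example above might be split, sketched as an Ecto migration (all table and column names are hypothetical):

```elixir
defmodule MyApp.Repo.Migrations.SplitProductQuantities do
  use Ecto.Migration

  def change do
    # Move the hot, frequently-updated quantities into a narrow table so
    # each MVCC row copy no longer drags the wide configuration row along.
    create table(:product_quantities) do
      add :product_id, references(:products, on_delete: :delete_all), null: false
      add :on_hand, :integer, null: false, default: 0
      add :value_cents, :bigint, null: false, default: 0

      timestamps()
    end

    create unique_index(:product_quantities, [:product_id])
  end
end
```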

I will say that the feature of automatic schema-based migrations is overrated. In my experience building prototypes, the gain from this feature is so small that it doesn’t make much sense, since you end up editing migrations and recreating the database every time anyway.

If we’re talking about production use, I definitely wouldn’t want this feature. Sometimes even explicit migrations are impossible or very hard — let alone something that does this under the hood.

In my opinion this is the same battle as between Phoenix and Rails: explicit declarations vs. magic under the hood. Personally, I will always choose the explicit approach, even if it is more verbose.

1 Like

Soooo… yes to all of that. I probably should have emphasized the “can” in “It can be nice being able to store everything in one table”. Nuance can get lost in hasty forum responses, where “avoiding a join” can come off as “I universally avoid joins.”

The example I gave is a real one of a user spanning two different contexts: before/during login, and after successful login. I’ve experimented with having a separate table for each context, which means either copying the email over to the “logged-in user” table or doing a join just to share the email. It turns out it’s even simpler, and a bit cleaner on the application side, to use one table with different schemas (in different Phoenix contexts) defining cross-sections of the relevant data.

I’m otherwise very pro breaking things up as you outlined. The bulk of my experience building production apps on a team has been in e-commerce, with a healthy stint working on a massive ERP 🙂 One of the more frustrating experiences was working with a certain e-commerce solution that treated the entire cart → checkout → order → fulfillment process as an Order with a state column. I saw a similar idea propagated across several models in the ERP system I worked on.

While I have yet to work on an Elixir project of a similar size, I feel Ecto has an advantage here as well. Setting aside that Ecto can seamlessly handle [database] views, it’s also more intuitive to use it to create schemas tailored to scenarios (contexts) where heavy joins are at play.
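As an illustration of the views point, a schema can sit on top of a database view just as easily as a table (names here are hypothetical; the view itself would be created in a migration, e.g. with execute/1):

```elixir
defmodule MyApp.Reports.OrderSummary do
  use Ecto.Schema

  # "order_summaries" is a SQL view, not a table — Ecto doesn't care,
  # as long as we only read from it.
  @primary_key false
  schema "order_summaries" do
    field :order_id, :integer
    field :total_cents, :integer
    field :line_count, :integer
  end
end
```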

If we’re talking about production use, I definitely wouldn’t want this feature. Sometimes even explicit migrations are impossible or very hard — let alone something that does this under the hood.

I don’t think anyone is talking about automatic migrations that, like, “automagically run when you deploy” or something, right? If so, that is 10000% a bad idea.

Migration generators like the one in Ash are conveniences that give you a starting point that you should then review yourself, and often edit. So you make some changes to your app, run the migration generator, and it spits out regular Ecto migrations that you can edit to your heart’s content (it won’t try to change them afterwards or anything).

7 Likes