Added a field to my User module schema, and now old migration doesn't run with missing column error

Pretty straight forward situation. I added a field in a new migration and added the field to the User module’s schema.

Now a previous migration fails to run saying:

** (Postgrex.Error) ERROR 42703 (undefined_column): column u0.is_bot does not exist
(db_connection) lib/db_connection.ex:1406: DBConnection.prepare_declare/4
(elixir) lib/stream.ex:1270: anonymous fn/5 in Stream.resource/3
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:806: Stream.do_transform/8
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:591: Stream.run/1
(ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4

The migration code that can’t run looks like:

  def up do
    Repo.transaction(
      fn ->
        User
        |> Repo.stream()
        |> Stream.each(fn user ->
          # Do something with `user`.
        end)
        |> Stream.run()
      end,
      timeout: :infinity
    )
  end

Any suggestions on how to fix this situation? Locally, I can comment out the is_bot field declaration and run the migrations, but this has broken CI completely for me.

What do you guys think of modifying my migration script to only select the fields it needs, thus preventing is_bot from being attempted?

Changing my migration to this didn’t work :frowning:

Same error:

User
|> select([:id, :email, :onboarded, :roles])
|> Repo.stream()
|> Stream.each(
** (Postgrex.Error) ERROR 42703 (undefined_column): column u0.is_bot does not exist
(db_connection) lib/db_connection.ex:1406: DBConnection.prepare_declare/4
(elixir) lib/stream.ex:1270: anonymous fn/5 in Stream.resource/3
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:806: Stream.do_transform/8
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:591: Stream.run/1
(ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4

What’s your database? If it’s an SQL one, why not write SQL in migrations?

Yeah I can fallback to raw sql, but I was wondering why this seemingly common case isn’t supported easily by Ecto. Most likely, I’m messing up somewhere.

A migration should be reversible, why does previous migrations see is_bot? Also why use def up and not def change? def up should have a corresponding def down.

If You could show your last migration file (the one that add is_bot attribute), that would help.

My most recent migration looks like this:

def change do
  alter table(:users) do
    add(:is_bot, :boolean, default: false)
  end
end

Again, nothing crazy.

in that old migration you are using the User schema, that now has changed… it’s a big no no to not have isolated migrations - as you are experiencing :wink:

either rewrite the migration to not access the User schema, or make a copy of that schema at the time of migration, rename it, and include it in the migration…

2 Likes

Can you elaborate how to make a copy of that schema for the migration?

I think @outlog meant something like having a UserBeforeChange schema, which does not include is_bot attribute, and use it instead of User in your migration.

The other way would be to replace

User
|> select([:id, :email, :onboarded, :roles])
|> Repo.stream()
...

with a schemaless query, something like this…

import Ecto.Query

from u in "users", select: %{id: u.id, email: u.email, ...}
|> Repo.stream()
...
2 Likes

Oh yikes - that additional schema just for this one migration would be terrible. I’ll try your suggestion of using a schemaless query.

Still not working :thinking:

query =
  from(
    u in "users",
    select: %{id: u.id, email: u.email, onboarded: u.onboarded, roles: u.roles}
  )

Repo.transaction(
  fn ->
    query
    |> Repo.stream()
    |> Stream.each(fn user ->
** (Postgrex.Error) ERROR 42703 (undefined_column): column u0.is_bot does not exist
(db_connection) lib/db_connection.ex:1406: DBConnection.prepare_declare/4
(elixir) lib/stream.ex:1270: anonymous fn/5 in Stream.resource/3
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:806: Stream.do_transform/8
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:591: Stream.run/1
(ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4

What do You do here?

# Do something with `user`.

You do not receive a user, but a map when using schemaless queries.

This:


Repo.transaction(
  fn ->
    query
    |> Repo.stream()
    |> Stream.each(fn user ->
      cond do
        user.onboarded && user.roles != nil && length(user.roles) > 1 ->
          roles = user.roles -- ["wrong"]

          user
          |> Ecto.Changeset.change(roles: roles)
          |> Repo.update()

        true ->
          IO.puts("Skip")
      end
    end)
    |> Stream.run()
  end,

From previous link.

Inserts, updates and deletes can also be done without schemas via insert_all, update_all and delete_all respectively:

Probably You should change this

user
  |> Ecto.Changeset.change(roles: roles)
  |> Repo.update()

to use Repo.update_all instead. There are examples in the linked post.

@kokolegorille So by using Repo.update() it’s going through the schema declaration in User?

One easy test is to comment following code, and see if that pass…

      roles = user.roles -- ["wrong"]
      user
      |> Ecto.Changeset.change(roles: roles)
      |> Repo.update()

I would use a schemaless query again for update.

2 Likes

I tried this and still seeing the same issue. :scream:

def up do
  query =
    from(
      u in "users",
      select: %{id: u.id, email: u.email, onboarded: u.onboarded, roles: u.roles}
    )

  Repo.transaction(
    fn ->
      query
      |> Repo.stream()
      |> Stream.each(fn user ->
        roles = ["test"]

        user
        |> Repo.update_all(set: [roles: roles])
      end)
      |> Stream.run()
    end,
    timeout: :infinity
  )
end
** (Postgrex.Error) ERROR 42703 (undefined_column): column u0.is_bot does not exist
(db_connection) lib/db_connection.ex:1406: DBConnection.prepare_declare/4
(elixir) lib/stream.ex:1270: anonymous fn/5 in Stream.resource/3
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:806: Stream.do_transform/8
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:591: Stream.run/1
(ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4

What’s wrong with defining the User schema that you want within the migration file? This would allow you to be insulated from future changes to the User schema and makes the migrations stand-alone.

defmodule MyApp.Repo.Migrations.ChangeUsers
  defmodule User do
    use MyAppWeb, :schema
    schema "users" do
      field :name
      field :email
    end
  end

  def up do
   Repo.transaction(
      fn ->
        User
        |> Repo.stream()
        |> Stream.each(fn user ->
          # Do something with `user`.
        end)
        |> Stream.run()
      end,
      timeout: :infinity
    )
  end
end

I believe that something along the lines is the general recommended approach, although I’m having trouble finding a mention in the docs. Maybe it was in the ecto e-book.

1 Like

It’s a nuisance. Every time we write a migration we’ll have to explicitly create a schema with the fields or else we might regress and have to add it when a bug happens like today.

I still don’t understand why in my previous post .update_all is calling the schema in my user module.

Anybody have any ideas?

I suspect (with limited confidence) that in the code above, user is of type %User{}. ie, your update_all is updating only one row and its using the schema from %User{}

1 Like