Added a field to my User module schema, and now old migration doesn't run with missing column error

sergio · March 13, 2018, 3:30pm

Pretty straight forward situation. I added a field in a new migration and added the field to the User module’s schema.

Now a previous migration fails to run saying:

** (Postgrex.Error) ERROR 42703 (undefined_column): column u0.is_bot does not exist
(db_connection) lib/db_connection.ex:1406: DBConnection.prepare_declare/4
(elixir) lib/stream.ex:1270: anonymous fn/5 in Stream.resource/3
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:806: Stream.do_transform/8
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:591: Stream.run/1
(ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4

The migration code that can’t run looks like:

  def up do
    Repo.transaction(
      fn ->
        User
        |> Repo.stream()
        |> Stream.each(fn user ->
          # Do something with `user`.
        end)
        |> Stream.run()
      end,
      timeout: :infinity
    )
  end

Any suggestions on how to fix this situation? Locally, I can comment out the is_bot field declaration and run the migrations, but this has broken CI completely for me.

What do you guys think of modifying my migration script to only select the fields it needs, thus preventing is_bot from being attempted?

sergio · March 13, 2018, 3:57pm

Changing my migration to this didn’t work

Same error:

User
|> select([:id, :email, :onboarded, :roles])
|> Repo.stream()
|> Stream.each(

** (Postgrex.Error) ERROR 42703 (undefined_column): column u0.is_bot does not exist
(db_connection) lib/db_connection.ex:1406: DBConnection.prepare_declare/4
(elixir) lib/stream.ex:1270: anonymous fn/5 in Stream.resource/3
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:806: Stream.do_transform/8
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:591: Stream.run/1
(ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4

idi527 · March 13, 2018, 3:59pm

What’s your database? If it’s an SQL one, why not write SQL in migrations?

sergio · March 13, 2018, 4:00pm

Yeah I can fallback to raw sql, but I was wondering why this seemingly common case isn’t supported easily by Ecto. Most likely, I’m messing up somewhere.

kokolegorille · March 13, 2018, 4:05pm

A migration should be reversible, why does previous migrations see is_bot? Also why use def up and not def change? def up should have a corresponding def down.

If You could show your last migration file (the one that add is_bot attribute), that would help.

sergio · March 13, 2018, 4:10pm

My most recent migration looks like this:

def change do
  alter table(:users) do
    add(:is_bot, :boolean, default: false)
  end
end

Again, nothing crazy.

outlog · March 13, 2018, 4:14pm

in that old migration you are using the User schema, that now has changed… it’s a big no no to not have isolated migrations - as you are experiencing

either rewrite the migration to not access the User schema, or make a copy of that schema at the time of migration, rename it, and include it in the migration…

sergio · March 13, 2018, 4:33pm

Can you elaborate how to make a copy of that schema for the migration?

kokolegorille · March 13, 2018, 5:03pm

I think @outlog meant something like having a UserBeforeChange schema, which does not include is_bot attribute, and use it instead of User in your migration.

The other way would be to replace

User
|> select([:id, :email, :onboarded, :roles])
|> Repo.stream()
...

with a schemaless query, something like this…

import Ecto.Query

from u in "users", select: %{id: u.id, email: u.email, ...}
|> Repo.stream()
...

sergio · March 13, 2018, 5:12pm

Oh yikes - that additional schema just for this one migration would be terrible. I’ll try your suggestion of using a schemaless query.

sergio · March 13, 2018, 5:39pm

Still not working

query =
  from(
    u in "users",
    select: %{id: u.id, email: u.email, onboarded: u.onboarded, roles: u.roles}
  )

Repo.transaction(
  fn ->
    query
    |> Repo.stream()
    |> Stream.each(fn user ->

** (Postgrex.Error) ERROR 42703 (undefined_column): column u0.is_bot does not exist
(db_connection) lib/db_connection.ex:1406: DBConnection.prepare_declare/4
(elixir) lib/stream.ex:1270: anonymous fn/5 in Stream.resource/3
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:806: Stream.do_transform/8
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:591: Stream.run/1
(ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4

kokolegorille · March 13, 2018, 5:43pm

What do You do here?

# Do something with `user`.

You do not receive a user, but a map when using schemaless queries.

sergio · March 13, 2018, 5:45pm

This:


Repo.transaction(
  fn ->
    query
    |> Repo.stream()
    |> Stream.each(fn user ->
      cond do
        user.onboarded && user.roles != nil && length(user.roles) > 1 ->
          roles = user.roles -- ["wrong"]

          user
          |> Ecto.Changeset.change(roles: roles)
          |> Repo.update()

        true ->
          IO.puts("Skip")
      end
    end)
    |> Stream.run()
  end,

kokolegorille · March 13, 2018, 5:51pm

From previous link.

Inserts, updates and deletes can also be done without schemas via insert_all, update_all and delete_all respectively:

Probably You should change this

user
  |> Ecto.Changeset.change(roles: roles)
  |> Repo.update()

to use Repo.update_all instead. There are examples in the linked post.

sergio · March 13, 2018, 5:53pm

@kokolegorille So by using Repo.update() it’s going through the schema declaration in User?

kokolegorille · March 13, 2018, 5:55pm

One easy test is to comment following code, and see if that pass…

      roles = user.roles -- ["wrong"]
      user
      |> Ecto.Changeset.change(roles: roles)
      |> Repo.update()

I would use a schemaless query again for update.

sergio · March 13, 2018, 7:54pm

I tried this and still seeing the same issue.

def up do
  query =
    from(
      u in "users",
      select: %{id: u.id, email: u.email, onboarded: u.onboarded, roles: u.roles}
    )

  Repo.transaction(
    fn ->
      query
      |> Repo.stream()
      |> Stream.each(fn user ->
        roles = ["test"]

        user
        |> Repo.update_all(set: [roles: roles])
      end)
      |> Stream.run()
    end,
    timeout: :infinity
  )
end

** (Postgrex.Error) ERROR 42703 (undefined_column): column u0.is_bot does not exist
(db_connection) lib/db_connection.ex:1406: DBConnection.prepare_declare/4
(elixir) lib/stream.ex:1270: anonymous fn/5 in Stream.resource/3
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:806: Stream.do_transform/8
(elixir) lib/stream.ex:1433: Enumerable.Stream.do_each/4
(elixir) lib/stream.ex:591: Stream.run/1
(ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4

axelson · March 13, 2018, 8:24pm

What’s wrong with defining the User schema that you want within the migration file? This would allow you to be insulated from future changes to the User schema and makes the migrations stand-alone.

defmodule MyApp.Repo.Migrations.ChangeUsers
  defmodule User do
    use MyAppWeb, :schema
    schema "users" do
      field :name
      field :email
    end
  end

  def up do
   Repo.transaction(
      fn ->
        User
        |> Repo.stream()
        |> Stream.each(fn user ->
          # Do something with `user`.
        end)
        |> Stream.run()
      end,
      timeout: :infinity
    )
  end
end

I believe that something along the lines is the general recommended approach, although I’m having trouble finding a mention in the docs. Maybe it was in the ecto e-book.

sergio · March 14, 2018, 4:34am

It’s a nuisance. Every time we write a migration we’ll have to explicitly create a schema with the fields or else we might regress and have to add it when a bug happens like today.

I still don’t understand why in my previous post .update_all is calling the schema in my user module.

Anybody have any ideas?

kip · March 14, 2018, 5:15am

I suspect (with limited confidence) that in the code above, user is of type %User{}. ie, your update_all is updating only one row and its using the schema from %User{}