Unexpected union results with certain Ecto selects

So I bumped into this weird behaviour:

query = from(p in Post, select: {:foo, p.id})
  
from(p in Post)
|> select([p], {"bar", p.id})
|> union_all(^query)
|> Repo.all()

Looks like bar string is passed to the query, while :foo atom is not, so the queries end up having different number of columns:

** (Exqlite.Error) SELECTs to the left and right of UNION ALL do not have the same number of result columns
SELECT 'bar', p0."id" FROM "posts" AS p0 UNION ALL SELECT p0."id" FROM "posts" AS p0
    (ecto_sql 3.12.1) lib/ecto/adapters/sql.ex:1096: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto_sql 3.12.1) lib/ecto/adapters/sql.ex:994: Ecto.Adapters.SQL.execute/6
    (ecto 3.12.4) lib/ecto/repo/queryable.ex:232: Ecto.Repo.Queryable.execute/4
    (ecto 3.12.4) lib/ecto/repo/queryable.ex:19: Ecto.Repo.Queryable.all/3
    ecto_sql.exs:50: Main.main/0
    ecto_sql.exs:55: (file)

There’s also an interesting interaction if both queries use an atom:

query = from(p in Post, select: {:total_views, sum(p.views)})
  
from(p in Post)
|> select([p], {:highest_views, max(p.views)})
|> union_all(^query)
|> Repo.all()
|> IO.inspect()
# [highest_views: 15, highest_views: 25]

Is this a bug or a known limitation perhaps?

Here's a one file example for anyone interested
Mix.install([
  {:ecto_sql, "~> 3.12"},
  {:ecto_sqlite3, "~> 0.17.4"}
])

Application.put_env(:foo, Repo, database: "mix_install_examples.db")

defmodule Repo do
  use Ecto.Repo, adapter: Ecto.Adapters.SQLite3, otp_app: :foo
end

defmodule Migration0 do
  use Ecto.Migration

  def change do
    create table("posts") do
      add(:title, :string)
      add(:views, :integer)
      timestamps(type: :utc_datetime_usec)
    end
  end
end

defmodule Post do
  use Ecto.Schema

  schema "posts" do
    field(:title, :string)
    field(:views, :integer)
    timestamps(type: :utc_datetime_usec)
  end
end

defmodule Main do
  import Ecto.Query, warn: false

  def main do
    Repo.__adapter__().storage_down(Repo.config())
    :ok = Repo.__adapter__().storage_up(Repo.config())
    {:ok, _} = Supervisor.start_link([Repo], strategy: :one_for_one)
    Ecto.Migrator.run(Repo, [{0, Migration0}], :up, all: true, log_migrations_sql: :info)

    Repo.insert!(%Post{
      title: "Post 1",
      views: 10
    })

    Repo.insert!(%Post{
      title: "Post 2",
      views: 15
    })

    query = from(p in Post, select: {:total_views, sum(p.views)})
    
    from(p in Post)
    |> select([p], {:highest_views, max(p.views)})
    |> union_all(^query)
    |> Repo.all()
    |> IO.inspect()
  end
end

Main.main()

:wave: @martosaur

This is not an answer but just another observation :slight_smile:

Literals don’t seem to be passed to the database, and instead are used only to build the results in “postprocess” (except if there are unions or other combinations, Use database returned values for literals · elixir-ecto/ecto@4bd6b40 · GitHub). If you “plan” the query struct and inspect it, the atom and tuple would go away and only the sum would stay in fields:

query = from(p in "posts", select: {:total_views, sum(p.views)})
{query, _, _} = Ecto.Adapter.Queryable.plan_query(:all, Ecto.Adapters.SQLite3, query)
query.select

%Ecto.Query.SelectExpr{
  expr: {:{}, [],
   [
     :total_views,
     {:sum, [], [{{:., [type: :any], [{:&, [], [0]}, :views]}, [], []}]}
   ]},
  file: "iex",
  line: 11,
  fields: [{:sum, [], [{{:., [type: :any], [{:&, [], [0]}, :views]}, [], []}]}],
  params: nil,
  take: %{},
  subqueries: [],
  aliases: %{}
}

And the adapters use Ecto.Query.SelectExpr.fields to build the SQL for select:

All of this happens in ecto/lib/ecto/query/planner.ex at master · elixir-ecto/ecto · GitHub, in normalize_select and collect_fields:


Maybe, if there are combinations, atoms could be “casted” to strings, since they can be considered literals too … :thinking: Keep (and stringify) atoms in SELECT fields if keeping literals (for combinations) · Issue #4546 · elixir-ecto/ecto · GitHub

2 Likes

In your second example, [highest_views: 15, highest_views: 25] is merely syntactic sugar for [{:highest_views, 15}, {:highest_views, 25}]. I haven’t studied the Ecto code, but the error makes sense to me in that you are union_alling on a positional structure (a tuple) where one of those positions has a different type in each query. This tracks with how databases work: you can’t have a column where the values in different rows have different types.

This isn’t the same error you would get in Postgres so I’m assuming this has to do with how Ecto shuffles things around to try and make sense of the query, but I’m not really sure. I haven’t studied the Ecto code myself.

In your second example, [highest_views: 15, highest_views: 25] is merely syntactic sugar for [{:highest_views, 15}, {:highest_views, 25}]

To be clear the unexpected behaviour here is not that 2 tuples are printed as a keyword list, but that the first atom total_views became highest_views in the result.

1 Like

It’s true not only for atoms but for a lot of other literals in selects. Use Ecto.Enum to select strings as atoms in unions or just select strings and cast them manually.

When selecting a literal atom, its value must be the same across all queries. Otherwise, the value from the parent query will be applied to all other queries. This also holds true for selecting maps with atom keys.

2 Likes