Hello folks,
Today I encountered a strange bug within my project. The regarding project handles github commits and writes them to the database (postgres).
When I do a mix ecto.reset
I will drop and recreate the database, followed by an import script that will retrieve the last Github commits. The bug is happening during the last part.
During this route:
defp fetch_and_upsert_commits(opts) do
try do
opts |> Github.list_commits() |> Enum.map(&PROJ.Organization.upsert_commit/1)
rescue
err in [Ecto.ConstraintError] ->
# Logger.info(err.message)
IO.inspect(err, label: "Why is this happening? oO")
end
end
the rescue is triggered. The IO.inspect logs
Why is this happening? oO:
%Ecto.ConstraintError{
constraint: "commits_pkey",
message: "constraint error when attempting to insert struct:\n\n * commits_pkey (unique_constraint)\n\nIf you would like to stop this constraint violation from raising an\nexception and instead add it as an error to your changeset, please\ncall `unique_constraint/3` on your changeset with the constraint\n`:name` as an option.\n\nThe changeset defined the following constraints:\n\n * commits_sha_index (unique_constraint)\n",
type: :unique
}
The code of PROJ.Organization.upsert_commit/1
is this:
def get_commit(sha), do: Repo.get(Commit, sha)
def upsert_commit(%{sha: sha} = attrs) do
(get_commit(sha) || %Commit{})
|> Commit.changeset(attrs)
|> Repo.insert_or_update()
end
The schema looks like this:
defmodule PROJ.Organization.Work.Commit.Commit do
@moduledoc false
use Ecto.Schema
import Ecto.Changeset
@primary_key {:sha, :string, []} # I would expect that the same id here already indicates to update that entry instead of creating a new one
@derive {Phoenix.Param, key: :sha}
schema "commits" do
field :author, :string
field :date_time, :utc_datetime
field :parents, {:array, :string}
field :message, :string
embeds_many(:file_changes, PROJ.Organization.Work.Commit.FileChange, on_replace: :delete)
embeds_many(:deployment_infos, PROJ.Organization.Work.Commit.DeploymentInfo,
on_replace: :delete
)
embeds_many(:statuses, PROJ.Organization.Work.Commit.Status, on_replace: :delete)
timestamps()
end
@doc false
def changeset(commit, attrs) do
commit
|> cast(attrs, [:sha, :author, :date_time, :parents, :message])
|> cast_embed(:file_changes)
|> cast_embed(:deployment_infos)
|> cast_embed(:statuses)
|> validate_required([:sha, :author, :date_time, :parents])
|> validate_length(:parents, min: 1, max: 2)
|> validate_length(:file_changes, min: 1)
|> unique_constraint(:sha) #in my opinion this should enforce an validation error tuple
end
end
To give you all the info on my problem, this is the migration regarding that table:
defmodule PROJ.Repo.Migrations.CreateCommits do
use Ecto.Migration
def change do
create table(:commits, primary_key: false) do
add :sha, :string, primary_key: true
add :author, :string
add :file_changes, :map
add :date_time, :utc_datetime
add :statuses, :map
add :deployment_infos, :map
add :parents, {:array, :string}
timestamps()
end
create unique_index(:commits, [:sha])
end
end
What should happen in my opinion:
The repo should understand, that this is something to update, not something to insert. And if that does not work, it should at least mark the changeset as invalid, because it violates they unique_constraint
. Somewho my stuff avoids both these cases .
Of course I could keep that try ... rescue
as a work around, but that is stupid and I do want to understand what goes wrong here or more precise: Where am I wrong?
So any guesses?