Postgrex invalid byte sequence for encoding "UTF8": 0x00

Postgres is complaining that I am inserting invalid UTF8 into a text type:

(Postgrex.Error) ERROR 22021 (character_not_in_repertoire) invalid byte sequence for encoding "UTF8": 0x00

I checked the inserts with String.codepoints, and the only thing I found that was interesting was:

iex(2)> String.codepoints("1554942951-607301301834131-2-pdf_invoice_preview.png-original.png")
["1", "5", "5", "4", "9", "4", "2", "9", "5", "1", "-", "6", "0", "7", "3", "0",
 "1", "3", "0", "1", "8", "3", "4", "1", "3", "1", "-", "2", "-", "p", "d", "f",
 "_", "i", "n", "v", "o", "i", "c", "e", "_", "p", "r", "e", "v", "i", "e", "w",
 ".", "p", ...]

It truncates the ending “ng-original.png” Is that possibly the problem?

Full log:

INSERT INTO "documents" ("content_type","filename","new_filename","unique_id","inserted_at","updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" ["image/png", "pdf_invoice_preview.png", "1554942951-607301301834131-2-pdf_invoice_preview.png-original.png", "1554942951-607301301834131-2", ~N[2019-04-11 00:35:51], ~N[2019-04-11 00:35:51]]
[debug] QUERY ERROR db=5.0ms
INSERT INTO "images" ("document_id","hash","uuid","inserted_at","updated_at") VALUES ($1,$2,$3,$4,$5) RETURNING "id" [12, <<38, 52, 51, 51, 51, 33, 37, 6, 0, 11, 122, 107, 171, 59, 158, 30>>, <<168, 246, 60, 242, 50, 132, 75, 81, 165, 27, 253, 122, 144, 194, 119, 156>>, ~N[2019-04-11 00:35:51], ~N[2019-04-11 00:35:51]]
[debug] QUERY OK db=0.2ms
rollback []
[info] Sent 500 in 173ms
[error] #PID<0.465.0> running DistanceWeb.Endpoint (connection #PID<0.464.0>, stream id 1) terminated
Server: localhost:4000 (http)
Request: POST /documents
** (exit) an exception was raised:
    ** (Postgrex.Error) ERROR 22021 (character_not_in_repertoire) invalid byte sequence for encoding "UTF8": 0x00
        (ecto_sql) lib/ecto/adapters/sql.ex:605: Ecto.Adapters.SQL.raise_sql_call_error/1
        (ecto) lib/ecto/repo/schema.ex:649: Ecto.Repo.Schema.apply/4
        (ecto) lib/ecto/repo/schema.ex:262: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
        (ecto) lib/ecto/multi.ex:579: Ecto.Multi.apply_operation/5
        (elixir) lib/enum.ex:1940: Enum."-reduce/3-lists^foldl/2-0-"/3
        (ecto) lib/ecto/multi.ex:563: anonymous fn/5 in Ecto.Multi.apply_operations/5
        (ecto_sql) lib/ecto/adapters/sql.ex:875: anonymous fn/3 in Ecto.Adapters.SQL.checkout_or_transaction/4
        (db_connection) lib/db_connection.ex:1415: DBConnection.run_transaction/4
        (ecto) lib/ecto/repo/transaction.ex:15: Ecto.Repo.Transaction.transaction/4
        (distance) lib/distance_web/controllers/document_controller.ex:24: DistanceWeb.DocumentController.create/2
        (distance) lib/distance_web/controllers/document_controller.ex:1: DistanceWeb.DocumentController.action/2
        (distance) lib/distance_web/controllers/document_controller.ex:1: DistanceWeb.DocumentController.phoenix_controller_pipeline/2
        (distance) lib/distance_web/endpoint.ex:1: DistanceWeb.Endpoint.instrument/4
        (phoenix) lib/phoenix/router.ex:275: Phoenix.Router.__call__/1
        (distance) lib/distance_web/endpoint.ex:1: DistanceWeb.Endpoint.plug_builder_call/2
        (distance) lib/plug/debugger.ex:122: DistanceWeb.Endpoint."call (overridable 3)"/2
        (distance) lib/distance_web/endpoint.ex:1: DistanceWeb.Endpoint.call/2
        (phoenix) lib/phoenix/endpoint/cowboy2_handler.ex:33: Phoenix.Endpoint.Cowboy2Handler.init/2

That’s the default behavior of IO.inspect; you can print the whole value with IO.inspect(some_value, limit: :infinity)

But the documents insert is followed by another, so I don’t think it’s the one that’s raising.

In the next insert, there’s a raw binary value for hash:

<<38, 52, 51, 51, 51, 33, 37, 6, 0, 11, 122, 107, 171, 59, 158, 30>>

which has a zero byte. What’s the type of the images.hash column?

3 Likes

images.hash was type :string (varchar in Postgres)

Had to change it to :binary (bytea in Postgres)

Thanks @al2o3cr!!

2 Likes