Postgrex / CockroachDB connection stopped working

Man. No idea what’s going on.

Somehow, a very stable postgrex config just seemed to stop working.

I isolated it to a new, empty project to test config…

I can connect to cockroach db via psql no problem, but no so much via postgrex.

config = [
  username: System.get_env("CDB_USERNAME"),
  password: System.get_env("CDB_PASSWORD"),
  database: System.get_env("CDB_DATABASE"),
  parameters: [options: System.get_env("CDB_OPTIONS")],
  hostname: System.get_env("CDB_HOSTNAME"),
  port: String.to_integer(System.get_env("CDB_PORT")),
  ssl: [cacertfile: Path.expand("priv/certs/ca-cert.crt")],
  stacktrace: true,
  show_sensitive_data_on_connection_error: true
]

[
  username: "*******",
  password: "**************",
  database: "************",
  parameters: [options: "--cluster=*****"],
  hostname: "*****.aws-us-east-1.cockroachlabs.cloud",
  port: 26257,
  ssl: [cacertfile: "/Users/***/dev/***/***/priv/certs/ca-cert.crt"],
  stacktrace: true,
  show_sensitive_data_on_connection_error: true
]

Postgrex.start_link(config)

I get this error:

{:ok, #PID<0.215.0>}

15:40:42.385 [error] :gen_statem #PID<0.217.0> terminating
** (ArgumentError) errors were found at the given arguments:

  * 1st argument: not a binary

    (stdlib 6.2) :binary.copy(nil)
    (postgrex 0.19.3) lib/postgrex/types.ex:137: Postgrex.Types.build_type_info/1
    (postgrex 0.19.3) lib/postgrex/protocol.ex:1134: Postgrex.Protocol.bootstrap_recv/4
    (postgrex 0.19.3) lib/postgrex/protocol.ex:729: Postgrex.Protocol.handshake/3
    (postgrex 0.19.3) lib/postgrex/protocol.ex:209: Postgrex.Protocol.connect_endpoints/6
    (db_connection 2.7.0) lib/db_connection/connection.ex:74: DBConnection.Connection.handle_event/4
    (stdlib 6.2) gen_statem.erl:3737: :gen_statem.loop_state_callback/11
    (stdlib 6.2) proc_lib.erl:329: :proc_lib.init_p_do_apply/3
Queue: [internal: {:connect, :init}]
Postponed: []
State: Postgrex.Protocol
Callback mode: :handle_event_function, state_enter: false

I’ve confirmed that the all the envs are fine, I also used env vars to connect via psql. Also, the cert file is readable, if I change that I get a cert file error.

Any thoughts or clue as to how to troubleshoot would be most appreciated!

As I said, this has been stable for months. I have no idea what could have changed.

Looks like something in the protocol - I’m sure someone who works on Postgrex can comment on what this code does (I haven’t the slightest), but:

It sounds like you’re using Cockroach’s hosted cloud product, and you haven’t changed anything about your setup, right? So it sounds like they probably deployed a new version and broke something. If that’s the case, perhaps you should consider contacting their support to let them know? You’re probably not the only one.

That is, of course, assuming it’s not something really weird (like a crash induced by parsing a particular date which happened to be today, or something).

1 Like

lol I found it reading their release notes, looks like it’s already patched on stable

Ask them to upgrade your cluster I guess?

1 Like

Wacky that I can connect using psql! Thanks for this reply!

I didn’t read too closely but I think they broke the protocol in a way that didn’t affect most clients but happened to affect Postgrex (maybe others? who knows).

They probably only tested their changes on more popular clients.

The real mystery is why they waited until now to deploy a verison with known bugs. The fix was stable two weeks ago!