Restarting Phoenix app gives me lots of Postgrex and DBConnection errors

On some of my app deploys, I have pages of Postgrex and DBConnection errors.

Feb 28 09:06:23 postgrex Postgrex.Protocol 14:06:23.313 mfa=Postgrex.Protocol.handshake_shutdown/3 [error] Postgrex.Protocol (#PID<0.3358.0>) timed out because it was handshaking for longer than 15000ms 
Feb 28 09:06:23 db_connection DBConnection.Connection 14:06:23.314 mfa=DBConnection.Connection.handle_event/4 [error] Postgrex.Protocol (#PID<0.3358.0>) failed to connect: ** (DBConnection.ConnectionError) tcp recv (idle): closed 
Feb 28 09:06:23 postgrex Postgrex.Protocol 14:06:23.688 mfa=Postgrex.Protocol.handshake_shutdown/3 [error] Postgrex.Protocol (#PID<0.3360.0>) timed out because it was handshaking for longer than 15000ms 
Feb 28 09:06:23 db_connection DBConnection.Connection 14:06:23.690 mfa=DBConnection.Connection.handle_event/4 [error] Postgrex.Protocol (#PID<0.3360.0>) failed to connect: ** (DBConnection.ConnectionError) tcp recv (idle): closed 
Feb 28 09:06:23 postgrex Postgrex.Protocol 14:06:23.927 mfa=Postgrex.Protocol.handshake_shutdown/3 [error] Postgrex.Protocol (#PID<0.3357.0>) timed out because it was handshaking for longer than 15000ms 
Feb 28 09:06:23 db_connection DBConnection.Connection 14:06:23.929 mfa=DBConnection.Connection.handle_event/4 [error] Postgrex.Protocol (#PID<0.3357.0>) failed to connect: ** (DBConnection.ConnectionError) tcp recv (idle): closed 
Feb 28 09:06:24 postgrex Postgrex.Protocol 14:06:24.332 mfa=Postgrex.Protocol.handshake_shutdown/3 [error] Postgrex.Protocol (#PID<0.3363.0>) timed out because it was handshaking for longer than 15000ms 
Feb 28 09:06:24 db_connection DBConnection.Connection 14:06:24.334 mfa=DBConnection.Connection.handle_event/4 [error] Postgrex.Protocol (#PID<0.3363.0>) failed to connect: ** (DBConnection.ConnectionError) tcp recv (idle): closed 

My database has plenty of connections available and no errors, and when I try to connect to it manually it responds quickly.

How can I determine if this issue is with the startup of my new app, or teardown of my previous app? I’m deploying on Fly.io, targeting a single machine.

Thanks!

Well, it’s unlikely to have anything to do with your old application.

You can try something like fly m clone to set up another machine and see if it also has Postgres connection errors. I’m thinking this might be something DNS-related, but there’s just not enough information here.

Yeah, that’s a good bet. I started getting a lot more of these after my DB moved from IPv4 to IPv6 and I changed the address accordingly. I found a single post about slow IPv6 connections on Fly.
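
For anyone checking the IPv6 angle: as far as I know, the Repo only connects over IPv6 if the :inet6 socket option is passed. A minimal sketch of what that looks like in the runtime config, assuming an app called :my_app with a MyApp.Repo and an ECTO_IPV6 environment variable (all three names are placeholders, not taken from my actual setup):

```elixir
# config/runtime.exs
import Config

# ECTO_IPV6 is an assumed env var name; set it to "true" or "1" to opt in.
maybe_ipv6 = if System.get_env("ECTO_IPV6") in ~w(true 1), do: [:inet6], else: []

# :my_app and MyApp.Repo are placeholders for your own app and repo module.
config :my_app, MyApp.Repo,
  url: System.get_env("DATABASE_URL"),
  socket_options: maybe_ipv6
```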

I’m happy to add more info; I’m just not sure where to start. This happens sporadically on my deploys, so even spinning up a new machine isn’t guaranteed to reproduce the issue.

I need a few more deploys to confirm, but it looks like this was an issue with DB pooling. I suspect that during deploys there weren’t enough connection slots for the app that was shutting down and the app that was starting up at the same time.
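
A rough sketch of the pool sizing idea, in case it helps someone else. The names :my_app, MyApp.Repo, and POOL_SIZE are placeholders, and the default of 10 is made up; the point is only that the old and new releases each hold a full pool while they overlap:

```elixir
# config/runtime.exs
import Config

# POOL_SIZE is an assumed env var; 10 is a placeholder default.
pool_size = String.to_integer(System.get_env("POOL_SIZE") || "10")

# While the old and new releases overlap during a deploy, the database can
# see up to 2 * pool_size connections from this app. Keep that total below
# the server's connection limit, minus whatever other clients use.
config :my_app, MyApp.Repo,
  pool_size: pool_size
```

If the database or a pooler in front of it caps connections lower than twice the pool size, the new release may fail to connect until the old one lets go of its connections, which would match the errors only showing up around deploys.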
