Help with TLS connection unexpectedly closing

ijunaidfarooq · July 13, 2017, 5:57am

Thanks for the detailed explanation.

What could be the reason for this repo crash?

Most of the time, I also see this kind of error…

It also comes up when the evercam server tries to connect to the snapshot DB (SnapshotRepo). Could this be the reason for the repo crash?

** (Postgrex.Error) ssl recv: TLS connection is closed - :closed
            (ecto) lib/ecto/adapters/sql.ex:412: Ecto.Adapters.SQL.sql_call!/6
            (ecto) lib/ecto/adapters/sql.ex:394: Ecto.Adapters.SQL.execute/6
            (ecto) lib/ecto/repo/queryable.ex:127: Ecto.Repo.Queryable.execute/5
            (ecto) lib/ecto/repo/queryable.ex:40: Ecto.Repo.Queryable.all/4
            (ecto) lib/ecto/repo/queryable.ex:64: Ecto.Repo.Queryable.one/4
   (evercam_media) lib/mix/tasks/add_camera_logs.ex:15: anonymous fn/1 in EvercamMedia.AddCameraLogs.run/0
          (elixir) lib/enum.ex:651: Enum."-each/2-lists^foreach/1-0-"/2
          (elixir) lib/enum.ex:651: Enum.each/2

UPDATE:

DBConnection.ConnectionError{message: "ssl recv: closed"

Update:

[error] Postgrex.Protocol (#PID<0.2430.0>) disconnected: ** (DBConnection.ConnectionError) client #PID<0.17156.11> timed out because it checked out the connection for longer than 60000ms

Our config for that DB is:

config :evercam_media, EvercamMedia.SnapshotRepo,
  adapter: Ecto.Adapters.Postgres,
  url: System.get_env("SNAPSHOT_DATABASE_URL"),
  socket_options: [keepalive: true],
  timeout: 60_000,
  pool_timeout: 60_000,
  pool_size: 100,
  lazy: false,
  ssl: true

peerreynders · July 12, 2017, 7:38pm

Topic Background

Continued from

Leading to:

I would suggest that you start an entirely new topic that focuses on helping you diagnose the issues with

https://github.com/evercam/evercam-server/blob/master/lib/evercam_media/repo.ex#L9

It must have had problems for some time now, since exists?/1 was added over a year ago to make it possible to determine whether or not the repo was still responsive.

It’s configured to start up with the rest of the application here:
https://github.com/evercam/evercam-server/blob/master/lib/evercam_media.ex#L22

the supervisor strategy being
https://github.com/evercam/evercam-server/blob/master/lib/evercam_media.ex#L36

:one_for_one means that when the repo crashes, the supervisor simply restarts it, leaving the other children untouched.
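
To make the restart behaviour concrete, here is a minimal sketch. It is not the evercam-server code: Demo.FakeRepo is a hypothetical stand-in for the repo, and it assumes a recent Elixir (1.5 or later) where `use Agent` defines a child spec. It kills the child and waits for the supervisor to bring it back under a new pid.

```elixir
# Hypothetical stand-in for a repo process; NOT the actual EvercamMedia.Repo.
defmodule Demo.FakeRepo do
  use Agent

  def start_link(_opts) do
    # Registered under the module name so callers can look it up.
    Agent.start_link(fn -> :ok end, name: __MODULE__)
  end
end

defmodule Demo do
  # Starts a :one_for_one supervisor, kills its child, and returns
  # {old_pid, new_pid} once the supervisor has restarted the child.
  def run do
    children = [Demo.FakeRepo]
    {:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)

    old_pid = Process.whereis(Demo.FakeRepo)
    ref = Process.monitor(old_pid)

    # Simulate a crash.
    Process.exit(old_pid, :kill)

    # Wait until the old process is confirmed dead.
    receive do
      {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
    end

    new_pid = wait_for_restart(old_pid)
    Supervisor.stop(sup)
    {old_pid, new_pid}
  end

  # Between the crash and the restart the name is unregistered, so
  # Process.whereis/1 returns nil; poll until a new pid shows up.
  defp wait_for_restart(old_pid) do
    case Process.whereis(Demo.FakeRepo) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        wait_for_restart(old_pid)
    end
  end
end
```

Calling Demo.run() returns two different pids. The short window in which the name resolves to nil is the kind of gap in which a caller could fail while the child is being restarted.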

Just a guess: the failures you have been witnessing may have happened shortly after the repo crashed but before it was restarted by the application. Basically, scour your logs for any evidence that may reveal why the repo is behaving so erratically.