App crashes after db connection is lost


A database outside my control shuts down from time to time, and my Elixir / Phoenix app crashes some time afterwards (I'm not sure how long it takes).

The following code is from the process that periodically runs some DB queries:

    def init(state) do
      schedule_work() # Schedule work to be performed at some point
      {:ok, state}
    end

    def handle_info(:work, state) do
      # Do the work you desire here
      schedule_work() # Reschedule once more
      {:noreply, state}
    end

    defp schedule_work() do
      Process.send_after(self(), :work, 10 * 1000) # run again in 10 seconds
    end

This is the error log:

    initial call: Elixir.RumblWeb.Periodically:init/1
    pid: <0.2223.0>
    registered_name: []
    exception exit: {timeout,
      in function  'Elixir.DBConnection.Poolboy':checkout/3 (lib/db_connection/poolboy.ex, line 112)
      in call from 'Elixir.DBConnection':checkout/2 (lib/db_connection.ex, line 920)
      in call from 'Elixir.DBConnection':run/3 (lib/db_connection.ex, line 742)
      in call from 'Elixir.DBConnection':execute/4 (lib/db_connection.ex, line 636)
      in call from 'Elixir.Ecto.Adapters.Postgres.Connection':execute/4 (lib/ecto/adapters/postgres/connection.ex, line 98)
      in call from 'Elixir.Ecto.Adapters.SQL':sql_call/6 (lib/ecto/adapters/sql.ex, line 256)
      in call from 'Elixir.Ecto.Adapters.SQL':execute_or_reset/7 (lib/ecto/adapters/sql.ex, line 436)
      in call from 'Elixir.Ecto.Repo.Queryable':execute/5 (lib/ecto/repo/queryable.ex, line 133)
    ancestors: ['Elixir.Rumbl.Supervisor',<0.2092.0>]
    message_queue_len: 0
    messages: []
    links: [<0.2093.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 6772
    stack_size: 27
    reductions: 10398

    =SUPERVISOR REPORT==== 18-Jan-2018::09:08:23 ===
         Supervisor: {local,'Elixir.Rumbl.Supervisor'}
         Context:    child_terminated
         Reason:     {timeout,
         Offender:   [{pid,<0.2223.0>},

Greatly appreciate help with this.



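This is expected behaviour of the supervision tree. When a child process crashes, its supervisor restarts it; if the restarts exceed :max_restarts within :max_seconds (3 restarts in 5 seconds by default), the supervisor itself terminates, and that can take the whole application down.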

To avoid app crashes, though, you can play around with the :max_restarts and :max_seconds options of the parent supervisor.
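For reference, these options are passed when the supervisor is started. The following is only a sketch: `Demo.Application`, `Demo.Supervisor`, the empty child list and the numbers 10 and 60 are placeholders for illustration, not values taken from the app in question.

```elixir
# Hypothetical application module; names and numbers are illustrative only.
defmodule Demo.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Your real children go here (repo, endpoint, the periodic worker, ...).
    ]

    opts = [
      strategy: :one_for_one,
      name: Demo.Supervisor,
      # Tolerate up to 10 child restarts within any 60-second window.
      # The defaults are 3 restarts within 5 seconds.
      max_restarts: 10,
      max_seconds: 60
    ]

    Supervisor.start_link(children, opts)
  end
end
```

Raising these limits only makes the supervisor more tolerant of restarts; it does not stop the child from crashing in the first place.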

Thank you.

You can do the work in a separate process, for instance using Task.Supervisor.async_nolink, or catch the exception. I personally prefer the separate process approach because it means the GenServer stays responsive while the task runs.
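A rough sketch of the separate-process approach could look like the following. It assumes a `Task.Supervisor` registered under the hypothetical name `Demo.TaskSupervisor` is already running (for example as a `{Task.Supervisor, name: Demo.TaskSupervisor}` child), and it takes the work as a zero-arity function in `opts[:work]` as a stand-in for the real DB queries. `Demo.Periodically` and the option names are made up for this example.

```elixir
defmodule Demo.Periodically do
  use GenServer
  require Logger

  # Hypothetical name; the Task.Supervisor must be started before this process.
  @task_supervisor Demo.TaskSupervisor

  # opts: :work is a zero-arity function (stand-in for the DB queries),
  # :interval is the delay between runs in milliseconds.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    state = %{
      work: Keyword.fetch!(opts, :work),
      interval: Keyword.get(opts, :interval, 10_000),
      task_ref: nil
    }

    schedule_work(state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:work, %{task_ref: nil} = state) do
    # Run the work in an unlinked task so a failure there cannot crash this process.
    task = Task.Supervisor.async_nolink(@task_supervisor, state.work)
    schedule_work(state.interval)
    {:noreply, %{state | task_ref: task.ref}}
  end

  def handle_info(:work, state) do
    # The previous run has not finished yet; skip this tick.
    schedule_work(state.interval)
    {:noreply, state}
  end

  def handle_info({ref, _result}, %{task_ref: ref} = state) when is_reference(ref) do
    # The task replied; drop its monitor and any pending :DOWN message.
    Process.demonitor(ref, [:flush])
    {:noreply, %{state | task_ref: nil}}
  end

  def handle_info({:DOWN, ref, :process, _pid, reason}, %{task_ref: ref} = state) do
    # The task died (for example on a pool checkout timeout); log it and carry on.
    Logger.error("periodic work failed: #{inspect(reason)}")
    {:noreply, %{state | task_ref: nil}}
  end

  def handle_info(_other, state), do: {:noreply, state}

  defp schedule_work(interval) do
    Process.send_after(self(), :work, interval)
  end
end
```

Because the task is not linked, a timeout in the queries reaches the GenServer only as a `:DOWN` message, so the GenServer keeps running and simply tries again on the next tick.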

I don’t think changing max_restarts / max_seconds is a good idea: there’s no cooldown between restarts, so you’ll get a crash loop and useless logging no matter what settings you use.
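If you go the catch route instead, the 10-second schedule itself acts as the cooldown between attempts. Here is a minimal sketch, assuming the failure surfaces as an exit (as in the log above) or as a raised exception; `Demo.SafeWork` is a hypothetical helper, not part of Ecto or DBConnection.

```elixir
defmodule Demo.SafeWork do
  require Logger

  # Runs the given zero-arity function and turns a raised exception or an
  # exit (such as a pool checkout timeout) into an {:error, _} tuple
  # instead of letting it crash the calling process.
  def run(fun) when is_function(fun, 0) do
    {:ok, fun.()}
  rescue
    exception ->
      Logger.error("work raised: #{inspect(exception)}")
      {:error, exception}
  catch
    :exit, reason ->
      Logger.error("work exited: #{inspect(reason)}")
      {:error, {:exit, reason}}
  end
end
```

Inside `handle_info(:work, state)` you would pass the query code to `Demo.SafeWork.run/1` and then call `schedule_work()` as before. The downside is that the GenServer is blocked until the queries finish or time out.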
