App crashes after db connection is lost

Hi,

A database outside my control shuts down from time to time, and Elixir / Phoenix crashes some time afterwards (I'm not sure how long it takes).

The following code is for a function which does some DB queries:

  def init(state) do
    schedule_work() # Schedule work to be performed at some point
    {:ok, state}
  end

  def handle_info(:work, state) do
    # Do the work you desire here
    schedule_work() # Reschedule once more
    {:noreply, state}
  end

  defp schedule_work() do
    Process.send_after(self(), :work, 10 * 1000) # every 10 seconds
    someFunction()
  end

This is the error log:

crasher:
    initial call: Elixir.RumblWeb.Periodically:init/1
    pid: <0.2223.0>
    registered_name: []
    exception exit: {timeout,
                        {gen_server,call,
                            [<0.2215.0>,
                             {checkout,#Ref<0.697226965.3478388737.82551>,
                                 true,15000},
                             5000]}}
      in function  'Elixir.DBConnection.Poolboy':checkout/3 (lib/db_connection/poolboy.ex, line 112)
      in call from 'Elixir.DBConnection':checkout/2 (lib/db_connection.ex, line 920)
      in call from 'Elixir.DBConnection':run/3 (lib/db_connection.ex, line 742)
      in call from 'Elixir.DBConnection':execute/4 (lib/db_connection.ex, line 636)
      in call from 'Elixir.Ecto.Adapters.Postgres.Connection':execute/4 (lib/ecto/adapters/postgres/connection.ex, line 98)
      in call from 'Elixir.Ecto.Adapters.SQL':sql_call/6 (lib/ecto/adapters/sql.ex, line 256)
      in call from 'Elixir.Ecto.Adapters.SQL':execute_or_reset/7 (lib/ecto/adapters/sql.ex, line 436)
      in call from 'Elixir.Ecto.Repo.Queryable':execute/5 (lib/ecto/repo/queryable.ex, line 133)
    ancestors: ['Elixir.Rumbl.Supervisor',<0.2092.0>]
    message_queue_len: 0
    messages: []
    links: [<0.2093.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 6772
    stack_size: 27
    reductions: 10398
  neighbours:

=SUPERVISOR REPORT==== 18-Jan-2018::09:08:23 ===
     Supervisor: {local,'Elixir.Rumbl.Supervisor'}
     Context:    child_terminated
     Reason:     {timeout,
                     {gen_server,call,
                         [<0.2215.0>,
                          {checkout,#Ref<0.697226965.3478388737.82551>,true,
                              15000},
                          5000]}}
     Offender:   [{pid,<0.2223.0>},
                  {id,'Elixir.RumblWeb.Periodically'},
                  {mfargs,{'Elixir.RumblWeb.Periodically',start_link,[]}},
                  {restart_type,permanent},
                  {shutdown,5000},
                  {child_type,worker}]

Greatly appreciate help with this.

Max


This is the expected behaviour of the supervision tree.

To avoid app crashes, though, you can adjust the :max_restarts and :max_seconds options of the parent supervisor, as described in https://hexdocs.pm/elixir/Supervisor.html#module-start_link-2-init-2-and-strategies
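
For illustration, here is a minimal sketch of passing those options to a supervisor. `Demo.Worker` is a hypothetical stand-in for the periodic worker, and the values 10 and 60 are arbitrary assumptions, not recommendations.

```elixir
defmodule Demo.Worker do
  # Hypothetical placeholder for the periodic worker from the question.
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg, name: __MODULE__)

  @impl true
  def init(arg), do: {:ok, arg}
end

# The defaults are max_restarts: 3 and max_seconds: 5. Once a child exceeds
# that restart intensity, the supervisor itself shuts down.
{:ok, sup} =
  Supervisor.start_link([Demo.Worker],
    strategy: :one_for_one,
    max_restarts: 10,
    max_seconds: 60
  )
```

With these values the supervisor tolerates up to 10 restarts within any 60 second window before giving up.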

thank you

You can do the work in a separate process, for instance using Task.Supervisor.async_nolink, or catch the exception. I personally prefer the separate process approach because it means the GenServer stays responsive while the task runs.
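
A rough sketch of the separate process approach, under some assumptions: `Demo.Periodically` and `Demo.TaskSupervisor` are made-up names, and `work_fun` stands in for the DB queries (`someFunction()` in the question). It is not a drop-in replacement for the original module.

```elixir
defmodule Demo.Periodically do
  use GenServer

  # `work_fun` is a zero-arity function standing in for the DB queries.
  def start_link(work_fun), do: GenServer.start_link(__MODULE__, work_fun)

  @impl true
  def init(work_fun) do
    schedule_work()
    {:ok, %{work_fun: work_fun, task_ref: nil}}
  end

  @impl true
  def handle_info(:work, %{task_ref: nil} = state) do
    # Run the work in an unlinked task so a failure there cannot take
    # this GenServer down with it.
    task = Task.Supervisor.async_nolink(Demo.TaskSupervisor, state.work_fun)
    schedule_work()
    {:noreply, %{state | task_ref: task.ref}}
  end

  def handle_info(:work, state) do
    # The previous run is still in flight; skip this tick.
    schedule_work()
    {:noreply, state}
  end

  def handle_info({ref, _result}, %{task_ref: ref} = state) when is_reference(ref) do
    # The task finished normally; drop the monitor and flush its :DOWN message.
    Process.demonitor(ref, [:flush])
    {:noreply, %{state | task_ref: nil}}
  end

  def handle_info({:DOWN, ref, :process, _pid, _reason}, %{task_ref: ref} = state) do
    # The task crashed (for example on a checkout timeout); wait for the next tick.
    {:noreply, %{state | task_ref: nil}}
  end

  def handle_info(_other, state), do: {:noreply, state}

  defp schedule_work, do: Process.send_after(self(), :work, 10 * 1000)
end
```

The task supervisor has to be running first, for example via `Task.Supervisor.start_link(name: Demo.TaskSupervisor)` or as a child in the application's supervision tree. The other option, catching the failure, would mean wrapping the call in `try ... catch :exit, reason -> ...`, since a `gen_server` call timeout is an exit rather than a raised exception.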

I don’t think changing max_restarts / max_seconds is a good idea. There’s no cooldown between restarts, so you’ll get a crash loop and useless logging no matter what settings you use.
