Error/Logs: behavior differs between Heroku/Dokku/Local

Hi,

I have been struggling with this for a couple of days without luck. I even asked in FreeNode (elixir-lang and dokku).

I know that this is may be misleading since the problem only appears when using Dokku/Herokuish, but I would like to understand what could be the problem from the stack Elixir “point of view”.

Maybe any of you know what could be the reason for this, since there are many layers that could fail.

I start describing the behavior in case you know the solution. But if you want to reproduce it, I will upload a basic repo.

I have a basic phoenix (elixir) application, with a couple of routes (eg. navigating to /runtime-error-conn or /runtime-error-match) both produce a runtime error (stack trace, but the server is still alive - no stderr/app restart). The thing is that when I test this behavior in production both in my local computer and heroku, it shows the stack trace/error (stdout), but when I do it in dokku, the error is not shown most of the times. The same repo/commit is used in the three cases.

I show below the logs for the three cases when I do:

  1. navigating to /runtime-error-conn (or /runtime-error-match; 500 Internal Server Error) and then,
  2. navigating to /log route (200 OK).

It is a phoenix app running in prod mode.

When I go to the route with a runtime error, it returns the 500 but it does not show the error/stacktrace in stdout in dokku. It continues serving like always (you can continue navigating to /log without problem).

The problem is: why the error is logged only sometimes? I have done many tests, and in dozens of times the error is not shown, but eventually, it is. Why? Is there anything like a “max buffer” in the stacktrace/error/log system that is filled, and until it is freed i cannot get again it emitting? I am completely lost so I am just supposing things. The crazy thing is that it only happens in Dokku/Docker, not in Heroku or locally, so I don’t know if is some sort of parameter to have more memory (or whatever it is). Any clue?

Local:

$ SCHEME=http EMAIL_PASSWORD=xxx PORT=4000 SECRET_KEY_BASE=$(mix phx.gen.secret) DATABASE_URL=postgres://postgres:postgres@localhost/phoenix_elixir_prod HOST=localhost MIX_ENV=prod elixir --sname server -S mix phx.server

23:12:26.857 request_id=FgdU53tVREbtmCYAAAAF [info] GET /runtime-error-conn
runtime_error conn
23:12:26.911 request_id=FgdU53tVREbtmCYAAAAF [error] Custom error message before runtime_error conn
23:12:26.952 request_id=FgdU53tVREbtmCYAAAAF [info]
23:12:26.952 request_id=FgdU53tVREbtmCYAAAAF [info] Sent 500 in 95ms
23:12:26.952 request_id=FgdU53tVREbtmCYAAAAF [info] Converted error :undef to 500 response
23:12:26.965 [error] #PID<0.672.0> running PhoenixElixirWeb.Endpoint (connection #PID<0.671.0>, stream id 1) terminated
Server: localhost:4000 (http)
Request: GET /runtime-error-conn
** (exit) an exception was raised:
    ** (UndefinedFunctionError) function PhoenixElixirWeb.Router.Helpers.page_path/0 is undefined or private
        (phoenix_elixir 0.1.1) PhoenixElixirWeb.Router.Helpers.page_path()
        (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:57: PhoenixElixirWeb.PageController.runtime_error_conn/2
        (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:1: PhoenixElixirWeb.PageController.action/2
        (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:1: PhoenixElixirWeb.PageController.phoenix_controller_pipeline/2
        (phoenix 1.4.16) lib/phoenix/router.ex:288: Phoenix.Router.__call__/2
        (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/endpoint.ex:1: PhoenixElixirWeb.Endpoint.plug_builder_call/2
        (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/endpoint.ex:1: PhoenixElixirWeb.Endpoint.call/2
        (phoenix 1.4.16) lib/phoenix/endpoint/cowboy2_handler.ex:42: Phoenix.Endpoint.Cowboy2Handler.init/4
23:12:31.462 request_id=FgdU6I3_FG6YJvEAAACi [info] GET /log
23:12:31.485 request_id=FgdU6I3_FG6YJvEAAACi [info]
23:12:31.485 request_id=FgdU6I3_FG6YJvEAAACi [info] Sent 200 in 22ms

Heroku:

2020-04-19T21:15:11.889630+00:00 heroku[router]: at=info method=GET path="/runtime-error-conn" host=phoenix-elixir.herokuapp.com request_id=4a7b8a28-a317-4c59-8fb5-292368456515 fwd="170.253.27.9" dyno=web.1 connect=1ms service=68ms status=500 bytes=493 protocol=http
2020-04-19T21:15:11.861770+00:00 app[web.1]: 21:15:11.855 request_id=4a7b8a28-a317-4c59-8fb5-292368456515 [info] GET /runtime-error-conn
2020-04-19T21:15:11.872165+00:00 app[web.1]: runtime_error conn
2020-04-19T21:15:11.872244+00:00 app[web.1]: 21:15:11.872 request_id=4a7b8a28-a317-4c59-8fb5-292368456515 [error] Custom error message before runtime_error conn
2020-04-19T21:15:11.888086+00:00 app[web.1]: 21:15:11.887 request_id=4a7b8a28-a317-4c59-8fb5-292368456515 [info]
2020-04-19T21:15:11.888389+00:00 app[web.1]: 21:15:11.887 request_id=4a7b8a28-a317-4c59-8fb5-292368456515 [info] Sent 500 in 32ms
2020-04-19T21:15:11.888392+00:00 app[web.1]: 21:15:11.887 request_id=4a7b8a28-a317-4c59-8fb5-292368456515 [info] Converted error :undef to 500 response
2020-04-19T21:15:11.899067+00:00 app[web.1]: 21:15:11.897 [error] #PID<0.661.0> running PhoenixElixirWeb.Endpoint (connection #PID<0.660.0>, stream id 1) terminated
2020-04-19T21:15:11.899068+00:00 app[web.1]: Server: phoenix-elixir.herokuapp.com:80 (http)
2020-04-19T21:15:11.899069+00:00 app[web.1]: Request: GET /runtime-error-conn
2020-04-19T21:15:11.899069+00:00 app[web.1]: ** (exit) an exception was raised:
2020-04-19T21:15:11.899070+00:00 app[web.1]: ** (UndefinedFunctionError) function PhoenixElixirWeb.Router.Helpers.page_path/0 is undefined or private
2020-04-19T21:15:11.899070+00:00 app[web.1]: (phoenix_elixir 0.1.1) PhoenixElixirWeb.Router.Helpers.page_path()
2020-04-19T21:15:11.899077+00:00 app[web.1]: (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:57: PhoenixElixirWeb.PageController.runtime_error_conn/2
2020-04-19T21:15:11.899079+00:00 app[web.1]: (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:1: PhoenixElixirWeb.PageController.action/2
2020-04-19T21:15:11.899080+00:00 app[web.1]: (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:1: PhoenixElixirWeb.PageController.phoenix_controller_pipeline/2
2020-04-19T21:15:11.899081+00:00 app[web.1]: (phoenix 1.4.16) lib/phoenix/router.ex:288: Phoenix.Router.__call__/2
2020-04-19T21:15:11.899082+00:00 app[web.1]: (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/endpoint.ex:1: PhoenixElixirWeb.Endpoint.plug_builder_call/2
2020-04-19T21:15:11.899082+00:00 app[web.1]: (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/endpoint.ex:1: PhoenixElixirWeb.Endpoint.call/2
2020-04-19T21:15:11.899082+00:00 app[web.1]: (phoenix 1.4.16) lib/phoenix/endpoint/cowboy2_handler.ex:42: Phoenix.Endpoint.Cowboy2Handler.init/4
2020-04-19T21:15:12.149859+00:00 heroku[router]: at=info method=GET path="/favicon.ico" host=phoenix-elixir.herokuapp.com request_id=920d453f-f4b2-45fc-a5e4-55bfd9ee29c5 fwd="170.253.27.9" dyno=web.1 connect=0ms service=3ms status=200 bytes=1473 protocol=http
2020-04-19T21:15:14.074627+00:00 app[web.1]: 21:15:14.074 [info] [swarm on server@6cba8324-be5d-49a7-952f-89fadaff8f6e] [tracker:cluster_wait] joining cluster..
2020-04-19T21:15:14.074769+00:00 app[web.1]: 21:15:14.074 [info] [swarm on server@6cba8324-be5d-49a7-952f-89fadaff8f6e] [tracker:cluster_wait] no connected nodes, proceeding without sync
2020-04-19T21:15:20.072536+00:00 app[web.1]: 21:15:20.072 request_id=f368e3b1-1cfd-43aa-a431-b072d082ec51 [info] GET /log
2020-04-19T21:15:20.091615+00:00 app[web.1]: 21:15:20.091 request_id=f368e3b1-1cfd-43aa-a431-b072d082ec51 [info]
2020-04-19T21:15:20.091699+00:00 app[web.1]: 21:15:20.091 request_id=f368e3b1-1cfd-43aa-a431-b072d082ec51 [info] Sent 200 in 19ms
2020-04-19T21:15:20.091296+00:00 heroku[router]: at=info method=GET path="/log" host=phoenix-elixir.herokuapp.com request_id=f368e3b1-1cfd-43aa-a431-b072d082ec51 fwd="170.253.27.9" dyno=web.1 connect=0ms service=21ms status=200 bytes=10622 protocol=http
2020-04-19T21:15:20.255819+00:00 heroku[router]: at=info method=GET path="/css/app-e42852ff74f163354fb900e51cd3c018.css?vsn=d" host=phoenix-elixir.herokuapp.com request_id=c0ff025f-6e2f-4a5f-b9c3-38193d9143b5 fwd="170.253.27.9" dyno=web.1 connect=0ms service=2ms status=200 bytes=11594 protocol=http
2020-04-19T21:15:20.300630+00:00 heroku[router]: at=info method=GET path="/js/app-d255e0f04466ade472877808e02adefc.js?vsn=d" host=phoenix-elixir.herokuapp.com request_id=476d7d14-45c5-4fa2-99d2-9c9f4ced3881 fwd="170.253.27.9" dyno=web.1 connect=1ms service=5ms status=200 bytes=2125 protocol=http

Dokku:

And now Dokku, where the last message shown is the Converted error :undef to 500 response, until the next request is shown (/log). I have executed many times the /runtime-error-conn, and only 1 every X (5-30 times) is properly logged.

2020-04-19T22:19:48.365788055Z app[web.1]: 22:19:48.364 request_id=FgdYlHgNQ4K3uaQAAASx [error] Custom error message before runtime_error conn
2020-04-19T22:19:48.367036941Z app[web.1]: 22:19:48.366 request_id=FgdYlHgNQ4K3uaQAAASx [info]
2020-04-19T22:19:48.367688381Z app[web.1]: 22:19:48.366 request_id=FgdYlHgNQ4K3uaQAAASx [info] Sent 500 in 2ms
2020-04-19T22:19:48.367703834Z app[web.1]: 22:19:48.366 request_id=FgdYlHgNQ4K3uaQAAASx [info] Converted error :undef to 500 response
2020-04-19T22:19:49.094141761Z app[web.1]: 22:19:49.093 request_id=FgdYlKOHOVWSzv0AAATB [info] GET /runtime-error-conn
2020-04-19T22:19:49.095315252Z app[web.1]: runtime_error conn
2020-04-19T22:19:49.096036223Z app[web.1]: 22:19:49.093 request_id=FgdYlKOHOVWSzv0AAATB [error] Custom error message before runtime_error conn
2020-04-19T22:19:49.097691297Z app[web.1]: 22:19:49.096 request_id=FgdYlKOHOVWSzv0AAATB [info]
2020-04-19T22:19:49.098497482Z app[web.1]: 22:19:49.096 request_id=FgdYlKOHOVWSzv0AAATB [info] Sent 500 in 3ms
2020-04-19T22:19:49.098518328Z app[web.1]: 22:19:49.097 request_id=FgdYlKOHOVWSzv0AAATB [info] Converted error :undef to 500 response
2020-04-19T22:19:49.734157876Z app[web.1]: 22:19:49.733 request_id=FgdYlMml4_JtgxsAAATR [info] GET /runtime-error-conn
2020-04-19T22:19:49.734913826Z app[web.1]: runtime_error conn
2020-04-19T22:19:49.736221626Z app[web.1]: 22:19:49.733 request_id=FgdYlMml4_JtgxsAAATR [error] Custom error message before runtime_error conn
2020-04-19T22:19:49.737815292Z app[web.1]: 22:19:49.737 request_id=FgdYlMml4_JtgxsAAATR [info]
2020-04-19T22:19:49.738475577Z app[web.1]: 22:19:49.737 request_id=FgdYlMml4_JtgxsAAATR [info] Sent 500 in 4ms
2020-04-19T22:19:49.738491645Z app[web.1]: 22:19:49.737 request_id=FgdYlMml4_JtgxsAAATR [info] Converted error :undef to 500 response
2020-04-19T22:19:50.349229812Z app[web.1]: 22:19:50.345 request_id=FgdYlO4iv79YexIAAATh [info] GET /runtime-error-conn
2020-04-19T22:19:50.349287766Z app[web.1]: runtime_error conn
2020-04-19T22:19:50.349294655Z app[web.1]: 22:19:50.345 request_id=FgdYlO4iv79YexIAAATh [error] Custom error message before runtime_error conn
2020-04-19T22:19:50.349311000Z app[web.1]: 22:19:50.346 request_id=FgdYlO4iv79YexIAAATh [info]
2020-04-19T22:19:50.351010905Z app[web.1]: 22:19:50.346 request_id=FgdYlO4iv79YexIAAATh [info] Sent 500 in 1ms
2020-04-19T22:19:50.351026814Z app[web.1]: 22:19:50.346 request_id=FgdYlO4iv79YexIAAATh [info] Converted error :undef to 500 response
2020-04-19T22:19:50.365499744Z app[web.1]: 22:19:50.365 [error] #PID<0.645.0> running PhoenixElixirWeb.Endpoint (connection #PID<0.644.0>, stream id 1) terminated
2020-04-19T22:19:50.365534176Z app[web.1]: Server: phoenix-elixir.my-dokku.org (http)
2020-04-19T22:19:50.365539451Z app[web.1]: Request: GET /runtime-error-conn
2020-04-19T22:19:50.365543070Z app[web.1]: ** (exit) an exception was raised:
2020-04-19T22:19:50.365580905Z app[web.1]:     ** (UndefinedFunctionError) function PhoenixElixirWeb.Router.Helpers.page_path/0 is undefined or private
2020-04-19T22:19:50.365587070Z app[web.1]:         (phoenix_elixir 0.1.1) PhoenixElixirWeb.Router.Helpers.page_path()
2020-04-19T22:19:50.365591299Z app[web.1]:         (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:57: PhoenixElixirWeb.PageController.runtime_error_conn/2
2020-04-19T22:19:50.365595200Z app[web.1]:         (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:1: PhoenixElixirWeb.PageController.action/2
2020-04-19T22:19:50.365599066Z app[web.1]:         (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/controllers/page_controller.ex:1: PhoenixElixirWeb.PageController.phoenix_controller_pipeline/2
2020-04-19T22:19:50.365602907Z app[web.1]:         (phoenix 1.4.16) lib/phoenix/router.ex:288: Phoenix.Router.__call__/2
2020-04-19T22:19:50.365606616Z app[web.1]:         (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/endpoint.ex:1: PhoenixElixirWeb.Endpoint.plug_builder_call/2
2020-04-19T22:19:50.365622539Z app[web.1]:         (phoenix_elixir 0.1.1) lib/phoenix_elixir_web/endpoint.ex:1: PhoenixElixirWeb.Endpoint.call/2
2020-04-19T22:19:50.365842197Z app[web.1]:         (phoenix 1.4.16) lib/phoenix/endpoint/cowboy2_handler.ex:42: Phoenix.Endpoint.Cowboy2Handler.init/4
2020-04-19T22:20:33.772783870Z app[web.1]: 22:20:33.770 request_id=FgdYnwp88ORpQY8AAATx [info] GET /log
2020-04-19T22:20:33.776804400Z app[web.1]: 22:20:33.775 request_id=FgdYnwp88ORpQY8AAATx [info]
2020-04-19T22:20:33.776822873Z app[web.1]: 22:20:33.775 request_id=FgdYnwp88ORpQY8AAATx [info] Sent 200 in 5ms

Maybe it is something stupid that I am doing, but since dokku has many inner bits, it is out of my hands after many hours trying.

Well, yes. The logger has built in overload handling which will change it behaviour depending on amount of messages in the logger inbox:

  • normally it will use asynchronous messaging, aka the messages are sent to other process for computing
  • if there is more than :sync_threshold messages in the logger inbox (defaults to 20) then it will change to synchronous logging to force back pressure on logging processes (logging process waits until it’s message is handled)
  • if back pressure do not work and the messages queue anyway then when you reach :discard_threshold (defaults to 500) then logger will discard new messages until the messages amount will not drop below the threshold again

So if your sink (in this case stdout) is slow enough to not be able to ingest the messages quickly enough, then there exists chance that you are thresholded. However there should be additional log entry if there is discarding behaviour enabled.

1 Like

Hi @hauleth
Thanks for answering.

We have “isolated” a bit more the error, but we don’t know the solution yet. The problem is related with the cache of the web browser, since curl doesn’t show this problem.

Any request from curl or from a web browser with the cache cleaned show the expected behavior: exception in the server log (Dokku, just like in Heroku or Local). What is surprising, is that Dokku is the only one that have “this problem”. The only difference in req/res headers is that Heroku is using Cowboy instead of nginx (Dokku: the nginx is a proxy server that proxy pass to the phoenix/app container).

We don’t have anything related with the cache in nginx of Dokku.

I don’t know if this is the expected behavior and can be changed, but it doesn’t make sense to me that an user goes for the first time to / and then /runtime-error-match and the exception is not shown in the server.

With this new info, any thoughts?

==Response Headers==
HTTP/2 500 Internal Server Error
server: nginx
date: Sun, 26 Apr 2020 17:30:44 GMT
content-type: text/html; charset=utf-8
content-length: 21
cache-control: max-age=0, private, must-revalidate
cross-origin-window-policy: deny
x-content-type-options: nosniff
x-download-options: noopen
x-frame-options: SAMEORIGIN
x-permitted-cross-domain-policies: none
x-request-id: Fglu3jKRiw7hiSgAADch
x-xss-protection: 1; mode=block
strict-transport-security: max-age=15724800; includeSubdomains
X-Firefox-Spdy: h2

==Request Headers==
Host: phoenix-elixir-dokku.mydomain.org
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US;q=0.7,en
Accept-Encoding: gzip, deflate, br
Referer: https://phoenix-elixir-dokku.mydomain.org/
Connection: keep-alive
Cookie: experimentation_subject_id=Isd....9abd0f41; _phoenix_elixir_dokku_key=SFMyNTY.g3QAA...UgP0NyyXkE
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0

I add the headers in the three environments:

Dokku:

==Response Headers==
HTTP/2 500 Internal Server Error
server: nginx
date: Sun, 26 Apr 2020 17:30:44 GMT
content-type: text/html; charset=utf-8
content-length: 21
cache-control: max-age=0, private, must-revalidate
cross-origin-window-policy: deny
x-content-type-options: nosniff
x-download-options: noopen
x-frame-options: SAMEORIGIN
x-permitted-cross-domain-policies: none
x-request-id: Fglu3jKRiw7hiSgAADch
x-xss-protection: 1; mode=block
strict-transport-security: max-age=15724800; includeSubdomains
X-Firefox-Spdy: h2

==Request Headers==
Host: phoenix-elixir-dokku.mydomain.org
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US;q=0.7,en
Accept-Encoding: gzip, deflate, br
Referer: https://phoenix-elixir-dokku.mydomain.org/
Connection: keep-alive
Cookie: experimentation_subject_id=Isd....9abd0f41; _phoenix_elixirc_dokku_key=SFMyNTY.g3QAA...UgP0NyyXkE
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0

Heroku:

==Response Headers==
HTTP/1.1 500 Internal Server Error
Connection: keep-alive
Cache-Control: max-age=0, private, must-revalidate
Content-Length: 21
Content-Type: text/html; charset=utf-8
Cross-Origin-Window-Policy: deny
Date: Mon, 27 Apr 2020 11:33:08 GMT
Server: Cowboy
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Permitted-Cross-Domain-Policies: none
X-Request-Id: 0ed95f38-54f4-4b42-97ac-72bfdf1a5636
X-Xss-Protection: 1; mode=block
Via: 1.1 vegur

==Request Headers==
Host: phoenix-elixir-dokku.herokuapp.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US;q=0.7,en
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0

Local production:

==Response Headers==
HTTP/1.1 500 Internal Server Error
cache-control: max-age=0, private, must-revalidate
content-length: 21
content-type: text/html; charset=utf-8
cross-origin-window-policy: deny
date: Mon, 27 Apr 2020 11:57:18 GMT
server: Cowboy
x-content-type-options: nosniff
x-download-options: noopen
x-frame-options: SAMEORIGIN
x-permitted-cross-domain-policies: none
x-request-id: FgmrQNyyw7z2XQkAAAED
x-xss-protection: 1; mode=block

==Request Headers==
Host: localhost:4000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US;q=0.7,en
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: locale=en
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0

If I use Firefox or Chrome there are a some more headers sent (like experimental_subject_id in Firefox, or sec-fetch-user/site/mode/dest in Chrome), but the response headers don’t vary.

Also, when I have my Inspector Tools with enabled cache, the request headers when going to /runtime-error-match are in all cases (D, L, H):

  • 1st request) no Cache-Control header.
  • 2nd and next requests) ‘Cache-Control: max-age=0’

If I set Disable Cache in the web browser, the req headers are always:

  • ‘Pragma: no-cache’
  • ‘Cache-Control: no-cache’

Only when the request is not cached is when the exception is shown in the server.

I found the same error in other apps. I tried with nginx without http2 and it happens the same.
It is quite uncomfortable to swallow errors…