I’ve had a pretty simple Phoenix app running in production on a DigitalOcean droplet for a few months, and it has just started crashing periodically. I haven’t been able to determine the cause from the production logs or the erl_crash.dump files. Everything seems fine (200s and 302s) until I see “error: run_erl[14863]: Erlang closed the connection.” in the systemd journal, and then nginx starts returning 502s. No errors show up in erlang.log.x, and DigitalOcean’s resource graph isn’t showing abnormal memory usage.
I’ve got a hunch that one of our users is causing the crash, and I’d like to see if I can get her to recreate it, but first I want better tools in place to figure out why it’s crashing. Any suggestions?
Sorry I can’t be more helpful; I am rather new to the topic myself and only recently started gathering some educational material. But I think telemetry and logging are going to be your best bet, unless somebody else has had that exact problem and chimes in with a solution.
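To make the logging idea a bit more concrete, here is a minimal sketch of a process that writes a few VM-wide numbers to the log every few seconds, so the last entries before a crash show whether memory, process count, or atom count was climbing. The module name `VmVitals` and the `:interval_ms` option are made up for illustration, and this is only a guess at what might be useful to capture; it is not a diagnosis of this particular crash.

```elixir
defmodule VmVitals do
  @moduledoc """
  Hypothetical helper: periodically logs basic Erlang VM statistics.
  """
  use GenServer
  require Logger

  # Default logging interval (an assumption; tune as needed).
  @interval_ms 10_000

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Collects a few cheap, VM-wide numbers using only built-in functions.
  def snapshot do
    %{
      total_memory_mb: div(:erlang.memory(:total), 1_048_576),
      process_count: :erlang.system_info(:process_count),
      process_limit: :erlang.system_info(:process_limit),
      atom_count: :erlang.system_info(:atom_count),
      atom_limit: :erlang.system_info(:atom_limit)
    }
  end

  @impl true
  def init(opts) do
    interval = Keyword.get(opts, :interval_ms, @interval_ms)
    schedule(interval)
    {:ok, interval}
  end

  @impl true
  def handle_info(:tick, interval) do
    # One log line per tick; the last lines before a crash are the useful ones.
    Logger.info("vm vitals: #{inspect(snapshot())}")
    schedule(interval)
    {:noreply, interval}
  end

  defp schedule(interval), do: Process.send_after(self(), :tick, interval)
end
```

Assuming a standard supervision tree, adding `VmVitals` to the `children` list in the application module would start it alongside the endpoint. A short interval may catch spikes that a hosting provider's coarser resource graph could smooth over.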