How do you track/report BEAM crashes?

Neither Sentry nor AppSignal can track BEAM crashes, because they run inside the BEAM and go down with it. The app restarts automatically, so external monitors don’t catch anything either.

In my case an Oban job was crashing the BEAM because it was running out of memory. I only noticed because the job was stuck in the executing state.

Is there any way to track and report BEAM crashes and their causes, similar to how we track exceptions?

I’d appreciate any advice.

1 Like

Traditionally this is handled by integrating with whatever is responsible for running the BEAM itself. On a traditional VM that might be systemd; in a more Docker-oriented deployment it would be something like Kubernetes.

5 Likes

Not exactly what you asked, but I tell the BEAM where to write crash dumps via the ERL_CRASH_DUMP environment variable. A separate container mounts and watches that location and uploads the crash dumps to storage so I can easily inspect them.

This isn’t a notification, but it may still be helpful for you.

In any case, setting up a log sink with notifications on keywords like “crash dump written” may get you what you’re looking for.
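For what it's worth, here is a rough sketch of what the watcher side could do once a dump shows up. It assumes the dump is a text file whose header contains a line starting with "Slogan: " (the crash reason), as described in the Erlang crash dump documentation. The function names and the "crash-archive" directory are made up for illustration, and the actual upload or notification step is left out.

```python
import os
import shutil
from pathlib import Path
from typing import Optional

SLOGAN_PREFIX = "Slogan: "


def read_crash_slogan(dump_path: Path, max_lines: int = 20) -> Optional[str]:
    """Return the crash reason from the dump header, or None if not found.

    Only the first `max_lines` lines are read, since dumps can be very large.
    """
    with dump_path.open("r", encoding="utf-8", errors="replace") as f:
        for _, line in zip(range(max_lines), f):
            if line.startswith(SLOGAN_PREFIX):
                return line[len(SLOGAN_PREFIX):].strip()
    return None


def collect_crash_report(dump_path: Path, archive_dir: Path) -> Optional[dict]:
    """If a dump exists, extract its slogan and move it out of the way.

    Moving the file means the next crash is not confused with this one.
    Returns None when there is no dump to report.
    """
    if not dump_path.is_file():
        return None
    slogan = read_crash_slogan(dump_path)
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Suffix with the file's modification time to keep archived dumps apart.
    archived = archive_dir / f"{dump_path.name}.{int(dump_path.stat().st_mtime)}"
    shutil.move(str(dump_path), str(archived))
    return {"slogan": slogan, "archived_to": str(archived)}


if __name__ == "__main__":
    # ERL_CRASH_DUMP is the same variable the BEAM reads; the fallback is
    # the default file name the BEAM uses in its working directory.
    dump = Path(os.environ.get("ERL_CRASH_DUMP", "erl_crash.dump"))
    report = collect_crash_report(dump, Path("crash-archive"))  # hypothetical dir
    if report:
        print(report)  # replace with an upload or alert of your choice
```

The slogan is usually enough to tell an out-of-memory crash from other causes without opening the whole dump. Note that this only helps when the BEAM actually gets to write a dump.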

5 Likes

Would it make sense to build the notification into the Elixir app itself? On start, the app would check for a crash dump, report it, and then delete it.

What if the cause of the crash also prevents the app from restarting?

Then the whole instance would be rebuilt from scratch if it doesn’t respond to health checks for some time.

What I was facing were silent restarts due to running out of memory when processing large files. Neither exception monitoring nor health checks were noticing those restarts.

Besides, if the check runs on the first line of Application.start, wouldn’t it run before anything that could potentially crash the app?

I think reporting it from systemd would increase complexity and make it harder to debug. Maybe I’m wrong.

I’ve had apps fail in {apply,{application,start_boot,[stdlib,permanent]}} and {application,start_boot,[kernel,permanent]}} when trying to use custom epmd modules. Just FYI. I don’t think normal apps would run into similar issues, but it is at least possible.

2 Likes