mattei

Debugging obscure process crashes?

Hi all, big Elixir fan (and newbie!) dropping by to ask something that has been puzzling me for a bit.

I’m building an app that relies on Oban for job processing. However, some jobs are being killed by the BEAM or crashing with obscure errors.

These are the strange behaviors:

  1. An Oban job gets killed. The job is simple: it only makes an HTTPoison request.

  2. Requests/responses seem to stall. HTTP requests take too long or never finish, although the actual elapsed time isn’t reflected in the response time logged in the terminal.

The request stalls early in the plug pipeline, then resumes after a few seconds.

When it happens (all local, not production):

  1. High request rate: sending lots of requests (Oban job inserts) from Postman to the API endpoint
  2. Suspected high memory pressure – although I doubt it’s OOM, because I’ve looked at activity monitor and sometimes it happens, even in the green.
  3. Randomly – sometimes I’m only sending a one off request

Here are some suspicions:

  1. Too many queries being sent, Postgres stalling.
  2. Out of memory/low resource behavior, but I thought BEAM handled this better.
  3. Oban job taking too long, although it’d be a TimeoutError, not a Killed.
  4. Infinite recursion somewhere, although I feel that’d also be a TimeoutError from Oban.
  5. HTTPoison bug, the process gets killed.
  6. Memory leak.

I have no leads other than these error messages and strange behavior.

Where would I start to debug this problem? How would I prove any of these theories? I’m new to the BEAM and it’s very different from a traditional language.

Marked As Solved

benwilson512

Author of Craft GraphQL APIs in Elixir with Absinthe

No, the whole VM is unloading the code that was there before, and then loading the new compiled code. The best thing to do is just let Oban retry the job after the crash.

Also Liked

lud

Your process receives an :EXIT message from another process that it is linked to.

Do you start a process from your job process with a SomeModule.start_link or spawn_link call? The process with pid <0.13246.0> was killed and as it was linked to your own process, your process exited with that same reason.
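The link behavior described above can be reproduced in plain Elixir, without Oban. This is only a minimal sketch: `LinkDemo`, the "job" process, and the "helper" process are hypothetical stand-ins, not code from the original app. It shows that when a linked process is killed, a process that does not trap exits terminates with the reason `:killed`.

```elixir
# Minimal sketch (plain Elixir, no Oban). All names here are hypothetical.
defmodule LinkDemo do
  def run do
    parent = self()

    # "job" stands in for the Oban job process. We monitor it so that we
    # can observe its exit reason from the outside.
    {job, ref} =
      spawn_monitor(fn ->
        # The helper is linked to the job, as it would be after a
        # start_link or spawn_link call inside the job.
        helper = spawn_link(fn -> Process.sleep(:infinity) end)
        send(parent, {:helper, helper})
        Process.sleep(:infinity)
      end)

    helper =
      receive do
        {:helper, pid} -> pid
      end

    # Kill the helper. The job does not trap exits, so the exit signal
    # propagated over the link terminates the job as well.
    Process.exit(helper, :kill)

    receive do
      {:DOWN, ^ref, :process, ^job, reason} -> reason
    after
      1_000 -> :timeout
    end
  end
end

IO.inspect(LinkDemo.run())
# prints :killed
```

If the job called `Process.flag(:trap_exit, true)` first, it would instead receive an `{:EXIT, pid, :killed}` message in its mailbox and keep running.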

You may create a test and directly call your Job.perform function repeatedly, without starting oban. This will be easier to get a proper stacktrace.

emoragaf

Normally you would wait for a connection to become available, but when connections are not released back into the pool, you basically lose capacity silently until things blow up.

The worst thing is that since it’s very easy to just leave the :default pool, you can have totally unrelated parts of your app (that also use the default pool) causing this issue.

Your process receives an :EXIT message from another process that is linked to the former.

This kind of thing could indeed be the root cause of pool starvation, to the point where your jobs wait forever for a connection that is never going to be put back into the pool.

benwilson512

Hey @mattei as a couple of notes, please always copy and paste text instead of using screenshots. Those screenshots are barely visible on my screen due to resolution / saturation.

Secondly, when you say “hot reloads” are you referring to development code reloading, or are you using OTP hot reloading in production?

benwilson512

Gotcha. Yeah, with development reloads, crashes of background processes are pretty normal: code is loaded and unloaded without regard for whether there are live processes using that code. If that’s the only time this is happening, I wouldn’t worry about it.

emoragaf

I’m throwing a dart in the dark here, but I’ve been bitten by HTTPoison/hackney pools before, where if the request crashes it doesn’t release the connection, starving the pool in a short time.

Try setting pool: false and see if that helps maybe?
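For reference, a rough sketch of what that could look like for a single request. It assumes HTTPoison is available (installed here with `Mix.install` so the snippet can run as a script); the URL and the 5 second timeout are placeholders, not values from the original app.

```elixir
# Sketch only: in a Mix project, HTTPoison would be listed in deps instead.
Mix.install([:httpoison])

# Placeholder URL, purely for illustration.
url = "https://example.com/"

# `hackney: [pool: false]` asks hackney not to use a connection pool for
# this request, so nothing is checked out of the shared :default pool.
result =
  case HTTPoison.get(url, [], hackney: [pool: false], recv_timeout: 5_000) do
    {:ok, %HTTPoison.Response{status_code: status}} -> {:ok, status}
    {:error, %HTTPoison.Error{reason: reason}} -> {:error, reason}
  end

IO.inspect(result)
```

If the stalls disappear with the pool disabled, that would point toward pool starvation rather than Postgres or memory pressure, though it does not prove it.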
