Understanding slow requests with telemetry event timeline

CTD · January 21, 2021, 11:06pm

We’re seeing the same thing, or at least something similar.
We’ll return thousands of a certain POST request daily in 200-500ms, and then one will take 30s.
There are others that take longer (1-2s), but those seem to correspond with payload size whereas the 30s requests do not.

We’ve instrumented all of our plug calls for AppSignal, and it’s consistently happening at Plug.Parsers.
This is not the first plug in our endpoint.

Our incoming payloads contain a base64 image string. We wondered if there may be an issue with parsing it (w/ jason), but again, most make it through just fine (including some that are much larger than those in the the 30s requests). Just mentioning this in case our requests have this in common.

We’re also on Heroku. Metrics don’t seem to indicate any issues with memory or dyno load before or after those long requests. And when the dyno survives the long response, it goes on to serve more of the same type of request (and others) right away with no trouble.

Any feedback @jmurphyweb or others?