derekbrown
GraphQL Performance Issues (using Absinthe)
Hey everyone! New to the forum, not new to Elixir. ![]()
In any case, recently had a discussion with @benwilson512 on overhead during GQL requests, and then saw a related question in the Absinthe channel on Slack, so decided to open it up here for input/guidance.
Problem
We’ve been tracing our API’s performance in Datadog and have seen some oddities that we can’t quite diagnose. Pictures are worth 1000 words, so here are some flame graphs for one request that demonstrates the issue:
The overall flame graph
1st Flame: Token Auth (in our Context)
2nd Flame: GQL Resolution
What’s happening in the gaps after auth/context and after the actual resolution?
This is just one request, but many of our requests have this shape, where the majority of the request has something unknown happening for at least 50ms or so. In some cases, we have 100ms or so unaccounted for, usually on the tail-end of the request post-resolution.
Environment
- App is released via
mix releaseto AWS (EC2 machine) and deployed behind an ELB (https). - Elixir 1.9.1, Erlang 22, Absinthe v1.5.0, Absinthe Plug v1.5.0-rc2, Absinthe Phoenix 1.5.0-rc.0
- The app is behind a Phoenix Router, Phoenix Endpoint, but is an API server.
Is anyone else seeing things like this? Happy to answer any other questions to help figure this out!
Most Liked Responses
benwilson512
Awesome. Do note that once OTP 24 is released we’ve seen significant improvements from the JIT as well, so we expect the overhead to drop a further ~60% from that alone.
benwilson512
One key thing to check on is what your scalars are doing. Scalars are serialized in the result phase, which happens entirely after the resolution phase. In the past, poorly optimized serialization functions have lead to lengthy post-resolution times.
To dig in further, some additional metrics would be handy. Can you provide an example query document? How large is the result JSON in bytes? How large is the schema (approximated by MyAppWeb.Schema.__absinthe_types__ |> map_size)?
benwilson512
Hi @derekbrown I still need some sense of the size of the returned document. This will help diagnose time spent after the resolution phase constructing the actual JSON result.
While
I recommend using Telemetry to provide a few more data points, specifically the [:operation, :start] event would be very helpful because right now although it is likely that Absinthe starts work immediately after the context is set, in theory there could be other plugs or issues within Absinthe.Plug that are introducing delay. See Telemetry — absinthe v1.11.0 for details.
The document is relatively small, so processing on the document should happen pretty quickly. I’m going to take that document and create a project with a similar schema to see if we can reproduce the observed performance issues.











