Analysis of Connection Pool Bottlenecks in Elixir/PostgreSQL App on Heroku

Hello everyone,

I’ve been conducting load testing on a Phoenix/PostgreSQL application deployed on Heroku, using Locust as our testing framework. Our primary focus has been analyzing the throughput capacity of a specific transaction under high concurrent load.

During initial testing, the bottleneck manifested as:

** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 136ms. (...)

Initially, this appeared to be a straightforward connection pool exhaustion issue. My first approach was to increase the connection pool size, but this didn’t yield the expected capacity improvements. (While I considered adjusting queue_target and queue_interval parameters, this seemed to address the symptom rather than the root cause.)
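
For context, these are the knobs I was adjusting; a minimal sketch of the Repo config (app/module names and values are placeholders for our actual setup, not a recommendation):

```elixir
# config/runtime.exs — illustrative values only
config :my_app, MyApp.Repo,
  url: System.get_env("DATABASE_URL"),
  pool_size: String.to_integer(System.get_env("POOL_SIZE") || "10"),
  # queue_target/queue_interval control how long a checkout may wait
  # before DBConnection starts dropping requests with the error above
  queue_target: 50,
  queue_interval: 1_000
```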

Further investigation steps included:

  1. Query optimization (achieved reasonable improvements)
  2. Isolated database load testing (confirmed queries weren’t significantly degrading under stress)
  3. Connection pool monitoring via :telemetry (revealed increasing queue_time, while query_time and decode_time remained stable)
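
The telemetry handler I used for step 3 was roughly along these lines (the event name assumes the default `MyApp.Repo` naming; adjust for your repo):

```elixir
# Attach once at application start to log per-query timings.
:telemetry.attach(
  "log-repo-queue-time",
  [:my_app, :repo, :query],
  fn _event, measurements, _metadata, _config ->
    # Measurements arrive in native time units and may be nil.
    to_ms = fn
      nil -> nil
      native -> System.convert_time_unit(native, :native, :millisecond)
    end

    IO.inspect(
      %{
        queue_time_ms: to_ms.(measurements[:queue_time]),
        query_time_ms: to_ms.(measurements[:query_time]),
        decode_time_ms: to_ms.(measurements[:decode_time])
      },
      label: "repo query timings"
    )
  end,
  nil
)
```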

The breakthrough came when experimenting with different Heroku dyno configurations. Surprisingly, deploying an increasing number of small dynos with minimal connection pools (only 2 connections each) significantly improved throughput.

My working hypothesis is that the bottleneck stems from actual parallelism limitations in Heroku dynos due to the small number of cores, despite the Elixir VM’s high concurrency. This would explain why database queries were completing quickly, yet Elixir processes couldn’t run in parallel fast enough to process the results and release connections back to the pool.
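
If it helps the discussion, the parallelism actually available to the BEAM on a dyno can be sanity-checked from a remote console with something like this (just a sanity check, not proof of the hypothesis):

```elixir
# How many scheduler threads the BEAM is running, and how many
# logical processors it can see on the dyno.
IO.inspect(System.schedulers_online(), label: "schedulers online")
IO.inspect(:erlang.system_info(:logical_processors_available), label: "logical processors")
```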

I’d appreciate your thoughts on:

  1. Does this analysis sound reasonable to you?
  2. Are there alternative explanations I should consider?
  3. Any suggestions for additional testing approaches?

(While exploring alternative infrastructure options is on our roadmap, I’m keen to fully understand the current setup’s behavior first.)

Thank you for your insights.
