Investigating 502s from load testing my Phoenix app on Dokku

Finally deployed my Phoenix app (with SQLite3) on Dokku. First things first, I did a lil wrk test from my laptop.

Wrk command with results and server info

The server is an “RS 2000 G12” from netcup. The endpoint serves an inertia-phoenix-powered, controller-scaffolded login page. Default Dokku setup on Debian. No Nginx or kernel tuning applied.

$ wrk -t12 -c400 -d30s https://example.com/users/log-in
Running 30s test @ https://example.com/users/log-in
  12 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   127.67ms  140.81ms   1.15s    89.49%
    Req/Sec    98.90     81.46   535.00     82.60%
  33595 requests in 30.07s, 64.23MB read
  Non-2xx or 3xx responses: 5365
Requests/sec:   1117.32
Transfer/sec:      2.14MB

$ wrk -t12 -c400 -d30s https://example.com/users/log-in
Running 30s test @ https://example.com/users/log-in
  12 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   202.20ms  287.43ms   1.96s    89.15%
    Req/Sec    96.50     92.09   640.00     84.52%
  32998 requests in 30.10s, 63.95MB read
  Non-2xx or 3xx responses: 4778
Requests/sec:   1096.33
Transfer/sec:      2.12MB

$ wrk -t12 -c400 -d30s https://example.com/users/log-in
Running 30s test @ https://example.com/users/log-in
  12 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   168.45ms  200.40ms   1.57s    86.71%
    Req/Sec   100.09     86.62   650.00     83.00%
  34932 requests in 30.10s, 64.79MB read
  Non-2xx or 3xx responses: 6704
Requests/sec:   1160.69
Transfer/sec:      2.15MB
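To put the "Non-2xx or 3xx responses" lines in proportion, here is a small sketch that turns the three runs above into failure shares. The numbers are copied from the wrk output; the function name is only illustrative.

```python
# (total requests, non-2xx/3xx responses) copied from the three wrk runs above
WRK_RUNS = [(33595, 5365), (32998, 4778), (34932, 6704)]


def error_share(total, failed):
    """Fraction of responses that were not 2xx or 3xx."""
    return failed / total


for total, failed in WRK_RUNS:
    print(f"{failed} of {total} responses failed ({error_share(total, failed):.1%})")
```

Roughly one response in six failed in every run, so the 502s are a steady share of the traffic rather than a handful of outliers.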

During the test, the server’s CPU usage went up to 50-70%, and the process using the most CPU was either the BEAM container or Nginx. In Nginx, I have an SSL certificate to encrypt traffic from Cloudflare to the server.

After wrk had been running for a while, the server started responding with 502 errors, and I’m really curious why. I thought it would keep fighting until the CPU reached 100%.

Even after wrk finishes, it takes about 5-15 seconds for the server to go from 502s back to 200s.

Checking Dokku and Docker logs shows lots of similar request logs, which makes it hard to spot any errors:

09:52:34.997 request_id=GIUJwWJwbI8MW6cAAwso [info] GET /users/log-in

Is there anything I could do to find the reasons behind those 502s? My first guess is adding Sentry or the like to see if there are any error logs besides the request logs. Will post an update.

Are you hitting Cloudflare in your tests or bypassing it and hitting your server directly?

For what a 502 typically means, see “Error 502 or 504” in the Cloudflare Support docs.

You also need to get familiar with your nginx configuration: see “Nginx Proxy” in the Dokku documentation.

You need to consider all hops in your setup before reaching the BEAM.

Sentry will not help you in this particular case (i.e. networking and errors before your application sees a request).


Thank you! I set logger level to :error (as opposed to using Sentry), and indeed, there were no errors from Phoenix.

Just learned about the dokku nginx:error-logs command:

$ dokku nginx:error-logs myapp
2025/12/27 13:12:40 [crit] 789557#789557: *208719 connect() to 172.17.0.3:5000 failed (99: Cannot assign requested address) while connecting to upstream, client: 172.70.163.108, server: myapp.exampleserverhost.com, request: "GET /dashboard HTTP/2.0", upstream: "http://172.17.0.3:5000/dashboard", host: "myapp.com"
2025/12/27 13:12:41 [crit] 789562#789562: *206810 connect() to 172.17.0.3:5000 failed (99: Cannot assign requested address) while connecting to upstream, client: 172.70.163.108, server: myapp.exampleserverhost.com, request: "GET /dashboard HTTP/2.0", upstream: "http://172.17.0.3:5000/dashboard", host: "myapp.com"
2025/12/27 13:12:42 [crit] 789562#789562: *206810 connect() to 172.17.0.3:5000 failed (99: Cannot assign requested address) while connecting to upstream, client: 172.70.163.108, server: myapp.exampleserverhost.com, request: "GET /dashboard HTTP/2.0", upstream: "http://172.17.0.3:5000/dashboard", host: "myapp.com"
2025/12/27 13:12:42 [crit] 789558#789558: *178015 connect() to 172.17.0.3:5000 failed (99: Cannot assign requested address) while connecting to upstream, client: 172.70.163.109, server: myapp.exampleserverhost.com, request: "GET /dashboard HTTP/2.0", upstream: "http://172.17.0.3:5000/dashboard", host: "myapp.com"
2025/12/27 13:12:43 [crit] 789557#789557: *206809 connect() to 172.17.0.3:5000 failed (99: Cannot assign requested address) while connecting to upstream, client: 172.70.163.109, server: myapp.exampleserverhost.com, request: "GET /dashboard HTTP/2.0", upstream: "http://172.17.0.3:5000/dashboard", host: "myapp.com"
2025/12/27 13:12:44 [crit] 789558#789558: *178015 connect() to 172.17.0.3:5000 failed (99: Cannot assign requested address) while connecting to upstream, client: 172.70.163.109, server: myapp.exampleserverhost.com, request: "GET /dashboard HTTP/2.0", upstream: "http://172.17.0.3:5000/dashboard", host: "myapp.com"

So “99: Cannot assign requested address” is what actually happens underneath.
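Errno 99 on Linux is EADDRNOTAVAIL, and on an outgoing connect() it usually means the kernel has no free local port left for that destination. The sketch below is back-of-the-envelope arithmetic, not a diagnosis. It assumes the Linux default ephemeral port range (32768-60999), a 60-second TIME_WAIT, and that nginx opens a fresh upstream connection per request whose closed socket lingers on the nginx side. None of these values come from the thread, so check them on the server.

```python
# Assumed Linux defaults (not taken from the thread); check your own with
#   cat /proc/sys/net/ipv4/ip_local_port_range
PORT_LOW, PORT_HIGH = 32768, 60999
TIME_WAIT_SECONDS = 60  # assumed time a closed socket keeps its port reserved

# (total requests, non-2xx/3xx responses) from the three wrk runs in the first post
WRK_RUNS = [(33595, 5365), (32998, 4778), (34932, 6704)]


def ephemeral_ports(low=PORT_LOW, high=PORT_HIGH):
    """Number of local ports available for connections to one upstream ip:port."""
    return high - low + 1


def sustainable_new_connections_per_second(ports, time_wait_s=TIME_WAIT_SECONDS):
    """Rate of brand-new connections that never runs out of ports."""
    return ports / time_wait_s


ports = ephemeral_ports()
rate = sustainable_new_connections_per_second(ports)
print(f"ephemeral ports: {ports}")
print(f"sustainable rate: ~{rate:.0f} new connections/s")
for total, failed in WRK_RUNS:
    print(f"successful requests in run: {total - failed}")
```

Under these assumptions the sustainable rate is well below the ~1100 requests/sec that wrk reported, and the number of successful requests in each run lands within a few of the size of the port range. That is consistent with port exhaustion between nginx and the container, though it is not proof of it.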

So the bad gateway error is coming from nginx on my own server, which sits behind Cloudflare.

Looks like it’s about time to explore Nginx tuning. I’ll try a few things (such as adding keepalive to the upstream section) and get back with an answer 😄
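Why keepalive should help can be shown without nginx at all. The following is a self-contained sketch using only the Python standard library: a local HTTP/1.1 server records the client source port of every request, and a client sends the same number of requests either over one persistent connection or over a new connection each time. It is a stand-in for the nginx-to-container hop, not a reproduction of it, and all names are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_ports = []  # client source ports observed by the server


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow persistent (keepalive) connections

    def do_GET(self):
        seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_source_ports(reuse, num_requests=20):
    """Send num_requests GETs and return how many distinct source ports were used."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    seen_ports.clear()
    conn = None
    for _ in range(num_requests):
        if conn is None or not reuse:
            if conn is not None:
                conn.close()  # without reuse, every request gets a new socket
            conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/")
        conn.getresponse().read()
    conn.close()
    server.shutdown()
    server.server_close()
    return len(set(seen_ports))


print("with keepalive:   ", count_source_ports(reuse=True), "source port(s)")
print("without keepalive:", count_source_ports(reuse=False), "source port(s)")
```

With a reused connection all requests share one source port, while the connection-per-request client burns through a new port almost every time. An upstream keepalive pool in nginx aims for the first behaviour.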

(Load) testing can be fun and uncover a lot of interesting lessons!

One tip: read what Gil Tene wrote on the topic and watch several talks on YouTube (in particular about coordinated omission).
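For anyone who has not met the term, here is a toy simulation of coordinated omission. It models one connection that is supposed to send a request every 10 ms but, like most closed-loop load generators, only sends the next request after the previous one returns. A single one-second server stall is injected. All parameters are invented for illustration and have nothing to do with the measurements in this thread.

```python
def simulate(num_requests=200, interval_ms=10, service_ms=1,
             stall_at=100, stall_ms=1000):
    """Return (naive, corrected) latency lists in milliseconds."""
    naive, corrected = [], []
    clock = 0  # time at which the connection becomes free again
    for i in range(num_requests):
        intended = i * interval_ms        # when the request should have been sent
        start = max(clock, intended)      # closed loop: cannot send while busy
        service = service_ms + (stall_ms if i == stall_at else 0)
        finish = start + service
        naive.append(finish - start)         # what a closed-loop tool records
        corrected.append(finish - intended)  # measured from the intended send time
        clock = finish
    return naive, corrected


def percentile(values, p):
    """Simple nearest-rank style percentile for an integer p in 0..100."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, p * len(ordered) // 100)
    return ordered[index]


naive, corrected = simulate()
for p in (50, 99):
    print(f"p{p}: naive={percentile(naive, p)} ms, "
          f"corrected={percentile(corrected, p)} ms")
```

The naive numbers hide the stall almost completely, because the requests that should have been sent during it were never measured. Measuring from the intended send time shows the delay a real user would have seen. Gil Tene's wrk2 is a variant of wrk built around this correction.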

At some point you also need to consider what your goals are with the load testing, and when to stop.

Do you expect to serve 400 concurrent users with your current setup?

Hitting your controllers is only one part of the test. Consider also WebSocket persistent connections (there will be client-Cloudflare, Cloudflare-nginx, and nginx-BEAM connections open). Consider database writes according to your expectations.

Have fun and learn a bunch!
