Hi @LostKobrakai - thank you for your support.
Actually I have breaking news!
After searching for the cause for several days in the OS, in the load balancing system, in the jails, and in the Phoenix application, I have now switched from Bandit to Cowboy.
And now this issue is gone - the app has been running totally fine for several hours.
I have been using Bandit for 6 to 8 months or so and never experienced this issue before. It only showed up now - probably an update in Bandit, or Bandit in combination with the environment, made it arise…
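For anyone who wants to try the same swap, it is essentially a one-line change in the endpoint config. This is a minimal sketch, assuming an app called `:my_app` and an endpoint module `MyAppWeb.Endpoint` (both placeholder names, not my real ones):

```elixir
# config/config.exs
import Config

# Placeholder app and endpoint names; adjust to your project.
# Before: adapter: Bandit.PhoenixAdapter
config :my_app, MyAppWeb.Endpoint,
  adapter: Phoenix.Endpoint.Cowboy2Adapter

# Cowboy also needs the plug_cowboy dependency in mix.exs, e.g.
#   {:plug_cowboy, "~> 2.7"}
```

After changing the adapter, run `mix deps.get` and restart the app.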
I am definitely happy to have narrowed down the cause and to have a means of mitigation.
Next I have to go through the Bandit changelog to see what changed and might be causing this issue.
Increasing the acceptor pool size might fix it - I will check - but there was barely any traffic on that app, and the unresponsiveness always started pretty much exactly after 60 minutes…
And wouldn’t Bandit log something when it has an issue and starts dropping connections?
It all happened extremely silently… that’s why it was hard to debug.
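If someone wants to test the acceptor pool theory or get more output from Bandit, something like the following might be a starting point. It is only a sketch: `:my_app`, `MyAppWeb.Endpoint`, the port, and the value 200 are placeholders, and the option names should be double-checked against the docs of the Bandit version you have installed.

```elixir
# config/runtime.exs
import Config

# Placeholder app/endpoint names and port; the numbers are arbitrary examples.
config :my_app, MyAppWeb.Endpoint,
  adapter: Bandit.PhoenixAdapter,
  http: [
    port: 4000,
    # Passed through to Thousand Island, the socket server under Bandit.
    thousand_island_options: [num_acceptors: 200],
    # Ask Bandit for more detail on protocol-level errors
    # (assumption: this option is available in your Bandit version).
    http_options: [log_protocol_errors: :verbose]
  ]
```

I have not verified that either setting changes the behaviour described here.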
UPDATE
- I checked whether it might be a newly introduced bug in Bandit and downgraded to the last version I had run without trouble - 1.5.7
- But this also fails with the updated environment and HAProxy 3.0
- So I assume it’s a bug that affects all versions of Bandit in combination with HAProxy 3.0
I will file a bug report in the Bandit repo. Maybe this is helpful for the maintainer.