adammokan
Odd slowdowns with concurrent HTTPS requests / HTTP Client Concurrency
I have a system that is making a lot of HTTPS requests. I’m using HTTPoison (which has Hackney under the hood). All is well when I run things real slow, like 500 requests a minute.
However, when I start ramping up my work pools to make more transactions and increase concurrency, I notice a big jump in turnaround time, measured from my pre-request logic until I get a response. I’m not aware of a way to get the actual request/response timing, so I just record the time right before the request and right after. That goes from 4-5 seconds on average to the high 30s or more, and sometimes I end up with spikes of 150-250 seconds. What I’m unsure of is whether the VM itself is being bottlenecked in some way or whether it is something in the OS (Ubuntu 14.04).
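For reference, the before/after timing described above can be sketched with `:timer.tc/1` from the Erlang standard library. The `RequestTimer` module and the sleeping stand-in function are hypothetical names used only for illustration; in the real system the function passed in would wrap the HTTPoison call. Note that a wall-clock measurement like this presumably also includes any time spent waiting for a pooled connection and doing the TLS handshake, not just the remote server's response time.

```elixir
defmodule RequestTimer do
  # Hypothetical helper: runs `fun` and returns {elapsed_ms, result}.
  def measure(fun) when is_function(fun, 0) do
    # :timer.tc/1 returns {elapsed_microseconds, return_value_of_fun}.
    {micros, result} = :timer.tc(fun)
    {div(micros, 1000), result}
  end
end

# Stand-in for the real request, e.g. fn -> HTTPoison.get(url) end
fake_request = fn ->
  Process.sleep(50)
  :ok
end

{elapsed_ms, result} = RequestTimer.measure(fake_request)
IO.puts("request took #{elapsed_ms} ms, result: #{inspect(result)}")
```

Logging the elapsed time per request (rather than only an average) makes it easier to see whether the slow requests cluster together.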
What I also notice is that when I run into a chunk of slow requests, it seems to ‘cascade’ and affect the handling/response time of subsequent requests for a while. I could understand if I were seeing CPU or memory spikes, but I’m not. The VM queues look good as well when watching in Observer.
My code is making the requests in a pool to limit the concurrency. I also know hackney is pooling connections. I’ve used both the default hackney pool and my own hackney pool with a large enough capacity, but I’m tempted to disable the hackney connection pooling altogether. I have added some sleep logic to avoid starting too many requests simultaneously, but that hasn’t helped a whole lot either.
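A minimal sketch of the custom-pool setup described above, assuming hackney and HTTPoison are already dependencies of the project. The `ApiPool` module, the pool name `:api_pool`, and the numbers are placeholders, not values from the original system. `:hackney_pool.get_stats/1` is included because, as far as I know, it reports in-use, free, and queued counts for a pool, which may help show whether requests are waiting on the pool rather than on the network.

```elixir
defmodule ApiPool do
  # Hypothetical pool name; pick whatever fits the application.
  @pool :api_pool

  # Starts a named hackney pool. `timeout` is how long an idle connection
  # is kept alive (ms) and `max_connections` caps the pool size.
  # Both values here are illustrative assumptions.
  def start do
    :hackney_pool.start_pool(@pool, timeout: 15_000, max_connections: 200)
  end

  # Options to pass as the third argument of HTTPoison.get/3 so the
  # request is routed through the named pool.
  def request_opts, do: [hackney: [pool: @pool]]

  # Snapshot of the pool's current state, useful for periodic logging.
  def stats, do: :hackney_pool.get_stats(@pool)
end

# Usage (requires hackney and HTTPoison to be available):
#   ApiPool.start()
#   HTTPoison.get("https://example.com/", [], ApiPool.request_opts())
#   IO.inspect(ApiPool.stats())
```

If the queued count climbs during the slow periods, the pool is a likely bottleneck; if it stays at zero, the delay is more likely in the handshake or on the remote side.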
If anyone could give me some ideas for monitoring these connections better or figuring out the long delays, I’m all ears.
Things I have done so far:
- increased ulimits on Ubuntu
- increased +Q and +A settings in the Erlang vm.args
- watched the tls_connection:init calls in Observer (they tend to jump up in memory usage when I hit these slowness spikes)
- split connections across multiple hackney pools
Server Specs:
- 8 cores
- 16 GB RAM
I’m hardly moving the CPU at all, and RAM usage is normally below 1 GB on the VM side.
Most Liked
stocks29
I had a similar experience and the solution was to force hackney to use the default pool, which I believe allowed my app to reuse connections and not have to redo the handshake for each request. I documented it here: http://coderstocks.blogspot.com/2016/01/sqs-throughput-over-https-with-elixir.html
Hopefully that is helpful for you.
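As a rough illustration of that suggestion (not the code from the linked post), the pool is selected through the `hackney` option that HTTPoison passes down to hackney. The `DefaultPoolClient` module name is hypothetical, and the sketch assumes HTTPoison is a project dependency.

```elixir
defmodule DefaultPoolClient do
  # Route every request through hackney's default pool so that
  # connections (and their TLS sessions) can be reused.
  @opts [hackney: [pool: :default]]

  def options, do: @opts

  # Thin wrapper around HTTPoison.get/3 with the pool option applied.
  def get(url, headers \\ []) do
    HTTPoison.get(url, headers, @opts)
  end
end

# Usage (requires HTTPoison):
#   DefaultPoolClient.get("https://example.com/")
```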
OvermindDL1
If you are running Ubuntu in a virtual machine then the VM could definitely have some overhead depending on how its network is configured. But still, I’d say replicate the case in python/ruby/whatever and see if you have the same performance characteristics. That would definitely rule Elixir in or out as the cause. Curious as to your results.
OvermindDL1
And as the first thing to check: are you certain it is not the remote server? Have you tried the same thing with python or something as a test to verify?