Cowboy1 and cowboy2 performance in a synthetic benchmark

I wanted to see how much overhead plug actually adds to cowboy, so I tried adding cowboy to the https://github.com/tbrand/which_is_the_fastest benchmark. It didn’t go as expected: cowboy 2.0-pre turned out to be much slower than plug there. I then tried cowboy 1.1, which plug and phoenix use, but it was still a bit slower. Could somebody with more cowboy experience look at the code and suggest ways to improve it?

my fork of the benchmark for

I think I’m using the same ranch options as plug (https://github.com/elixir-lang/plug/blob/master/lib/plug/adapters/cowboy.ex) and splitting the path the same way (https://github.com/elixir-lang/plug/blob/master/lib/plug/adapters/cowboy/conn.ex#L92-L95), but something’s off.

Also, here’s the PR thread with some additional details: https://github.com/tbrand/which_is_the_fastest/pull/58

Did you set `max_keepalive` to 5_000_000? It's an important option in these keep-alive, do-nothing tests. See http://theerlangelist.com/article/phoenix_latency

A quick update: https://github.com/tbrand/which_is_the_fastest/pull/58#issuecomment-305425183

I think I’ve tried that and it didn’t change anything, but I’ll try again. Thank you.

No significant difference in req/s, but it might have improved latencies somewhat:

# cowboy1 with `keep_alive: 5_000_000`
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.45ms    3.65ms 128.25ms   96.48%
    Req/Sec     1.01k   218.63     2.09k    72.75%
  1802528 requests in 1.00m, 156.43MB read
Requests/sec:  30003.77
Transfer/sec:      2.60MB
# without
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.71ms   11.18ms 202.02ms   97.33%
    Req/Sec     0.98k   235.23     1.84k    77.06%
  1756141 requests in 1.00m, 152.72MB read
Requests/sec:  29232.42
Transfer/sec:      2.54MB
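
To put numbers on the two runs above, here is a rough sketch that just does the arithmetic on the figures wrk printed. The dictionary keys and the helper name are made up for illustration. The reconnect estimate at the end rests on an assumption that is not stated in this thread: that the "without" run used a limit of 100 requests per keep-alive connection (which I believe is the cowboy 1.x default, but check the docs for your version).

```python
# Figures copied from the two wrk runs above (cowboy1, 60 s each).
with_option = {"req_per_sec": 30003.77, "avg_latency_ms": 1.45, "max_latency_ms": 128.25}
without_option = {"req_per_sec": 29232.42, "avg_latency_ms": 2.71, "max_latency_ms": 202.02}


def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Compare each metric of the run with the option against the run without it.
for metric in with_option:
    change = percent_change(with_option[metric], without_option[metric])
    print(f"{metric}: {change:+.1f}%")

# ASSUMPTION (not from the thread): the "without" run closed each connection
# after 100 requests. Under that assumption, this is roughly how many times
# wrk had to reconnect during the 60 s run.
ASSUMED_DEFAULT_MAX_KEEPALIVE = 100
total_requests_without = 1756141  # "requests in 1.00m" from the second run
estimated_reconnects = total_requests_without // ASSUMED_DEFAULT_MAX_KEEPALIVE
print(f"estimated reconnects over 60 s: {estimated_reconnects}")
```

Throughput moves by less than 3%, while the average latency drops by a bit under half, which matches the impression of "no real change in req/s, somewhat better latencies". These are single runs, so treat the percentages as indicative only.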

For cowboy2 there is almost no difference in latencies: https://github.com/tbrand/which_is_the_fastest/pull/58#issuecomment-305429582

Very cool followup! So basically cowboy2 will be a tiny bit slower than cowboy1 because it adds a unified interface for http1.1 and http2, but in doing so it gains a lot of ease of use, plus http2. :slight_smile:

And of course cowboy2 is not tuned yet, still in dev. ^.^

EDIT: The cowboy dev has posted some very interesting information on how cowboy2 could likely be made faster than cowboy1. This will be useful for when plug is updated.
https://github.com/ninenines/cowboy/issues/1169