Plug scalability?

Hey guys,

How does Plug handle each new request/response cycle? Does it spawn a new process for each request?

If not, then how can I make sure that Plug is not a bottleneck in my app?

Plug doesn’t do any of that. Cowboy is the webserver people use with Plug/Phoenix (but it could be another one as well), and it’s spawning a process per request.

6 Likes

Great, so is there a way to spawn a custom GenServer?
I have worker logic and I want it to receive the query and then handle the request/streams to the endpoint.

Edit: looking at the Phoenix source code to get a full overview.

1 Like

What are you trying to do exactly?

Essentially a Plug transforms data throughout the request stack. These transformations can be setting a header, choosing a layout, getting request parameters, etc. Maybe you can handle what you need to do in a plug without a GenServer.
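If it helps to make that concrete, here is a minimal module plug sketch (the module name and header are made up): it implements the init/1 and call/2 callbacks of the Plug behaviour and performs one small transformation on the conn.

```elixir
defmodule MyApp.HelloHeaderPlug do
  # Minimal module plug: init/1 prepares options, call/2 transforms the conn.
  import Plug.Conn

  def init(opts), do: Keyword.get(opts, :header_value, "hello")

  def call(conn, header_value) do
    # A "transformation" is just a function from conn to conn.
    put_resp_header(conn, "x-example", header_value)
  end
end
```

You would add it to an endpoint or router pipeline with something like `plug MyApp.HelloHeaderPlug, header_value: "hi"`.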

1 Like

The request has to go through a query parser and be cast to the DB, wait for the response, and then the response goes back to the client as a TCP packet/stream.
So as you can see, the request does some expensive work, and I can’t do that in the plug’s own self() PID.
I believe WebSockets, HTTP/2 and HTTP over QUIC are the only solutions to avoid blocking on a request through the plug pipeline, but we still don’t have support for HTTP/2 or HTTP over QUIC, which is what I will work on.

Cowboy has support for HTTP/2, and I think Plug does as well. Generally, at least with HTTP/1.1, if you are using a single TCP connection, the best you can do is to pipeline the requests, which Cowboy and other servers do by default. Unless you have a separate connection with which you intend to return the results to your clients, you won’t get any latency/performance improvements by spawning an additional process for each request. With Cowboy 2, however, each request gets a new process (due to it being mostly tailored towards HTTP/2), not just each connection.

I still can’t see why you do not want to do your work in the Plug process; it has to wait for you anyway.

To be honest, if you have long-running work to do, I fear the client will time out before you can send anything back; HTTP/2 won’t change much in this regard.

Have you thought about just triggering the work via an HTTP endpoint and then streaming back through websockets?
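Here is a hedged sketch of that idea using Phoenix Channels (the module name, the "results:lobby" topic and ExpensiveWork.run/1 are all made-up placeholders): the channel accepts a query, runs the expensive work in a separate task, and pushes rows back over the already-open socket as they arrive.

```elixir
defmodule MyAppWeb.ResultsChannel do
  use Phoenix.Channel

  def join("results:lobby", _payload, socket), do: {:ok, socket}

  def handle_in("query", %{"q" => query}, socket) do
    channel = self()

    # Run the expensive work outside the channel process and send rows back.
    Task.start(fn ->
      for row <- ExpensiveWork.run(query) do
        send(channel, {:row, row})
      end
    end)

    {:noreply, socket}
  end

  def handle_info({:row, row}, socket) do
    # Each row is pushed to the client as its own event.
    push(socket, "row", %{data: row})
    {:noreply, socket}
  end
end
```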

Also remember, Plug.Conn.t implements the Collectable protocol, so you can just stream “into” it out of your data source (hoping it is streamable at all).
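A minimal sketch of that, assuming a chunked response (the module name and the fake data source are placeholders): once send_chunked/2 has been called, the conn acts as a Collectable, so Enum.into/2 sends each element of the enumerable as a chunk.

```elixir
defmodule MyApp.StreamPlug do
  import Plug.Conn

  def init(opts), do: opts

  def call(conn, _opts) do
    conn = send_chunked(conn, 200)

    # Stand-in for a real streamable data source; each element becomes a chunk.
    1..5
    |> Stream.map(&"row #{&1}\n")
    |> Enum.into(conn)
  end
end
```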

2 Likes

Because Plug is just one PID serving all incoming requests, while with a WebSocket/HTTP/2 stream active I only have to open one connection, and that process will act like a contract between the client queries and the data stores. And I prefer to have only one active socket per client, where both server and client can stream requests/responses for many object IDs and data from the DB, and keep the process alive.
My only issue with Cowboy is that I can’t spawn a custom worker behaviour… any idea?

Thank you all

And this is not true.

The webserver (e.g. Cowboy) spawns a process per request and calls a request handler in that process. That’s where Plug takes control and then works through its pipeline.

Those processes are not even pooled; they are spawned live as they are needed.

3 Likes

This sounds great. Any idea how to implement custom worker handlers (i.e. handle_cast, handle_call, handle_info) in the plug PID for the current request/response cycle?

Why don’t you implement a simple plug? Or, in the context of Phoenix, a controller/action?
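For the controller/action route, a hedged sketch (MyAppWeb, MyApp.Repo and MyApp.Item are hypothetical names): the expensive DB call just runs in the process that is already handling the request.

```elixir
defmodule MyAppWeb.ItemController do
  use MyAppWeb, :controller

  def show(conn, %{"id" => id}) do
    # The query runs in the same process that handles this request;
    # that process simply waits for the result, which is fine.
    item = MyApp.Repo.get!(MyApp.Item, id)
    json(conn, %{id: item.id})
  end
end
```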

That’s how Cowboy 2 works; in Cowboy 1 (which all Phoenix < 1.3 apps use) the process was created per connection, to avoid message copying.

The Cowboy process that handles the HTTP requests is spawned with :proc_lib; you shouldn’t create custom handlers on it, as you might easily introduce deadlocks into the system (just see the GenServer source code and how many edge cases there are).

Still, each plug pipeline has its own process.

Really? That’s quite inefficient. I didn’t notice it when I was studying its source code about a year ago.

I was able to implement a custom elli adapter without plug spawning any additional processes for each request, so you might be wrong.


Just checked https://github.com/elixir-plug/plug_cowboy/blob/master/lib/plug/cowboy/handler.ex and couldn’t find any new processes spawned … The only process that plug creates is for Plug.Upload, but that’s not relevant here.

But hasn’t that always been advertised as a big feature? A million servers each handling a single request vs. a single server trying to handle a million requests?

1 Like

Because requests and responses are fully asynchronous, Cowboy creates a new process for each request, and these processes are managed by another process that handles the connection itself.

This is from the Cowboy 1.0 user guide. The reason why plug_cowboy does not spawn processes is probably that Plug does not handle that part at all; it just works with the process the underlying webserver provides, I’d imagine.

The quote you provided is for SPDY, not HTTP.

From the same source:

Cowboy implements the keep-alive mechanism by reusing the same process for all requests. This allows Cowboy to save memory. This works well because most code will not have any side effect impacting subsequent requests. But it also means you need to clean up if you do have code with side effects. The terminate/3 function can be used for this purpose.

Anyway, if you look into the sources of cowboy 1, you’d see that it (or rather ranch) creates a single process per connection.

Cowboy 2, however, creates a process for each request as well, to better fit HTTP/2 semantics, which also results in about a 20% decrease in performance for HTTP/1.1.

The process-per-connection model which Cowboy 1 used is the most efficient approach, so I’m not sure what has been advertised as a big feature in Plug …

A million servers each handling a single request vs. a single server trying to handle a million requests?

It’s actually about connections; in the case of Cowboy 1 and all Phoenix < 1.3 apps it’s “a million servers serving a million connections”, which scales rather nicely.
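If you want to see this for yourself, a small sketch (the module name is made up): put this into your pipeline and log self(); with Cowboy 1 you should see the same PID for every keep-alive request on a connection, with Cowboy 2 a fresh PID per request.

```elixir
defmodule MyApp.InspectPidPlug do
  def init(opts), do: opts

  def call(conn, _opts) do
    # Logs the PID of the process running this plug pipeline for each request.
    IO.inspect(self(), label: "handling #{conn.request_path} in")
    conn
  end
end
```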

And what exactly is the difference between a connection and a request in the context of HTTP/1 (not upgraded to anything)?