I’m designing an application where multiple nodes have to communicate, but I’m not using Erlang distribution, only raw TCP messaging. I want to apply backpressure to incoming messages to avoid overloading the system, while still being able to scale out with multiple processors.
I ended up defining something like this using Broadway:
- A listener which spawns a process for each connection
- Each connection spawns a Broadway pipeline with a producer that queues the TCP messages, plus a given number of processors to handle the requests and send data back
This is working fine, but I’m wondering whether it’s a good design to have n Broadway pipelines, or whether it would be better to have a single pipeline for the entire app (listener) and give it more processors to scale.
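To make the question concrete, here is a minimal sketch of the per-connection design as I understand it. Module names (`MyApp.ConnectionPipeline`, `MyApp.TcpProducer`, `MyApp.Handler`) and the registry-based naming are illustrative assumptions, not from any real codebase:

```elixir
defmodule MyApp.ConnectionPipeline do
  use Broadway

  # Started by the listener for each accepted socket (hypothetical sketch).
  def start_link({transport, socket}) do
    Broadway.start_link(__MODULE__,
      name: via(socket),
      producer: [
        # A custom GenStage producer that buffers incoming TCP frames
        # and emits them only on demand -- this is the backpressure point.
        module: {MyApp.TcpProducer, {transport, socket}},
        concurrency: 1
      ],
      processors: [
        # 5 processors *per connection* -- this is the number that
        # multiplies with the connection count.
        default: [concurrency: 5]
      ]
    )
  end

  @impl true
  def handle_message(_processor, message, _context) do
    # Handle the request and reply on the same socket, which the
    # producer placed in the message metadata.
    reply = MyApp.Handler.process(message.data)
    message.metadata.transport.send(message.metadata.socket, reply)
    message
  end

  defp via(socket), do: {:via, Registry, {MyApp.Registry, socket}}
end
```

With this layout the total processor count is `connections * 5`, since each connection owns its own pipeline and concurrency setting.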
Depends on the scale. If you’re expecting a million connections, then you’re going to have a million × processor-concurrency processors running. As I understand it, you define the level of concurrency in your pipeline that is ideal for your system. If you have a changing number of pipelines, then you’re applying backpressure to each connection individually, but not to the system as a whole. Could you pass an id or the pid of the connection process as part of the message, and respond in the ack or something?
I think you are right about needing a global concurrency limit to control the upstream messages. For now I have 5 processors per connection, so if the system reaches 1_000 connections, that ends up as 5_000 processors instead of, say, 100 global ones.
In fact, I designed it this way so that each pipeline would carry a given transport and socket in its context, but I guess I can pass those along as part of the message from the producer (which just forwards the incoming upstream messages).
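For reference, the single-pipeline alternative could look roughly like this: the connection processes push frames into one shared producer, tagging each message with its transport and socket so any processor can reply on the right connection. All names here (`MyApp.SharedProducer`, `MyApp.Handler`, the concurrency of 100) are hypothetical, a sketch under the assumptions discussed above:

```elixir
defmodule MyApp.Pipeline do
  use Broadway

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        # One shared GenStage producer that all connection processes
        # forward their frames into.
        module: {MyApp.SharedProducer, []},
        concurrency: 1
      ],
      processors: [
        # A single global concurrency limit for the whole system,
        # independent of the number of open connections.
        default: [concurrency: 100]
      ]
    )
  end

  @impl true
  def handle_message(_processor, message, _context) do
    # The connection's transport and socket travel with the message,
    # so the processor can reply directly to the right peer.
    %{transport: transport, socket: socket} = message.metadata
    reply = MyApp.Handler.process(message.data)
    transport.send(socket, reply)
    message
  end
end

# Each connection process would forward an incoming frame as, e.g.:
#
#   message = %Broadway.Message{
#     data: frame,
#     metadata: %{transport: transport, socket: socket},
#     acknowledger: Broadway.NoopAcknowledger.init()
#   }
#   MyApp.SharedProducer.push(message)
```

The trade-off is that backpressure is now applied system-wide rather than per connection, so the producer (or the connection processes) must decide what to do when demand from the pipeline lags behind incoming traffic.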