I have an expensive CPU-bound action that I want to serve through a public internet endpoint. It doesn’t matter what it does, just that it usually takes some time (seconds) and costs noticeable CPU (say a few percent of CPU usage). For a real use case, think of an expensive hashing algorithm like PBKDF2 on a sign-in endpoint.
What I want to avoid is being hit by a flood of requests and suffering resource starvation, where everything would become unresponsive. As in the sign-in case, a public endpoint without authentication makes it a perfect target for attacks that aim to bring the server down. Assume that all network-level protections, like rate limiting, WAFs and so on, are already in place.
I have thought about two strategies on the BEAM:
1 - Using a pool of processes. I’ve implemented this with poolboy and it works fine, but it is hard to tune. I have to benchmark the cost of the function on a production server and settle on a pool size that gives a good enough “sharing” of resources. I am thinking about switching to wpool and having a bigger-than-needed pool with a callback module that would check CPU usage before dispatching to the pool. This seems very unreliable and error-prone to me… If we get several concurrent requests while the measured CPU usage is still under control, they would all start at the same time and the CPU would spike anyway…
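For context, here is a minimal sketch of the pool approach (Elixir assumed; the pool name `:cpu_pool`, the `CpuWorker` module and the pool size are all placeholders, not my real code). The one twist over a plain `:poolboy.transaction/3` is a non-blocking checkout: when all workers are busy, callers get an immediate `{:error, :overloaded}` instead of queueing up, so the pool size acts as a hard cap on concurrent CPU work:

```elixir
defmodule CpuWorker do
  # Trivial poolboy worker: runs whatever function it is handed.
  use GenServer

  def start_link(_args), do: GenServer.start_link(__MODULE__, nil)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:run, fun}, _from, state), do: {:reply, fun.(), state}
end

defmodule CpuPool do
  @pool :cpu_pool

  # Supervisor child spec for the pool itself.
  def child_spec(_opts) do
    :poolboy.child_spec(@pool,
      name: {:local, @pool},
      worker_module: CpuWorker,
      size: 4,          # benchmarked per machine, as described above
      max_overflow: 0   # no overflow: the cap is a hard cap
    )
  end

  # Shed load instead of queueing when every worker is busy.
  def run(fun, timeout \\ 10_000) do
    case :poolboy.checkout(@pool, false) do
      :full ->
        {:error, :overloaded}

      worker ->
        try do
          {:ok, GenServer.call(worker, {:run, fun}, timeout)}
        after
          :poolboy.checkin(@pool, worker)
        end
    end
  end
end
```

This still has the tuning problem: with `max_overflow: 0` and a non-blocking checkout, `size` is the only knob, and picking it requires the production benchmarking I mentioned.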
2 - Using a slave node dedicated to just this operation and fighting with the emulator flags to have it use other logical cores. Suppose I have 8 CPUs available: I could dedicate 2 or 3 to the slave node and the rest to the main system. With this strategy, though, the slave node itself would still be vulnerable to resource starvation, making all calls to it fail, which is not my intention here. I’d rather have timeouts than a denial of service.
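The kind of split I have in mind could look something like this (node names and the 5/3 split are just examples). `+S Schedulers:SchedulersOnline` only limits how many schedulers the emulator runs, it does not pin them to particular cores, so on Linux I would combine it with OS-level pinning such as `taskset`:

```shell
# Main system: 5 schedulers, pinned to logical cores 0-4 (Linux-only taskset)
taskset -c 0-4 elixir --sname main --erl "+S 5:5" -S mix run --no-halt

# Dedicated node for the expensive calls: 3 schedulers on cores 5-7
taskset -c 5-7 elixir --sname cpu_worker --erl "+S 3:3" -S mix run --no-halt
```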
So, I’d like to know if there are any other algorithms or strategies to deal with this. I appreciate your time.