Disclaimer: I am not sure if this is the correct category; please let me know if it is not.
We have a dispatcher instance group that receives around 700 requests per second per active VM. This dispatcher is behind a load balancer that autoscales. So far all our VMs are regular VMs, but we have been studying the possibility of making them preemptible.
The problem with preemptible instances
According to the documentation, GCP can terminate a preemptible instance at any time.
Let’s assume that each dispatcher VM holds no state. It receives a request, processes it and makes an HTTP request to some other machine.
At any given time, each VM will be processing around 700 requests concurrently, while receiving data from the load balancer.
What happens if my preemptible VM, while processing 700 requests, receives a signal that it is about to be terminated?
Well, in theory one should have a shutdown script that makes sure the in-flight requests finish processing and then stops the app (a clean exit). This leads us to the big question:
- But does the load balancer know that my VM is shutting down? Will it keep sending requests to the terminating VM?
If it does keep sending them, some requests will fail: once the app has shut down, the machine is still up for a short while, and the load balancer keeps sending requests to it without knowing that the app is already gone.
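To make the "clean exit" idea concrete, here is a minimal sketch of what I have in mind, in Python just for illustration. Everything in it is an assumption on my side: the `DrainState` class, its method names and the 25-second timeout are hypothetical, and it presumes the app exposes a health-check endpoint that the load balancer polls. On a termination signal the app starts reporting itself as unhealthy, rejects new requests, and waits for the in-flight ones to finish.

```python
import signal
import threading


class DrainState:
    """Tracks in-flight requests and whether the app is draining (hypothetical helper)."""

    def __init__(self):
        # Condition() wraps a re-entrant lock by default.
        self._cond = threading.Condition()
        self._in_flight = 0
        self._draining = False

    def begin_request(self):
        """Register a new request; returns False if the app is draining."""
        with self._cond:
            if self._draining:
                return False
            self._in_flight += 1
            return True

    def end_request(self):
        """Mark one request as finished and wake up anyone waiting for idle."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def health_status(self):
        """HTTP status that a health-check endpoint would return."""
        with self._cond:
            return 503 if self._draining else 200

    def start_draining(self):
        """Stop accepting new requests and start failing health checks."""
        with self._cond:
            self._draining = True

    def wait_until_idle(self, timeout):
        """Block until nothing is in flight; True if drained within timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)


def install_sigterm_handler(state):
    """Start draining when SIGTERM arrives (call from the main thread)."""
    signal.signal(signal.SIGTERM, lambda signum, frame: state.start_draining())
```

The app would call `install_sigterm_handler(state)` at startup, wrap each request in `begin_request()` / `end_request()`, and after the signal call something like `state.wait_until_idle(timeout=25)` before exiting. The timeout is a guess meant to stay under the roughly 30 seconds a preempted VM gets to shut down. Whether the load balancer reacts to the failing health check quickly enough is exactly what I am unsure about.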
Ideally, these requests would come back to the load balancer as failed and it would resend them to another machine. However, as far as I can tell, GCP load balancers do not do this. In that scenario, I would have to build an OTP load balancer myself that creates machines in GCP and eventually kills them.
If, on the other hand, the load balancer somehow knows that this VM was selected for preemption, then nothing special needs to be done.
Which one is it?