Graceful shutdown for Phoenix

Hi there,

We have transaction requests that can take 30-70 seconds, and aborting them is a “very bad thing”. We also need to be able to deploy with zero downtime. Ideally we could detect when Phoenix is done serving traffic, so that we can shut down the old version (for blue-green deploys) or move on to the next server (for rolling deploys).

Does anyone have an example of gracefully shutting down Phoenix without aborting existing requests, in other words, draining Phoenix before stopping it? At the moment, if you trigger a shutdown with, say, exrm's stop command, any existing long-running request will be aborted.

Any advice and/or examples on zero downtime deploys would be much appreciated. We’ve tried hot upgrades: they require too much overhead for continuous deployment, and there are enough cases where they just do not work yet.

Thanks!

5 Likes

Are those 30-70 second transactions being performed straight from Phoenix workers?

There is a somewhat nice solution that uses not one but two background job queues. Basically, when a job request comes in, you put it on an intermediate queue. One background worker takes jobs from the intermediate queue and puts them on a second, working queue. A second worker takes jobs from the working queue and processes them, removing each job from the working queue when it’s done.

Now, if you want to gracefully shut down your app, you first shut down the intermediate queue worker. New job requests still come in and land on the intermediate queue as before, but nothing moves them to the working queue.

In the meantime, the second worker keeps taking jobs from the working queue, removing them one by one. You can safely shut down your app once the working queue is empty.
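
To make the two-queue idea concrete, here is a rough, language-agnostic sketch in Python using in-process queues and threads. It is only an illustration under assumptions: the class name `TwoQueuePipeline`, its methods, and the `handler` callback are hypothetical, and a real setup would more likely use a persistent job queue so that jobs left on the intermediate queue survive the restart.

```python
import queue
import threading


class TwoQueuePipeline:
    """Hypothetical pipeline: intermediate queue -> working queue -> handler."""

    def __init__(self, handler):
        self.intermediate = queue.Queue()  # new job requests land here
        self.working = queue.Queue()       # jobs cleared for processing
        self._handler = handler
        self._forwarding = threading.Event()  # cleared to stop the forwarder
        self._forwarding.set()
        self._stopped = threading.Event()     # set to stop the processor
        self._forwarder = threading.Thread(target=self._forward, daemon=True)
        self._processor = threading.Thread(target=self._process, daemon=True)
        self._forwarder.start()
        self._processor.start()

    def submit(self, job):
        """Accept a job; this keeps working even after draining has started."""
        self.intermediate.put(job)

    def _forward(self):
        # First worker: move jobs from the intermediate to the working queue.
        while self._forwarding.is_set():
            try:
                job = self.intermediate.get(timeout=0.05)
            except queue.Empty:
                continue
            self.working.put(job)

    def _process(self):
        # Second worker: process jobs from the working queue one by one.
        while not self._stopped.is_set():
            try:
                job = self.working.get(timeout=0.05)
            except queue.Empty:
                continue
            try:
                self._handler(job)
            except Exception:
                pass  # a real worker would log or retry here
            finally:
                self.working.task_done()  # the job now counts as removed

    def drain(self):
        """Stop forwarding, then block until the working queue is processed."""
        self._forwarding.clear()
        self._forwarder.join()  # after this, nothing new reaches `working`
        self.working.join()     # wait for every forwarded job to finish
        self._stopped.set()
        self._processor.join()
```

Calling `drain()` before stopping the app corresponds to the shutdown steps above: jobs submitted afterwards simply wait on the intermediate queue for the next version to pick up.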

7 Likes

That’s an interesting idea. At the moment they’re actually synchronous web requests, so we need a way to tell Phoenix to stop accepting new requests. We’d also need a way to know when all of the pending requests in Phoenix have finished. We’d need both regardless of whether we had a queue or queues in place to do the actual work.
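
As a rough illustration of those two requirements (refuse new requests, then wait for pending ones), here is a minimal in-flight counter sketched in Python. It is not how Phoenix or Cowboy do it, and the names `DrainTracker`, `request`, `drain`, and `Draining` are hypothetical. In a real app the same bookkeeping would have to sit wherever requests enter and leave the server.

```python
import threading
from contextlib import contextmanager


class Draining(RuntimeError):
    """Raised when a request arrives after draining has started."""


class DrainTracker:
    """Hypothetical tracker that counts in-flight requests."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._accepting = True

    @contextmanager
    def request(self):
        # Wrap each request: count it on entry, uncount it on exit.
        with self._cond:
            if not self._accepting:
                raise Draining("not accepting new requests")
            self._in_flight += 1
        try:
            yield
        finally:
            with self._cond:
                self._in_flight -= 1
                if self._in_flight == 0:
                    self._cond.notify_all()  # wake anyone waiting in drain()

    def drain(self, timeout=None):
        """Stop accepting requests and wait for pending ones to finish.

        Returns True once nothing is in flight, False if the timeout expired.
        """
        with self._cond:
            self._accepting = False
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)
```

A deploy script could call `drain(timeout=...)` with a timeout somewhat longer than the slowest expected request and only stop the old version once it returns True.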

I came across https://gist.github.com/etrepum/2655724 which is a direct Cowboy solution. Maybe something like that would work?

3 Likes

For what it’s worth, we implemented this for Phoenix here: https://gist.github.com/aaronjensen/33cc2aeb74746cac3bcb40dcefdd9c09

5 Likes

Very cool. I wonder if something like this should be the default behaviour.

1 Like

Probably. It would need to be built into the adapter, Chris said.

1 Like

Thank you for sharing the code.

1 Like

@aaronjensen Thanks again for sharing this. It’s being considered as part of the Phoenix.Endpoint.Handler behaviour. You can track the discussion here https://github.com/phoenixframework/phoenix/issues/1742

2 Likes

No problem. Awesome, I’m looking forward to it. I don’t know if gists need licenses, but please consider all of that code MIT or WTFPL :smile:

1 Like