Pausing Broadway/GenStage Pipelines?

I’m building a pipeline that requires a “pause” feature. I haven’t yet decided whether to use Broadway or to build it more manually with GenStage.

The job of the pipeline is to sort messages into buckets according to rules. Currently, messages arrive via SQS. The rules must be fetched from an external source, and one of the gotchas is that if the rules change, we may need to re-allocate the messages into different buckets. For example, if each message were a playing card, the rules might require that we sort the cards into buckets according to their suit (hearts, diamonds, etc.), but when the rules change we might instead need to sort according to value (aces, kings, etc.). It is important that the messages don’t get sent onward immediately: we want the buckets to accumulate messages for a time so they can be inspected, and only then should the contents of a bucket be sent on to their destination (usually another SQS queue).
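To make the card example concrete, here is a minimal sketch in plain Elixir of the sorting and re-sorting I have in mind. The `CardSorter` module, the `{value, suit}` tuple shape, and the idea that a “rule” is just a function from a message to a bucket key are all assumptions made for illustration; the real rules come from an external source.

```elixir
defmodule CardSorter do
  # Hypothetical message shape: a {value, suit} tuple.
  # A "rule" is any function that maps a message to a bucket key.
  def by_suit({_value, suit}), do: suit
  def by_value({value, _suit}), do: value

  # Sort messages into buckets: a map of bucket key => list of messages.
  def sort(messages, rule), do: Enum.group_by(messages, rule)

  # When the rules change, pool the messages from every bucket
  # and sort them again under the new rule.
  def resort(buckets, new_rule) do
    buckets
    |> Map.values()
    |> Enum.concat()
    |> sort(new_rule)
  end
end

cards = [{:ace, :hearts}, {:king, :hearts}, {:ace, :spades}]

# Buckets keyed by suit: :hearts and :spades
by_suit = CardSorter.sort(cards, &CardSorter.by_suit/1)

# The rules changed: the same messages, now keyed by :ace and :king
by_value = CardSorter.resort(by_suit, &CardSorter.by_value/1)
```

Nothing here touches SQS or acknowledgment; it only shows why the messages need to stay available until a bucket has been approved.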

There are a couple of things here that I’m having a hard time wrapping my head around.

  1. How to DELAY message acknowledgment? Ideally, we would NOT acknowledge the incoming SQS messages until AFTER a bucket has been inspected and approved. That way, if the upstream rules change, we could just restart the pipeline and re-fetch the messages. In other words, we could let the SQS queue handle persistence of the incoming messages. I like this idea conceptually because the queue can be inspected, and it’s very clear that any messages still in the queue have NOT moved past this “sorting” phase.

  2. How to pause a consumer? Usually pipelines drain a queue until it is empty, but that’s not what is required here. I’m not sure exactly how or where to slow down the pipeline… how could Broadway’s handle_batch function be delayed? There’s a distinction worth mentioning: the “buckets” we are sorting into may hold many thousands of messages, so a bucket is not 1:1 with an SQS batch.

I admit that I have a harder time thinking about this in Broadway than I do when manually piecing together GenStage components. Perhaps this works better as a 3-stage GenStage pipeline:

SQS Producer → Sorting ProducerConsumer → Ack + Downstream transmission Consumer.
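For the last stage of that pipe, this is roughly the kind of thing I’m picturing: a consumer that takes over demand manually so it can stop asking for events while paused. It is only a sketch. It assumes `gen_stage` is a dependency; the `PausableConsumer` module, its `pause/0` and `resume/0` functions, and the `@max_demand` value are names I made up, and the actual ack and downstream send are left as a comment.

```elixir
# Assumes {:gen_stage, "~> 1.0"} is listed as a dependency.
defmodule PausableConsumer do
  use GenStage

  # Hypothetical cap on how many events may be outstanding at once.
  @max_demand 10

  def start_link(opts \\ []), do: GenStage.start_link(__MODULE__, opts, name: __MODULE__)
  def pause, do: GenStage.cast(__MODULE__, :pause)
  def resume, do: GenStage.cast(__MODULE__, :resume)

  @impl true
  def init(_opts), do: {:consumer, %{producer: nil, pending: 0, paused?: false}}

  @impl true
  def handle_subscribe(:producer, _opts, from, state) do
    # Returning :manual means demand is only sent when we call GenStage.ask/2.
    {:manual, ask_more(%{state | producer: from})}
  end

  @impl true
  def handle_events(events, _from, state) do
    # Placeholder: ack the SQS messages and send the bucket contents onward here.
    state = %{state | pending: state.pending - length(events)}
    {:noreply, [], ask_more(state)}
  end

  @impl true
  def handle_cast(:pause, state), do: {:noreply, [], %{state | paused?: true}}
  def handle_cast(:resume, state), do: {:noreply, [], ask_more(%{state | paused?: false})}

  # While paused (or before subscribing), send no new demand upstream.
  defp ask_more(%{paused?: true} = state), do: state
  defp ask_more(%{producer: nil} = state), do: state

  # Otherwise top outstanding demand back up to @max_demand.
  defp ask_more(%{producer: producer, pending: pending} = state) when pending < @max_demand do
    GenStage.ask(producer, @max_demand - pending)
    %{state | pending: @max_demand}
  end

  defp ask_more(state), do: state
end
```

Events that were already requested before `pause/0` would still arrive, so this only stops new demand; it does not address holding back acknowledgment for a whole bucket.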

Thanks for any insights!