Broadway SQS delivery time longer than expected

Hello, I am not sure the problem is in the library, but maybe you could point me to what I should investigate.
We use the broadway_sqs library for Amazon SQS.
We measure the time to deliver each message separately on the application side. Delivery can take up to 20 seconds!
So the question is: what could be taking so long?

Can you elaborate on exactly how you’re measuring delivery time?

Also keep in mind that by default broadway_sqs is configured to process messages in batches, up to some time limit. If your volume of messages is very low, you may be waiting for the time limit to hit.

The measurement is very simple:

  • record the time just before we put the message on the queue via the ex_aws library
    {:ok, %{body: %{message_id: message_id}}} = ExAws.SQS.send_message(queue_url, message) |> ExAws.request()
  • record the time inside the process_data function
    def handle_message(_, message, _) do
      message
      |> Message.update_data(&process_data(&1, message.metadata.message_id))
    end
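
One way to make this kind of measurement explicit is to carry the send time inside the message itself. The following is only a minimal sketch: the `DeliveryTimer` module, its function names, and the `|` separator are all hypothetical and not part of ex_aws or broadway_sqs. It also assumes the sender and the consumer have reasonably synchronized clocks.

```elixir
defmodule DeliveryTimer do
  @moduledoc "Hypothetical helper for measuring send-to-process latency."

  # Wall-clock time in milliseconds. Only comparable across machines
  # if their clocks are in sync.
  def now_ms, do: System.system_time(:millisecond)

  # Prefix the body with the send timestamp before enqueueing it.
  def stamp(body, sent_at_ms \\ now_ms()), do: "#{sent_at_ms}|#{body}"

  # Split a stamped body back into {sent_at_ms, original_body}.
  # `parts: 2` keeps any further "|" characters inside the body intact.
  def unstamp(stamped) do
    [ts, body] = String.split(stamped, "|", parts: 2)
    {String.to_integer(ts), body}
  end

  # Milliseconds between stamping and `received_at_ms`.
  def latency_ms(stamped, received_at_ms \\ now_ms()) do
    {sent_at_ms, _body} = unstamp(stamped)
    received_at_ms - sent_at_ms
  end
end
```

The sender would pass `DeliveryTimer.stamp(message)` to `ExAws.SQS.send_message/2`, and the consumer would call `DeliveryTimer.latency_ms(message.data)` at the top of `handle_message/3`, before any other work.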

Can you show the code you have that sets up broadway?

    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        module: {
          BroadwaySQS.Producer,
          queue_url: Keyword.get(config, :queue_url), config: Keyword.get(config, :credentials)
        },
        stages: 10
      ],
      processors: [
        default: [stages: 100]
      ],
      batchers: [
        default: [
          batch_size: 10,
          batch_timeout: 2000
        ]
      ]
    )

Right, so at a minimum, if you aren’t sending more than 10 messages you’ll have a wait time of at least 2 seconds, since that’s the batch timeout. The other delays you mentioned of up to 20 seconds sound a lot like the max value for the SQS wait_time_seconds option. You aren’t setting it explicitly, but the queue itself will also have a default. What is the queue’s ReceiveMessageWaitTimeSeconds value?


ReceiveMessageWaitTimeSeconds = 0

If there are batchers, the acknowledgement is done by the batchers, using the batch_size.

I thought batch_size is needed only for acknowledgement, to collect the messages into a batch.

I don’t understand your question, sorry.

Even if you are not interested in working with Broadway batches via the
handle_batch/3 callback, we recommend all Broadway pipelines with SQS
producers to define a default batcher with batch_size set to 10, so
messages can be acknowledged in batches, which improves the performance
and reduces the costs of integrating with SQS.

Sorry for my English. What I mean is that batch_size and batch_timeout do not block the producer from receiving more messages. These parameters are used only for acknowledgement (the request to SQS at the end of the pipeline).
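
To illustrate where the two callbacks sit in a Broadway pipeline, here is a minimal sketch. The module name `MyApp.Pipeline` and the body of `process_data/2` are placeholders, not the original poster's code, and the `start_link` configuration is left out.

```elixir
defmodule MyApp.Pipeline do
  use Broadway

  alias Broadway.Message

  # Runs in a processor stage as soon as the producer hands the message over.
  # batch_size and batch_timeout play no part at this point.
  @impl true
  def handle_message(_processor, message, _context) do
    Message.update_data(message, &process_data(&1, message.metadata.message_id))
  end

  # Runs later, once the batcher has collected up to batch_size messages
  # or batch_timeout has elapsed. Returning the messages lets Broadway
  # acknowledge them to SQS as a batch.
  @impl true
  def handle_batch(_batcher, messages, _batch_info, _context) do
    messages
  end

  # Hypothetical stand-in for the application's own processing function.
  defp process_data(data, _message_id), do: data
end
```

Because `handle_message/3` runs before the batcher, a timestamp taken inside `process_data` should not include the batch_timeout wait.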

We have observed the same thing, especially when we only have 1 or 2 messages being sent to the queue every now and then.

For us the issue was BroadwaySQS’s default polling mode (short polling) and its default receive_interval (5 seconds).
By default, a Broadway producer polls SQS for messages and, on an empty receive, waits 5 seconds before trying again.

If there are very few messages in the queue short polling does not guarantee you will get any messages since it only queries a subset of the servers that make up your SQS queue. That means it can take multiple attempts to get a single message. In our case, that meant our broadway producer would take between 2 and 60 seconds to receive a message when there was only 1 enqueued.
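
As a rough back-of-the-envelope model of that delay, assume a single producer and ignore network round-trip time: each empty receive costs one receive_interval. The `PollingDelay` module below is purely illustrative and not part of broadway_sqs.

```elixir
defmodule PollingDelay do
  @moduledoc "Rough, hypothetical model of short-polling delay for one producer."

  # Approximate wait in milliseconds when the producer sees `empty_receives`
  # empty responses before the poll that finally returns the message.
  def approx_wait_ms(empty_receives, receive_interval_ms) do
    empty_receives * receive_interval_ms
  end
end
```

Under this model, with the default 5-second interval, `PollingDelay.approx_wait_ms(12, 5_000)` gives 60_000 ms, so a dozen empty receives in a row would account for a delay of about a minute.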

We needed the best response time possible and achieved it by reducing receive_interval to 50 ms and setting wait_time_seconds to 20.
Note on receive_interval: the delay doesn’t seem necessary with long polling, so we wanted the value close to 0; it would probably work with 0 as well.
Note on wait_time_seconds: any value above 0 enables long polling (see the SQS long polling docs). Long polling is guaranteed not to return an empty receive if there are any messages in the queue, and it returns as soon as any messages are available.

For example:

      producer: [
        module: {
          BroadwaySQS.Producer,
          queue_url: Keyword.get(config, :queue_url),
          config: Keyword.get(config, :credentials),
          wait_time_seconds: 20,
          receive_interval: 50
        }
      ]