Broadway Producer does not receive incoming_demand from the consumer

Hi everyone,

I’m having an issue while using Broadway: the producer is not receiving any demand, so handle_demand/2 is never called. I’ve tried debugging but haven’t been able to identify the cause yet.

Issue Summary

I have a custom UDP Producer built on top of GenStage and 1,000 Consumers. The system is initialized following Broadway’s recommended configuration. Each Consumer is set up to process only one demand at a time. I then send 1 million UDP messages over the course of one minute. Initially, some messages are processed smoothly, although slowly; however, processing eventually slows down and stops entirely. Upon checking, I noticed that the handle_demand/2 callback in the Producer is no longer being invoked, which causes the system to stall.

You can find the sample code for the UDP implementation in this repository:
GitHub - thachtanapsi/my_udp_app

Environment

  • Erlang/OTP 27 [erts-15.2.7]
  • Elixir: 1.18.3
  • OS: Ubuntu 24.04.1 LTS
  • Broadway: 1.2.1
  • GenStage: 1.3.1

Hello,

Welcome to the forums 🙂

I’m tinkering with your app. I can observe packet loss when sending a million messages, but the handlers are called, meaning the demand is properly registered and served.

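That said, the socket is in active mode, so every UDP packet is delivered straight to the producer's mailbox. If the producer process is overloaded with UDP messages, the demand messages queue up behind them and it will take time before the producer gets to them.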
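To illustrate the mailbox effect, here is a minimal sketch using a plain GenServer rather than the real GenStage producer. `BusyProducer`, the `{:udp_packet, _}` message shape, and the `:demand` call are hypothetical stand-ins, and the 2 ms of work per packet is an assumption chosen only to make the delay visible.

```elixir
defmodule BusyProducer do
  use GenServer

  def start_link, do: GenServer.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok), do: {:ok, 0}

  # Each simulated packet costs about 2 ms of work (assumed figure).
  @impl true
  def handle_info({:udp_packet, _data}, count) do
    Process.sleep(2)
    {:noreply, count + 1}
  end

  # Stand-in for a demand message: replies with the number of packets
  # handled before this request was reached.
  @impl true
  def handle_call(:demand, _from, count), do: {:reply, count, count}
end

{:ok, pid} = BusyProducer.start_link()

# Flood the mailbox first, as an active-mode socket would.
for i <- 1..100, do: send(pid, {:udp_packet, i})

started = System.monotonic_time(:millisecond)
handled_before_demand = GenServer.call(pid, :demand)
waited_ms = System.monotonic_time(:millisecond) - started

# The demand request is only answered after all 100 queued packets.
IO.inspect({handled_before_demand, waited_ms})
```

The request is served, just late: the more packets sit in front of it, the longer the wait.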


Hi, thanks for your response and for checking it out!

In my case, it seems a bit different — the handle_demand/2 callback is not being triggered at all, even though messages keep arriving and piling up in the internal queue. So it’s not just a matter of delay; it appears that the demand itself is never received or processed by the producer after some point.

I’m still investigating, but it feels like demand propagation is somehow getting stuck under high load.

Appreciate any thoughts you might have!

🛠️ Root Cause Found: Missing Demand Due to Hidden Recursive Call

After receiving valuable guidance from my manager TamLeDuc 🙏, I was able to identify the root cause behind the issue where handle_demand/2 was not being triggered, even though messages were continuously arriving.

🔍 Root Cause:
There was a hidden recursive call in the message handling logic that wasn’t immediately obvious. The function lacked a clear stopping condition and had no timeout mechanism, which led to:

  • The producer process getting stuck in an infinite loop.
  • The process never returning control to its message loop, so it could not receive incoming demand.
  • Broadway being unable to get new demand through to the producer, causing the system to appear frozen.
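The failure mode above can be reproduced in a few lines. This is only a sketch with a plain GenServer, not the code from my app: `StuckProducer`, `wait_for/1`, and the `:demand` call are hypothetical names standing in for the real packet handler and for GenStage's demand messages.

```elixir
defmodule StuckProducer do
  use GenServer

  def start_link, do: GenServer.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok), do: {:ok, %{}}

  # Stand-in for the packet handler: wait_for/1 has no base case and no
  # timeout, so this callback never returns.
  @impl true
  def handle_info({:udp_packet, _data}, state) do
    wait_for(:never_ready)
    {:noreply, state}
  end

  # Stand-in for incoming demand: only served once the process is back
  # in its message loop.
  @impl true
  def handle_call(:demand, _from, state), do: {:reply, :served, state}

  # The hidden recursion: it loops forever without crashing.
  defp wait_for(resource) do
    Process.sleep(10)
    wait_for(resource)
  end
end

{:ok, pid} = StuckProducer.start_link()

# Before any packet arrives, demand is served normally.
:served = GenServer.call(pid, :demand)

# One packet is enough to trap the process inside wait_for/1.
send(pid, {:udp_packet, "payload"})

result =
  try do
    GenServer.call(pid, :demand, 200)
  catch
    :exit, {:timeout, _} -> :demand_never_served
  end

IO.inspect(result)                # :demand_never_served
IO.inspect(Process.alive?(pid))   # true: no crash, no stacktrace
```

When a process hangs like this, `Process.info(pid, :current_stacktrace)` shows which function it is currently executing, which would have pointed at the recursive call much sooner.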

💡 This was particularly tricky because it didn’t produce any crash or clear stacktrace. The process simply hung, leading to the false impression that the system was overloaded or disconnected.

🎯 Solution:

  • Refactored the recursive logic to ensure controlled flow.
  • Added limits or timeouts to prevent long waits.
  • Placed logs at the start and end of handler functions to better detect processing bottlenecks in the future.
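As a rough illustration of those three points (not the exact code from my app), a bounded version of such a loop could look like the sketch below. `BoundedWait`, `wait_until/3`, and the default timeout and interval values are hypothetical and chosen only for the example.

```elixir
defmodule BoundedWait do
  require Logger

  # Polls ready?/0 until it returns true or timeout_ms has elapsed.
  # Logs at the start and end so a slow or stuck call shows up in the logs.
  def wait_until(ready?, timeout_ms \\ 100, interval_ms \\ 10) do
    Logger.debug("wait_until: start")
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    result = poll(ready?, deadline, interval_ms)
    Logger.debug("wait_until: done with #{inspect(result)}")
    result
  end

  # The recursion now has two exits: success, or the deadline passing.
  defp poll(ready?, deadline, interval_ms) do
    cond do
      ready?.() ->
        :ok

      System.monotonic_time(:millisecond) >= deadline ->
        {:error, :timeout}

      true ->
        Process.sleep(interval_ms)
        poll(ready?, deadline, interval_ms)
    end
  end
end

ready_result = BoundedWait.wait_until(fn -> true end)
timeout_result = BoundedWait.wait_until(fn -> false end, 50)

IO.inspect({ready_result, timeout_result})
```

Because the handler always returns within the timeout, the producer gets back to its mailbox and can pick up pending demand.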

Thanks again to TamLeDuc for the direction and to everyone who took the time to look into this. Your support helped save a lot of debugging hours! 🙌

👉 And big thanks to everyone who stopped by or took a moment to read this post. I really appreciate your time and attention! 🙏


Maybe you need more unit tests!
