What would a GenStage / Broadway structure for something like this look like?

Hi,

I’m trying to figure out the best solution for the following task.
There is an Event with SubEvents, information about which is carried by RabbitMQ.

The system logic must respect the following conditions:
An Event can go through different states (Start -> Active -> Finish) and can receive multiple updates while in the Active state.
An Event is considered successfully processed only if and when all SubEvents are successfully processed.

Now, I wonder whether GenStage/Broadway would let me do something like this:

  1. Consume Event from RabbitMQ
  2. Fetch SubEvents from 3rd party service (preferably in parallel :slight_smile: )
  3. Process them independently (network operations will be involved)
    3.1. Retry the SubEvent processing stage if something went wrong, for example if a network error occurred.
  4. Acknowledge the Event as processed only if and when all SubEvents were successfully processed
  5. Prevent race conditions, for example when one of the SubEvents from a Create message is still being processed but an Update message has already arrived.

Thanks!

Hi there!

You could do the following:

  1. Have 2 queues. One for Events and One for Sub-events.
  2. When you consume from the Events queue, you create an Event record in the database (assuming you use SQL, but you can use any database) storing the Sub-events, their status and other relevant data. You need to store it in the database to keep track of its status and which sub-events have already been processed. After storing this data, publish one message for each sub-event to the sub-events queue.
  3. When you consume from the sub-events queue, you pull the data from the 3rd-party service and process them accordingly. You can have your retry logic here.
  4. Once a sub-event is successfully processed, you update its record in the database and set its status to ‘done’.
  5. For an Event to be ‘done’, all its sub-events must be ‘done’.
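
To make steps 4 and 5 concrete, here is a minimal sketch of the bookkeeping, kept as a plain in-memory map so it runs on its own. The module name `EventTracker`, its functions, and the `:pending` / `:done` statuses are all made up for illustration; in the approach above this state would live in database rows instead.

```elixir
defmodule EventTracker do
  @moduledoc "Hypothetical model of an Event and the status of each of its sub-events."

  # Build a tracking record with every sub-event initially :pending (step 2).
  def new(event_id, sub_event_ids) do
    %{id: event_id, sub_events: Map.new(sub_event_ids, &{&1, :pending})}
  end

  # Mark one sub-event as :done (step 4). Raises if the sub-event is unknown.
  def mark_done(event, sub_event_id) do
    %{event | sub_events: Map.replace!(event.sub_events, sub_event_id, :done)}
  end

  # An Event is done only when all of its sub-events are done (step 5).
  def done?(event) do
    Enum.all?(event.sub_events, fn {_id, status} -> status == :done end)
  end
end
```

Each sub-event consumer would call something like `mark_done/2` after a successful run and then check `done?/1` to decide whether the Event itself can be marked as finished.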

Does that approach make sense?

Regarding your questions:

  1. Consume Event from RabbitMQ
  • Yes, both can do that. Broadway was built specifically for events.
  2. Fetch SubEvents from 3rd party service (preferably in parallel :slight_smile: )
  • Yes, you can do this in parallel. Will this be via HTTP? Will you be fetching sub-events from multiple endpoints?
  3. Process them independently (network operations will be involved)
    3.1. Retry SubEvent processing stage if something went wrong, for example a network error occurred.
  • I think this may be logic you’d have to write yourself. You can take advantage of re-queueing in RabbitMQ.
  4. Acknowledge Event as processed only and when all SubEvents were successfully processed
  • Looks like you will need to keep track of an Event’s sub-events and their statuses. I would store this data in a database as I suggested above.
  5. Prevent race conditions, for example when one of the SubEvents from Create message is still being processed, but there an Update message is already here.
  • This is logic you will have to handle. You need to make sure the order of events is preserved and handle this case properly.
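
For points 2 to 4, a rough sketch of what "process in parallel, retry, and only succeed when everything succeeded" could look like with nothing but the standard library. `SubEventRunner`, `process_all/3`, and the `{:ok, _}` / `{:error, _}` return convention for the processing function are assumptions for the example, not anything Broadway or GenStage prescribes.

```elixir
defmodule SubEventRunner do
  # Hypothetical helper: runs `process_fun` for each sub-event in parallel,
  # retrying each one up to `max_attempts` times.
  # Assumes `process_fun` returns {:ok, result} or {:error, reason}.
  def process_all(sub_events, process_fun, max_attempts \\ 3) do
    results =
      sub_events
      |> Task.async_stream(&with_retry(process_fun, &1, max_attempts),
        max_concurrency: 10,
        # A task exceeding this timeout exits the caller (the default behaviour).
        timeout: 30_000
      )
      |> Enum.map(fn {:ok, result} -> result end)

    # The Event counts as processed only if every sub-event succeeded.
    if Enum.all?(results, &match?({:ok, _}, &1)), do: :ok, else: {:error, results}
  end

  # Retry immediately until it succeeds or the attempts run out.
  defp with_retry(fun, sub_event, attempts_left) do
    case fun.(sub_event) do
      {:ok, _} = ok -> ok
      {:error, _} when attempts_left > 1 -> with_retry(fun, sub_event, attempts_left - 1)
      {:error, _} = error -> error
    end
  end
end
```

In a Broadway pipeline, something like this could be called from `handle_message/3`: return the message untouched on `:ok` so it gets acknowledged, or mark it with `Broadway.Message.failed/2` on error so the producer's failure handling (for example re-queueing) kicks in. A real version would probably want a delay between retries; this one retries immediately.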

I hope my answers help.


Hi Gabriel,

Interestingly, the constraint is to do this without adding new queues or persistent storage.
It’s as straightforward as

  1. Read Event from RabbitMQ
  2. Read SubEvents for that event from database
  3. Process

This is definitely doable at the OTP level, but I was hoping to learn something new hehe ;))

There was no mention of those limitations in your first post. XD


I think a lot (if not all) of what @gjaldon suggested is still valid, because you could keep some minimal track of state in TermStorage. You wouldn’t need an additional substantial DB. Additionally, depending on the shape of things, using TermStorage might help prevent race conditions, based on how you cache the pending insert or update (i.e. you’ll be able to tell whether you need to requeue the update, or even override the insert with new data).
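
A small sketch of the kind of thing I mean, reading "TermStorage" as ETS (Erlang Term Storage); that reading is an assumption, as are the module name `EventCache`, the table name, and the `{event_id, :processing, pending_update}` row layout. The state is in-memory only, so it is lost on restart.

```elixir
defmodule EventCache do
  # Assumption: "TermStorage" is read here as ETS (Erlang Term Storage).
  @table :event_cache

  # The table is owned by the calling process and disappears when it exits.
  def init do
    :ets.new(@table, [:named_table, :public, :set])
    :ok
  end

  # Claim an event for processing. `:ets.insert_new/2` is atomic, so only one
  # caller wins; later messages for the same event get `:busy`.
  def claim(event_id) do
    if :ets.insert_new(@table, {event_id, :processing, nil}), do: :ok, else: :busy
  end

  # Remember the latest Update that arrived while the event was busy,
  # overriding any previously stashed one. Returns false if nothing is claimed.
  def stash_update(event_id, update) do
    :ets.update_element(@table, event_id, {3, update})
  end

  # Release the event and return the stashed update (or nil) to process next.
  def release(event_id) do
    case :ets.take(@table, event_id) do
      [{^event_id, :processing, pending}] -> pending
      [] -> nil
    end
  end
end
```

The idea is that the Create handler calls `claim/1` before touching the sub-events, an Update that gets `:busy` is stashed (or requeued instead), and whoever finishes the Create calls `release/1` and processes the stashed Update if there is one. There is still a small window between a failed `claim/1` and `stash_update/2` in which the event could be released, so a real version would need to handle `stash_update/2` returning `false`.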

This very much depends on the scope and shape of what you’re doing.