Genstage with sanity check

Hi!

Just wanted to ask your opinion on near-real-time ingestion with GenStage plus a sanity check. Currently I have the following setup:

  P (gets data from db) -> PC (transforms to correct format) -> C (push up to db)

I’ve noticed that some rows can go missing. I think those rows get committed to the source db with a delay.

I need to run a post check daily or hourly to fill in the missing rows (and maybe even catch dups???)
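To make the post check concrete, here is a rough sketch of the comparison step only. It assumes each row has some id that can be listed for both the source and the destination; the `PostCheck` module and its function names are made up for illustration and not part of GenStage.

```elixir
defmodule PostCheck do
  # Hypothetical helper: the id lists stand in for whatever key identifies a row.

  # Ids present in the source but absent from the destination.
  def missing(source_ids, dest_ids) do
    dest = MapSet.new(dest_ids)

    source_ids
    |> Enum.uniq()
    |> Enum.reject(&MapSet.member?(dest, &1))
  end

  # Ids that appear more than once in the destination, sorted.
  def duplicates(dest_ids) do
    dupes = for {id, count} <- Enum.frequencies(dest_ids), count > 1, do: id
    Enum.sort(dupes)
  end
end

# Example: row 3 never arrived, row 2 was written twice.
IO.inspect(PostCheck.missing([1, 2, 3], [1, 2, 2]), label: "missing")
IO.inspect(PostCheck.duplicates([1, 2, 2]), label: "duplicates")
```

The missing ids would then be re-queried from the source and pushed through the same transform and write code as the regular run.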

Someone suggested earlier to use a PC for rolling up to hourly, but now that I think about it I can’t do this, since I need another producer to do this post check thing and query the db again.

Or should I just run another set of GenStage stages that runs hourly alongside the one that runs every minute? Not sure if this is the best way. Maybe I just need a separate hourly producer that does the post check and have the same pc and c that subscribe to the minute producer subscribe to it too?

P -> pc -> c

Then I thought maybe I can use Flow but I’m not sure where to put it either. Now I’m just confused. =p

Thanks again in advance!

Most of the time you don’t need a producer consumer, unless you need to do load rebalancing.

Just have a producer and do the work of transforming and pushing to the DB in the consumer. It is fine for the consumer to do a bunch of stuff; you can break that work into modules and functions. Basically, don’t use stages for code organization purposes but to model runtime properties.
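A minimal sketch of that shape, assuming the gen_stage package (installed here with `Mix.install` so the script runs on its own). `Ingest.Rows`, `Ingest.Producer`, `Ingest.Consumer` and the sample rows are hypothetical names; the producer serves a fixed list instead of querying a db, and `push/1` only prints instead of writing anywhere.

```elixir
Mix.install([{:gen_stage, "~> 1.2"}])

defmodule Ingest.Rows do
  # Plain functions hold the actual work; the stages only move data around.
  # Placeholder transform: upcases the :name field.
  def transform(row), do: Map.update!(row, :name, &String.upcase/1)

  # Placeholder for the DB write.
  def push(rows), do: IO.inspect(rows, label: "would write")
end

defmodule Ingest.Producer do
  use GenStage

  def start_link(rows), do: GenStage.start_link(__MODULE__, rows)

  @impl true
  def init(rows), do: {:producer, rows}

  # Hands out up to `demand` rows from the list. A real producer would
  # query the source db and keep track of demand it could not yet serve.
  @impl true
  def handle_demand(demand, rows) when demand > 0 do
    {events, rest} = Enum.split(rows, demand)
    {:noreply, events, rest}
  end
end

defmodule Ingest.Consumer do
  use GenStage

  def start_link(opts), do: GenStage.start_link(__MODULE__, opts)

  @impl true
  def init(opts), do: {:consumer, opts}

  # Transform and push happen in the same stage, via the shared module.
  @impl true
  def handle_events(rows, _from, state) do
    rows |> Enum.map(&Ingest.Rows.transform/1) |> Ingest.Rows.push()
    {:noreply, [], state}
  end
end

{:ok, producer} = Ingest.Producer.start_link([%{id: 1, name: "a"}, %{id: 2, name: "b"}])
{:ok, consumer} = Ingest.Consumer.start_link([])
{:ok, _tag} = GenStage.sync_subscribe(consumer, to: producer, max_demand: 10)

# Give the consumer a moment to handle the events before the script exits.
Process.sleep(200)
```

There is no producer consumer in between: the transform is just a function call inside the consumer.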

If the hourly thing can be a separate pipeline, then do it because it is simpler. Use modules and functions to share the code.
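One way the sharing could look, as a sketch only: both pipelines call the same module and differ just in the time window they query. The `Ingest.Shared` module, the `:name` field and the exact window sizes (last 60 seconds, last 3600 seconds) are assumptions for illustration.

```elixir
defmodule Ingest.Shared do
  # Hypothetical module used by both the minute and the hourly pipeline.

  # Time window {from, to} of source rows to query, ending at `now`.
  def window(:minute, now), do: {DateTime.add(now, -60, :second), now}
  def window(:hourly, now), do: {DateTime.add(now, -3600, :second), now}

  # Placeholder transform shared by both pipelines.
  def transform(row), do: Map.update!(row, :name, &String.upcase/1)
end

now = DateTime.utc_now()
{minute_from, _to} = Ingest.Shared.window(:minute, now)
{hourly_from, _to} = Ingest.Shared.window(:hourly, now)
IO.inspect({minute_from, hourly_from}, label: "window starts")
```

Each pipeline's producer would use its own window for the query, and each consumer would call the same transform and write functions.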

Thanks for the help =) I honestly hadn’t thought of a PC as something to use for load balancing and runtime properties. Most of the examples I’ve seen seem to do some transformation in that stage. It’s good to learn this so I can use it properly. =)

Thanks! I will put the hourly thing on a separate pipeline that uses the shared code from the minute pipeline. =)

I was also trying to figure out Flow + GenStage and I couldn’t find any good resources.

Thanks so much for building Elixir! I still have so much to learn but I :heart_eyes: it =)

I was thinking, since it’s a post check, maybe I should have it done after the minute consumer?

p -> c (becomes producer) -> c (for post check)

Is this even possible?

P.S. I really liked the Elixir videos. I think I watch them more than Netflix. haha sad but true