Kafka(_ex) with GenServer/GenStage stream processing

Hi guys.

A while ago I asked a lot of questions on the kafka_ex Slack channel. I still very much appreciate the help provided there and everyone's willingness to share their valuable experience.

Since I still think of myself as a newbie in Elixir, I hope I can manage to explain the dilemma that bothers me.

So let's say I have a 3-node Kafka cluster that is constantly receiving messages. We are using consumer groups, recently implemented (big thumbs up for that)… and the workers communicating with Kafka are implemented with the GenServer behaviour and fetch data out of Kafka at a defined interval. The solution we developed first had a design where this worker spawned, through a supervisor, as many micro workers as there were JSON messages to work on in that Kafka batch/fetch. It worked really nicely (I was really impressed). We also had some poolboy, Redis, etc. in there… After the work was done, the micro workers killed themselves.

NOW… we are thinking of going a slightly different way with the design. I am asking here because, if I am not mistaken, GenStage was described as not the most appropriate behaviour to use in combination with Kafka, though people implementing GenStage will probably say the opposite: that this was the whole purpose of it. Bear with me a little bit more… The new idea is that we would still use those Kafka workers as GenServers, but the next step would be, let's say, the first phase of a GenStage setup/configuration… so from the Kafka worker (GenServer) to a producer (GenStage), then this would send data to a producer-consumer, etc…

What do you think about this…? We would gain back-pressure, but… spawning those, let's say, 1000 micro workers was really working. I am still not sure whether I can go with this design and still follow best practice. What do you think best practice would be?

Won’t I get a bottleneck in this GenStage implementation? Sure, I will get a more “elastic” approach to fetching data… but still? Is “starting/spawning” something like 1000 GenStage “units” a valid approach? Would I get equal or better throughput with this GenStage approach vs. spawning as many micro workers as there are “JSONs” to work on?

Does the GenStage approach give me the same control over “what happens with the data” if I stop the node? How do you control what happens? Can you “drain” the GenStage flow… before stopping everything?

What do you guys think about this? I would really appreciate some help/suggestions here.

I used GenStage to implement a job processor on top of PostgreSQL listen/notify. GenStage made it very simple: a single Producer receives notification of new jobs, and dispatches according to demand. A ConsumerSupervisor subscribes and spawns a worker task for each job.

The tricky part for Kafka would be managing the consumer offsets. If you need at-least-once delivery, then each message may need to be acknowledged by the GenStage consumer, and the Kafka offset should only be updated once a complete batch of messages has been handled.
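To make the offset point a bit more concrete, here is a minimal sketch of the bookkeeping involved. It is plain Elixir and not part of kafka_ex or GenStage; the `OffsetTracker` module and its functions are hypothetical names made up for illustration. The idea is that consumers may acknowledge offsets out of order, but the tracker only advances the "safe to commit" offset across a contiguous run of acknowledged messages.

```elixir
defmodule OffsetTracker do
  # Hypothetical helper for illustration only (not a kafka_ex or GenStage API).
  # `committed` is the highest offset such that it and every earlier offset
  # have been acknowledged; `acked` holds acknowledged offsets beyond it.
  defstruct [:committed, :acked]

  # `last_committed` is the last offset already known to be fully handled
  # (-1 when nothing has been handled yet).
  def new(last_committed \\ -1) do
    %__MODULE__{committed: last_committed, acked: MapSet.new()}
  end

  # Record that `offset` was handled, then advance `committed` as far as
  # the contiguous run of acknowledged offsets allows.
  def ack(%__MODULE__{committed: committed} = tracker, offset) when offset > committed do
    advance(%{tracker | acked: MapSet.put(tracker.acked, offset)})
  end

  # Offsets at or below `committed` are duplicates and are ignored.
  def ack(%__MODULE__{} = tracker, _offset), do: tracker

  defp advance(%__MODULE__{committed: committed, acked: acked} = tracker) do
    next = committed + 1

    if MapSet.member?(acked, next) do
      advance(%{tracker | committed: next, acked: MapSet.delete(acked, next)})
    else
      tracker
    end
  end
end

# Offsets 43 and 44 finish first, but 42 is still in flight,
# so nothing new can be committed yet.
tracker = OffsetTracker.new(41) |> OffsetTracker.ack(43) |> OffsetTracker.ack(44)
IO.inspect(tracker.committed)

# Once 42 is acknowledged, the committed offset jumps past the whole run.
tracker = OffsetTracker.ack(tracker, 42)
IO.inspect(tracker.committed)
```

If the node crashes before a commit, everything after `committed` is fetched again on restart, which is what gives at-least-once rather than exactly-once behaviour. Whether the value sent to Kafka is this offset or the one after it depends on the client's convention, so that part is left out here.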

Another tricky bit is that GenStage works on message counts, but a Kafka fetch returns an indeterminate number of messages (whatever fits in the buffer you allocate/specify).
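One common way to bridge that mismatch is to keep both the unmet demand and the surplus messages in the producer's state. The sketch below shows only that bookkeeping in plain Elixir; `DemandBuffer` and its functions are hypothetical names, and the wiring into an actual GenStage producer is assumed rather than shown (there, the returned list would be what `handle_demand/2` emits in its `{:noreply, events, state}` reply).

```elixir
defmodule DemandBuffer do
  # Hypothetical pure-data helper for illustration; a GenStage producer
  # could hold a struct like this as its state.
  defstruct [:queue, :pending]

  def new, do: %__MODULE__{queue: :queue.new(), pending: 0}

  # Consumers asked for `n` more events. Returns {events_to_emit, buffer}.
  def add_demand(%__MODULE__{} = buffer, n) when is_integer(n) and n > 0 do
    take(%{buffer | pending: buffer.pending + n})
  end

  # A fetch returned however many messages happened to fit in the fetch
  # buffer. Returns {events_to_emit, buffer}.
  def add_events(%__MODULE__{} = buffer, events) when is_list(events) do
    queue = Enum.reduce(events, buffer.queue, fn event, q -> :queue.in(event, q) end)
    take(%{buffer | queue: queue})
  end

  # Emit as many queued events as the outstanding demand allows;
  # keep the rest queued and remember any demand that is still unmet.
  defp take(%__MODULE__{queue: queue, pending: pending} = buffer) do
    count = min(pending, :queue.len(queue))
    {out, rest} = :queue.split(count, queue)
    {:queue.to_list(out), %{buffer | queue: rest, pending: pending - count}}
  end
end

buffer = DemandBuffer.new()

# Demand arrives before any data: nothing to emit, demand is remembered.
{[], buffer} = DemandBuffer.add_demand(buffer, 2)

# A fetch returns five messages: only two are emitted, three stay queued.
{[:a, :b], buffer} = DemandBuffer.add_events(buffer, [:a, :b, :c, :d, :e])

# More demand drains the queue; the unmet remainder stays pending.
{[:c, :d, :e], buffer} = DemandBuffer.add_demand(buffer, 10)
IO.inspect(buffer.pending)
```

In a real worker you would probably also stop fetching while the queue is non-empty, otherwise the buffer can grow without bound and the back-pressure is lost.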

It would be cool to build something lightweight to use in place of Kafka, maybe even without all the replication: just the log part, which would integrate well with GenStage and could absorb spikes, etc.

I’m waiting for Redis streams to land, it will hopefully be a good alternative to kafka for many cases.

I think the ideal solution would be just a simple Elixir dependency that writes the log and exposes a log reader as a GenStage producer, so you can stick it into your GenStage pipeline to absorb spikes and provide temporary persistence.

Hi Tomaz & All,

I’m a newbie with kafka_ex and I’m using the same kind of worker process pool to handle a fetched batch of messages from the broker in parallel. So I’m curious to know what your next step was, or which best practice helped you do better. Thanks in advance.