What Kafka client do you recommend?

tomekowal · May 31, 2022, 7:18am

We’d like to start using RedPanda at work and I am hunting for a Kafka client library.
There are three options I’ve seen so far:

kafka_ex. It looks like it is in a transition period between 0.x and 1.x. There is KafkaEx.New.KafkaExAPI.
brod. Made by Klarna, looks stable.
kaffe. A wrapper around brod that claims in the README to be experimental.

I don’t mind using experimental or unstable code. I could potentially contribute but is there someone who used at least two of them and can compare? Do they differ in architecture? Is one of them easier to use? Do they make different trade-offs?

While searching the forum, I saw @keathley shared a lot about Kafka usage. Thanks for the slides from your presentation about it! Is it available somewhere as a video? I hope you don’t mind the ping.

keathley · May 31, 2022, 12:07pm

I prefer to use Brod, but Brod is kinda weird. I ended up writing my own wrapper at B/R that never got open sourced. You want to avoid a lot of the wrappers that use GenStage because they don’t handle partitions correctly and will allow messages to be processed out of order. Obviously, this only matters if you rely on Kafka’s partition ordering (we did). But overall Brod will do what you need and have the fewest surprises. Build your own wrapper once you know what features or improvements you need for your use case.
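
To illustrate the ordering point above, here is a minimal, language-agnostic sketch in Python (not Elixir, and not Brod’s or GenStage’s API). It assumes messages arrive as hypothetical `(partition, offset, value)` tuples and gives each partition its own FIFO queue with exactly one worker. Offsets within a partition are then handled strictly in order, while separate partitions still run concurrently. A shared worker pool without this routing gives no such guarantee. The function name `process_in_partition_order` is made up for this example.

```python
import queue
import threading
from collections import defaultdict


def process_in_partition_order(messages, handler):
    """Run handler on every message using one worker thread per partition.

    messages: iterable of (partition, offset, value) tuples in fetch order.
    handler:  callable(partition, offset, value); must be thread-safe.
    """
    # One FIFO queue per partition keeps that partition's fetch order.
    queues = defaultdict(queue.Queue)
    for partition, offset, value in messages:
        queues[partition].put((offset, value))

    def drain(partition, q):
        # A single worker per queue means no two messages from the same
        # partition are ever processed at the same time.
        while True:
            try:
                offset, value = q.get_nowait()
            except queue.Empty:
                return
            handler(partition, offset, value)

    workers = [
        threading.Thread(target=drain, args=(partition, q))
        for partition, q in queues.items()
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

On the BEAM the same idea would typically be one process per partition, but that mapping is an assumption here, not something taken from any particular wrapper.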

dimitarvp · February 26, 2024, 9:33pm

Sorry for necro but IMO this is relevant still.

I am open to trying spreedly/kaffe and will report what I end up with. :brod seems to be stable and mature but is not easy to navigate. While I recognize that’s not a strictly good argument between techies, I also don’t want to hunt for mysterious errors or, much worse, silent failure modes.

anuaralfetahe · February 26, 2024, 9:40pm

I’ve been using kafka_ex and it works well for my use case.

It feels simple enough and I can build my domain logic around it as needed.

I think the other options you mentioned are solid libraries too.

dimitarvp · February 26, 2024, 10:00pm

Ah, kafka_ex is indeed another contender. I’ll take a quick look there as well. Though my potential problem with it is that it’s still in transition to v1.0 and you have to use KafkaEx.New which I am not sure I like – but it is 99% likely to be a minor issue.

I really need only consuming and I want the minimum amount of boilerplate. That’s going to be my main criterion.

anuaralfetahe · February 26, 2024, 10:37pm

I found kafka_ex to be a bit more ‘low level’ compared to other libraries. This meant that I had to write more code, particularly the boilerplate code for the processes. However, it provided me with more control over what is happening, which I needed.

It’s definitely worth checking out if you encounter issues with other libraries.

dimitarvp · February 26, 2024, 10:43pm

Hmmmm, thinking about it further, my most pressing issue is in fact being able to mock Kafka… Sigh, I can’t find a library that does this well. I found two that hook to kafka_ex but… Never mind, indeed let me fight with this for a bit and see where I arrive at.

lud · February 27, 2024, 1:07am

I really need only consuming and I want the minimum amount of boilerplate. That’s going to be my main criterion.

I don’t have the code anymore so I’ll only go from my bad memory, but I used spreedly/kaffe for a year and it was a solid basis to set up the project.

For consuming, if you need to customize the listener architecture you will need to understand brod. And you will need to understand brod and kpro etc. to debug some unpleasant errors.

Before leaving the project I was seriously considering replacing that whole stack with erlkaf and made a quite convincing replacement PoC in a few hours. There is not much boilerplate either but I had problems with dependencies, hopefully fixed by now.

dimitarvp · February 27, 2024, 1:17am

I’ve set it up in a new project fairly quickly as well and so far my tests have been convincing (though let’s see how it will act in production…).

Do you remember how you worked around that?

I will avoid any C dependencies if I can help it.

lud · February 27, 2024, 1:29am

Do you remember how you worked around that?

We did not. It was not a frequent error. Just that brod is slow to start and to set up the consumer group. Or we did something wrong, but erlkaf booted way faster.

For the mocking part I put a private API in front of the kafka libraries to abstract the consuming/producing API and then I implemented and maintained a simple pubsub system mimicking the behaviours of Kafka. And some test helpers on top of that.

This mock was only for tests but it had its own unit tests just to be sure. You need to implement a lot of things. Even if you want to only consume, you will need to produce to your mock from the test.

Anyway, a lot of work, not recommended…
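
For readers weighing that approach, the shape of such an abstraction can be sketched roughly as follows. This is an illustration in Python, not the code described above and not any Kafka library’s API: `MessageBus` and `InMemoryBus` are hypothetical names, and the fake is assumed to replay the existing log to a new subscriber, similar to a consumer that starts from the earliest offset.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class MessageBus(ABC):
    """Private API the application codes against (hypothetical)."""

    @abstractmethod
    def publish(self, topic, key, value):
        """Append a message and return its offset."""

    @abstractmethod
    def subscribe(self, topic, handler):
        """Call handler(offset, key, value) for every message on topic."""


class InMemoryBus(MessageBus):
    """Test-only fake; a production adapter would wrap the real client."""

    def __init__(self):
        self._log = defaultdict(list)       # topic -> [(key, value), ...]
        self._handlers = defaultdict(list)  # topic -> [handler, ...]

    def publish(self, topic, key, value):
        self._log[topic].append((key, value))
        offset = len(self._log[topic]) - 1
        # Deliver synchronously so tests need no sleeps or polling.
        for handler in self._handlers[topic]:
            handler(offset, key, value)
        return offset

    def subscribe(self, topic, handler):
        # Assumption: replay what is already in the log, then go live.
        for offset, (key, value) in enumerate(self._log[topic]):
            handler(offset, key, value)
        self._handlers[topic].append(handler)
```

A test for a consume-only service would subscribe the service’s handler to an `InMemoryBus`, call `publish` itself, and assert on what the handler did, which is why the fake needs a producing side even when the application never produces.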

dimitarvp · February 27, 2024, 1:40am

Yep, that’s why I would prefer a library that comes with a Kafka mock out of the box, but oh well. At my current employer, colleagues maintain a thin wrapper on top of :brod, but it only supports producing and I have no free time to add consuming to it. Not yet, anyway; Soon™ I should have it.

Ah well, I guess I can just ignore it then. I am not keen on the idea of losing data, but that shouldn’t happen anyway: the Kafka topic will be filled by other apps, and the consumer group I’ll use will remember the offset, so technically nothing should be lost.
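
The offset reasoning above can be shown with a toy model. This Python sketch is not a Kafka client; `TopicLog` is a hypothetical stand-in for a single partition that stores, per consumer group, the next offset to read. Messages appended while the consumer is down are still returned on the next fetch. The sketch ignores retention limits, which in a real cluster can expire both old messages and committed offsets.

```python
class TopicLog:
    """Append-only log standing in for one Kafka partition (illustrative)."""

    def __init__(self):
        self.records = []
        self.committed = {}  # group id -> next offset to read

    def append(self, value):
        self.records.append(value)

    def fetch(self, group):
        # Resume from the group's committed position, or from 0 if none.
        start = self.committed.get(group, 0)
        return list(enumerate(self.records[start:], start=start))

    def commit(self, group, offset):
        # Store the offset AFTER the last processed message.
        self.committed[group] = offset + 1


log = TopicLog()
log.append("a")
log.append("b")

# Consumer processes both messages and commits the last offset it handled.
for offset, _value in log.fetch("my-group"):
    log.commit("my-group", offset)

# Consumer is down; other apps keep producing.
log.append("c")
log.append("d")

# After a restart, the group picks up where it left off.
resumed = log.fetch("my-group")
```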

lud · February 27, 2024, 1:11pm

Regarding the C library, I know the risks, but it’s based on the official client and it’s working well. I tend to prefer pure BEAM implementations too, but in this case it’s the most up-to-date client you can get.

egze · February 27, 2024, 1:29pm

For what it’s worth - we use :kaffe at work and it’s OK.