What would it take to create Kafka like solution in Elixir?

Just went through this great discussion thread on Big Data with Elixir and looking at this diagram I couldn’t help but ask can Broadway be a replacement for Kafka ? But on further research it turns out Broadway is a way of creating efficient data processing pipelines. And as such it sits more downstream on the consumer side of the equation. Though one can see a consumer further producing something which then is picked up by another data processing pipelines.

There have been attempts like ErlBus, EventBus and Phoenix’s own PubSub which do what Kafka does (except fault tolerant persistence among others). ErlBus supports distributed PubSub architecture.

The above being the background/context. My question is why do the various message bus stories in BEAM land don’t even come close to Kafka and what would it take to create Kafka competitor in Elixir? Is there some limitation in BEAM that does not easily lend itself to Kafkaesque architecture?

Would love to hear your thoughts/insights?

4 Likes

Hello, I’m not sure if it is useful but maybe this post can offer some insights as well:

https://elixirforum.com/t/can-we-beat-kafka-if-we-build-it-in-elixir

Probably nobody thinks it’s worth duplicating effort. Erlang’s BEAM is quite well suited for append-only logs like Kafka’s otherwise.

1 Like

RabbitMQ

1 Like

RabbitMQ is just a messaging queue. There is difference between the two and they cater to different use cases

2 Likes

I’ve played around with prototypes for Kafka like messaging systems in Elixir. It’s definitely possible and I believe it to be a simpler implementation than its JVM counterpart.

Main difference I’d imagine to be CPU-bound workloads, where’d we’d likely fall short.

However, a lot of message processing workloads tends more towards the IO side of things anyway, for which a BEAM implementation would fit nicely.

3 Likes

Yeah that’s what my intuition say too.

Yes whole problem domain seems IO bound and BEAM would be a good fit. But what I don’t understand is why is the persistence story, in all these attempts ErlBus , EventBus and Phoenix’s own PubSub, is so weak. In fact because of this incompleteness many CTO/Architects won’t consider Elixir/Erlang based solutions and default to Kafka system.

2 Likes

I’d look into using Mnesia backed by a disk-based storage engine such as levelDB, rocksDB or possibly even DynamoDB for the event log.

2 Likes

Phoenix’s PubSub does not do what kafka does.

I had need for a lightweight topic based log system and we created https://github.com/erleans/vonnegut/

It is compatible with kafka on disk and on the wire but uses chain replication instead of Kafka’s ISR based replication. It is not ready as a kafka replacement, There is plenty to still do around membership and concensus but the internals are there. Figured I’d mention it in case anyone wanted to work on such a thing and it could be a useful base layer :slight_smile:

10 Likes

@tristan thanks for sharing your contribution I will give it a try. So apart from membership and consensus is there a list of TODO items or roadmap that will nudge this to production ready release?

1 Like

That begs the question, why not just use CouchDB for event logs? seems like its good alternative to Kafka

1 Like

Persistence is hard. After all it is the ultimate expression of state.

Unrelated to the question of persistence, I did implement this which might be of interest to anyone who wants to slowly evolve kafka-like extensions to Phoenix pubsub:, and it’s been running in prod smoothly for several months now. https://hex.pm/packages/pony_express

4 Likes

I personally think it makes a lot of sense to implement lighter versions of such 3rd party products as BEAM libraries, with the goal of simplifying the operations and reducing the amount of moving parts. There are already many examples where we can opt for BEAM libraries instead of external tools & products, such as nginx, cron, or redis. The alternatives from the BEAM ecosystem don’t necessarily match these tools in terms of features or performance, but in many cases they can work just fine, and help simplifying the system architecture.

I’d like to see the ecosystem growing further in this area. For example, I’d love to see a relational database as a BEAM library. Something I can add as a lib dependency, start an instance (or multiple instances) somewhere in a supervision tree, and have SQL based persistence without needing to manage a separate database instance, roles, handle language ↔ db type mapping etc. When I occasionally mention this during my talks, I get some skeptical feedback along the lines of “Why would you want to reimplement databases such as PostgreSQL, MySQL, etc.?”. The point is not to reimplement or compete with established databases, but to have a lightweight alternative which would be more fitting in simpler scenarios.

Ideally, if I want to build a small to medium web-facing CRUD, I should be able to start using nothing but Elixir, and get the basic skeleton working within 15 minutes or so, with everything implemented in a single language (say Erlang and Elixir), inside a single project, running as a single OS process per each node in the cluster. As long as we’re not able to do this, I think there’s a lot of potential for improvements in our ecosystem, and implementing alternatives to established products makes a lot of sense :slight_smile:

29 Likes

I completely agree and that’s why I resumed my efforts to make an Elixir port of Rust’s sqlite library – not the same thing I know, but sqlite is embedded in-process and is rock-solid and never crashes so, close enough.

To clarify, my comment wasn’t that it’s not worth it – I believe it’s very, VERY much worth it. As you said, less moving parts == sanity and predictability. I believe the .exe monoliths with zero external depedencies need to make a comeback and stay for good this time!

My comment was more aimed at an observation of mine that people usually go as you said: “why would you try and duplicate something that already exists and is battle-tested?”. My standard answer for the last 5 or so years is: “because they introduce complexity we wouldn’t otherwise have to handle if everything was in-process”.


EDIT: I also completely agree that OTP translates itself to the idea of a majestic monolith perfectly. Supervision trees would make external dependency tracking much easier.

6 Likes

There isn’t. The project and company we created this for is dead and gone. The easy parts are what is done :). I’d like to work on it again some day for fun since it will be interesting to work on the hard parts, but I don’t know when that’d be. But maybe I’ll start with doing docs and rethinking some design decisions while documenting what needs to be done.

1 Like

Extending this tangent a bit further. I couldn’t help but ponder on implementing light weight Kubernetes (control plane) in Elixir as from n00bish mind it seems a well designed GenServer + Supervision trees could do what K8s does in terms Nodes sync & Pods management. From what I see BEAM is already an OS for your code so why not an OS for your applications (microservices) ?

4 Likes

Omg, I have thinking about this all week. In my mind Ideally it would present itself with graphql as it’s query language, skip the SQL, and store as Apache arrow columns under the hood. Fiddly stuff like parsing and distribution would be done in BEAM, high performance stuff would happen in nifs… Since I wrote zigler and all of I’d have half a mind to use zig to JIT compile “best of” queries on the fly, and remount them into BEAM modules.

Sadly my day job is basically “reimplementing vSphere in beam” so I have very little time to work on a database (if anything I’d be writing a FUSE filesystem first). One can dream…

2 Likes

I’m not really familiar with k8s, but I have a hunch that some form of it could be implemented in beam, and exposed as a library. I think this would require having something similar to etcd, which would in turn require a solid raft library. ra seems like a promising option here.

This is pretty much my thinking too! It’s worth noting that there are things which are simpler to do at the OS level. For example, partial deploy (update only one part of your system) is much simpler with std. microservice (just deploy and restart one microservice). Technically, this could be done with beam too, but it’s more tedious. Also, at the OS level, you can have two different microservices use different versions of the same dependency, which won’t work in beam.

But either way, I think that we could get solid approximations of these heavyweight products which would work just fine in simpler situations, and it’s worth keeping in mind that there are many companies that operate on a relatively small scale and deal with much less complexity than Twitter, Facebook, Netflix & co.

5 Likes

We are 3 developers having less than 1000 users. I’m tired of “this is how Netflix does it” or “Google uses” etc.

As far as I experience it there are few software talks, courses and systems that focus on doing stuff at small scale yet still professional.

I like these E languages not because I have to solve 2 million concurrent users but because I’m hacking away trying to reap the benefits of the microservices style within my application.

7 Likes

I’d love to see development of lower level, infrastructure like open source projects in elixir. I have free time and lots of interest, though my skill level in elixir is still kind of green.

Here are some ideas:

  • better storage engine to make mnesia mainstream. Instead of rocksDB I’d suggest to use lmdb, the code size is so much smaller and NIF friendly
  • use postgres jsonb as underlying storage and write a distributed layer in elixir on top to present a redundant and distributed K/V store, sort of like riak or couchdb
  • use one of the storage options from above, combined with some NIF friendly rust based search engine library (there are several on crates.io) and present something like elasticsearch
2 Likes