What is the Elixir/OTP way for long queues that in JVM world would live on RabbitMQ or Kafka?

artem · December 29, 2022, 4:07pm

Hi All

I am still learning Elixir in semi-hobby mode, so sorry if the question is stupid. Imagine an API that processes survey responses, possibly in computationally expensive way (maybe fashionable AI is being trained from them or something like this). When survey gets published to some popular facebook group or twitter nearly simultaneous submissions can easily overload a small server.

I come from JVM world and one of the usual ways there for surviving against connection overload is whenever possible to accept a request, put it to the queuing service such as RabbitMQ and then slowly (e.g. in 1-3 threads) pull them to process. Or if we are in a less serious service, then maybe a homemade queue in just the heap memory and/or persisted to database.

My understanding is that Elixir/OTP are so great regarding number of connections, that it is not really a concern. However, processing tons of feedback still needs CPU, RAM (especially is machine learning is involved), saving to database, etc.

What is the Elixir way for handling such situations?

Do we just post messages from every API call and some genserver slowly processes then one by one?
Do you use database persistence in such situation somehow (it wouldn’t be nice to drop queued survey results if service dies)? With some standard libraries?

P.S.
I realize that you can perfectly use RabbitMQ or Kafka from Elixir as well. Still certainly if I can have just one elixir box to do it for a small business SaaS, then it would be great not to add more nodes to the system.

derpycoder · December 29, 2022, 4:15pm

I have the following book in my backlog, and a general idea of how things can be done in Elixir:

I would tackle it by using GenStage. It has concept of Producer and Consumers, which helps with consumption of data in batched fashion. If the machine learning Consumer gets overwhelmed, it can ask for less work from Producer… (i.e. Backpressure)

If requirements get more complicated, then we can go for Broadway.

For persistence across restarts, I guess Oban will do the trick.

hauleth · December 29, 2022, 4:27pm

Have you checked what is the implementation language for RabbitMQ? The thing is that the problem is persistence in presence of restarts, not queuing itself.

yolo007wizard · December 29, 2022, 4:33pm

I’ve been looking at some Postgres Queues myself:

Queues then feed Genstage pipeline. I’m also hobby-learing so interested to see other ideas.

cjbottaro · December 30, 2022, 12:22am

For persistent queues, I’m becoming a big fan of NATS Jetstream. It has nothing to do with Elixir btw. Before that, I was using Faktory a lot (which is based on Redis).

Jetstream is distributed and highly available. You can create both queues and streams.

The protocol to talk to Jetstream is extremely simple. Most Jetstream concepts are simple, but the feature set is huge, despite its simplicity.

Jetstream also comes with an amazingly robust pub/sub system.

Jetstream is much more simple than Kafka, but the feature set is greater.

lucaong · December 30, 2022, 12:58pm

Hi @artem ,
I am mostly just repeating things that were already mentioned by other people in this thread, but I hope to offer some points on which to decide what to choose based on your specific needs.

As you mentioned, you can definitely use Kafka and/or RabbitMQ from Elixir. RabbitMQ is actually written in Erlang, and runs on the same VM as Elixir. That said, it very much depends on whether you only need scalable and concurrent execution of tasks, or if you need to persist your jobs on disk to recover from a restart of a worker instance without loosing messages.

If you do not need to persist tasks on disk, Elixir/OTP has already very good facilities for managing concurrent tasks, and to scale to a large number of concurrent requests, without reaching outside of the language ecosystem. That’s because OTP relies on exchanging messages asynchronously across lightweight processes. This is really one of the use cases that Erlang is built for. One thing to be aware is that GenServer will not automatically provide a back pressure mechanism: it’s possible to do that, but does not come by default. There are some libraries, such as the already mentioned gen_stage, that facilitate the implementation of pipelines to process tasks in parallel, with back pressure, throttling, and more.

If you instead want to persist those jobs to disk, to recover from a restart without loosing them, I’d consider something like RabbitMQ (with durable queues enabled) or like Oban (based on Postgres), depending on your specific needs.

artem · December 30, 2022, 1:14pm

Thanks, Luca!
I am going to process many small items of work (think customer feedbacks coming in via API or parsed from the uploaded excel file). I quicklooked at Oban and it sounds like it’s made for for quite big tasks (like processing whole large excel file in batches) rather than on large amount of small items.

But I’ll look closer again. As for persistence, that’s something to think about indeed. Certainly I’d love using it if it comes for free, but rarely things are free
So I’ll need to think how much I can live with just large batches or focus on small items perhaps arriving via API (if large excel is already imported, then I can have primitive persistence by just adding “processed” flag to it and reprocess whole file in case of failure)

kokolegorille · December 30, 2022, 1:35pm

In that case, GenStage or Broadway as mentionned by @derpycoder are good fit.

In the BEAM, processes are (almost) free. If You need to do some work, You can throw some processes to do it. But it’s more complicate to control them.

GenStage and Broadway have back pressure mecanism.

You can use ETS table (in memory) and Stream to optimize the pipeline.

yolo007wizard · December 30, 2022, 4:48pm

NATS looks cool, just did a quick read up on it. Wonder how hard it would be to get db persistance.