Do we still need Redis if Elixir has things like supervisors, workers, and async tasks?

vernomcrp · August 4, 2016, 5:28am

I used to rely on Redis, Celery, and a message queue for long-running processes. What is the benefit of using Redis or a message queue in Elixir?

Qqwy · August 4, 2016, 5:50am

First and foremost, Redis, RabbitMQ, etc. are made to interface with many different languages.

Elixir's (and Erlang's) built-in data stores (such as ETS) and message queues (such as the new GenStage) do not typically have a gateway to the outside world. As long as everything you build is on the BEAM (made with Erlang, Elixir or other languages on the Erlang Virtual Machine), this is not a problem, and it will really simplify your stack and increase maintainability and portability (you can easily create an ‘all-inclusive’ distribution of your app).
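As a rough illustration of ETS standing in for simple key/value caching, here is a minimal sketch. The table name `:session_cache` and the keys are made up for the example, and note that an ETS table lives in memory on a single node and disappears when its owning process exits.

```elixir
# Create a named, public ETS table owned by the current process.
# (:session_cache is a hypothetical name chosen for this sketch.)
table = :ets.new(:session_cache, [:set, :public, :named_table])

# Insert and read back a key, roughly like SET/GET in Redis.
true = :ets.insert(table, {"user:42", %{name: "Ada"}})
[{"user:42", cached}] = :ets.lookup(table, "user:42")
IO.inspect(cached)

# A missing key returns an empty list rather than nil.
missing = :ets.lookup(table, "user:99")
IO.inspect(missing)
```

Only processes on the BEAM can reach this table, which is exactly the "no gateway to the outside world" trade-off described above.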

The main reason to use Redis, RabbitMQ et al. is to interface with environments that cannot communicate in this way; the advantage is interoperability with other languages. Another important reason why Redis is still used in Elixir projects is that Heroku, where many people deploy their applications, does not (as of now) allow Elixir applications to talk to each other in the BEAM style.

Interestingly, both RabbitMQ and Riak, a common alternative to Redis, are built using Erlang.

nerdyworm · August 4, 2016, 4:01pm

It really comes down to the requirements of your system.

Durability
If your machines are unplugged and the disks are destroyed, is it OK to lose all the jobs that were in the queue or currently running? If not, you need to use something like redis/rabbit/sqs.

Distribution
Do you need to be able to split the load across multiple machines? If so, then it is much easier to use a system like redis/rabbit/sqs. You can do this yourself, but I’d recommend against it because it’s not a trivial problem to solve yourself. (I’d love to be proved wrong on this.)

If you are building something super small and it just needs to be good enough, then please do use elixir’s facilities. However, if the above two issues matter then you should look into adding a real message queue to the system.

Cheers,
Ben

Qqwy · August 4, 2016, 4:19pm

@nerdyworm I find your arguments a little flawed; here is why:

Both Persistence (what you call durability) and Distribution can be managed without problems from within the BEAM. In fact, the BEAM was built exactly for this purpose: To have multiple systems communicate and divide the work, and to keep the service running even in case of a hardware failure.

This has been used by Ericsson to create a phone-handling server with 99.9999999% (nine nines) uptime. It has also been used by WhatsApp to create a messaging application with one billion users that is maintained by just 50 developers.

nerdyworm · August 4, 2016, 7:31pm

@Qqwy Like I said, it does depend on your system’s requirements and use cases.

However, I’d love to see a real example of a roll-your-own message queue that can deal with machine failures, net splits, and data loss. The only projects I can find that handle these things well are rabbit, sqs, disque, etc.

Please, please, please. I’m begging you. If this is actually trivial to implement as a production-grade system in erlang/elixir I want to see it.

I really hope to one day change my opinion on this matter.

Thanks for the response!

Cheers,
Ben

P.S. I skipped the benefits of using such systems. You don’t need to deal with implementing any of the stuff that I’m saying is nontrivial… you get that for “free”. That is all I’m trying to convey.

Qqwy · August 4, 2016, 7:57pm

@nerdyworm Thank you for your reply!

Yes, you’re very right. It depends on your project’s requirements. Of course there still are things that you cannot do (or can only do badly) using Elixir/Erlang. A good example would be math-heavy calculations on large chunks of (big) data: while Elixir is very good (and arguably better than the alternatives) at distributing and parallelizing work, performing math is slower than in low-level (imperative) languages that can re-use the same memory location.

Another problem Elixir faces right now is that there are many outside resources that only expose bindings to other languages. To interface with these resources, either a new interface needs to be made, or you have to work with an intermediary program in another language, and interface with that.

As for the subject of message queues, RabbitMQ itself is actually built using Erlang, and will (as far as I know – I am not an expert and haven’t used RabbitMQ in production) happily work alongside as well as inside the BEAM.

And in the cases where you only need to connect nodes that all run Elixir/Erlang, you can just use the built-in OTP tools to do the job: message passing in Elixir/Erlang works transparently between nodes; it works exactly the same regardless of whether two processes are on your local computer or one of them is someplace else. You can completely configure what should happen in case the connection with another node is lost: failover, takeover, etc. As OTP is part of the standard library of Elixir/Erlang, I would consider it ‘built in’, rather than ‘trivial to build’.
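To make the message-passing point concrete, here is a minimal single-node sketch. The `{:job, from, payload}` message shape and the uppercase "work" are invented for the illustration. On a connected cluster the worker could be started with `Node.spawn/2` on another node instead of `spawn/1`, and the `send`/`receive` code would stay the same.

```elixir
# Spawn a worker that handles one job and replies to the sender.
worker =
  spawn(fn ->
    receive do
      {:job, from, payload} -> send(from, {:done, String.upcase(payload)})
    end
  end)

# Send the job; this call looks identical for a local or a remote pid.
send(worker, {:job, self(), "resize image"})

# Wait for the reply, giving up after one second.
result =
  receive do
    {:done, value} -> value
  after
    1_000 -> :timeout
  end

IO.inspect(result)
```

This shows only the transparency of sending messages; it does nothing about durability or net splits, which is the harder part discussed earlier in the thread.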

The best sources for more information about these built-in tools in OTP are probably the book Learn You Some Erlang for Great Good!, which can be read for free online, and the book Elixir in Action, which is a very nice and comprehensive guide specific to Elixir (but cannot be read for free).

Relying on external systems for things you don’t want to build yourself is great! However, relying too much on work made by others has drawbacks:

It makes it very hard to reason about what is going on. The main problem is that these tools are often written in their own (domain-specific) language, or at least have some very specific settings that are hard to grasp if you don’t know exactly what they do.
It becomes harder to deploy/relocate your application, because you have to move, install and configure many separate parts, each with their own caveats.
You need to place trust for the correctness of your application in third parties. In some cases this becomes a problem.

Sola dosis facit venenum. Using external dependencies isn’t ‘bad’, but you can definitely overdo it. (I have seen too many applications that have…) Often, adding a dependency is a lot easier than removing one.

Thank you very much for your reply!

~W-M/Qqwy

OvermindDL1 · August 4, 2016, 8:27pm

If all you need Redis for is pubsub messages, Phoenix already has a fantastic pubsub system built in. That is what many people use Redis for, I’ve noticed: just pubsub to send notifications, emails, etc.
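For a feel of in-BEAM pubsub without pulling in Phoenix, here is a small sketch using Elixir's standard-library `Registry` (available since Elixir 1.4) as a local stand-in. This is not the Phoenix PubSub implementation, and the registry name `DemoPubSub`, the topic, and the message are made up for the example.

```elixir
# Start a registry that allows many processes per key (i.e. per topic).
{:ok, _pid} = Registry.start_link(keys: :duplicate, name: DemoPubSub)

# Subscribe the current process to a topic.
{:ok, _owner} = Registry.register(DemoPubSub, "notifications", [])

# Broadcast: send a message to every process registered under the topic.
Registry.dispatch(DemoPubSub, "notifications", fn subscribers ->
  for {pid, _value} <- subscribers, do: send(pid, {:notification, "welcome email queued"})
end)

# The subscriber (here, the same process) receives the broadcast.
received =
  receive do
    {:notification, text} -> text
  after
    1_000 -> :timeout
  end

IO.inspect(received)
```

A `Registry` is local to one node, so this covers the single-node case only; spreading topics across a cluster is the part a dedicated pubsub layer adds.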

mkunikow · August 4, 2016, 11:03pm

Question: what will happen if the rate of incoming messages is much greater than the rate at which you can process them?
How big a buffer can you have in the BEAM? Can you persist the queue on disk (for example, in case you kill the machine where the queue lives)?

Question: what if for some reason you need to reprocess messages? Let’s assume you have a bug in your code. You apply a fix and want to reprocess some messages.

For example, in Kafka you can configure it to retain messages for x time before they are deleted.

dom · August 4, 2016, 11:52pm

I wish this nine nines figure weren’t quoted so much - it’s not representative of actual usage in the field.

See e.g.: http://stackoverflow.com/a/26447543

sergio · August 5, 2016, 1:54am

I actually ran into this exact same question. For my use case we need to crawl URLs in a controlled fashion, not as fast as the machine can handle.

So using exq, redis and GenServer allowed us to generate different queues, control concurrency and schedule work. (GenServer is only used to schedule the initial “burst”; exq queues up all the other URL hits after.)
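For comparison, the "control concurrency" part alone can be sketched inside the BEAM with `Task.async_stream/3` (Elixir 1.4+). This is not the exq/Redis setup described above, and it gives no durability or scheduling; the URLs and the `fetch` function are hypothetical placeholders for a real HTTP client call.

```elixir
urls = [
  "https://example.com/a",
  "https://example.com/b",
  "https://example.com/c"
]

# Hypothetical stand-in for a real HTTP fetch.
fetch = fn url ->
  Process.sleep(10)
  {url, :fetched}
end

# Process at most two URLs at a time; results come back in input order.
results =
  urls
  |> Task.async_stream(fetch, max_concurrency: 2, timeout: 5_000)
  |> Enum.map(fn {:ok, result} -> result end)

IO.inspect(results)
```

If the node dies mid-crawl, the pending work is simply gone, which is where a Redis-backed queue like the one described above earns its keep.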

andre1sk · August 5, 2016, 2:25am

It’s a bit irrelevant: you can build a system in Erlang that will fail every n seconds. On the other hand, if you had to design a system with “nine nines”, I would guess it’s not a bad starting point.

aaronjensen · August 5, 2016, 4:15am

There are a few things to consider that haven’t been mentioned:

You have more options for deployment if your cache is deployed separately (like Redis). In other words, you don’t need to do hot upgrades, which, while neat and all, take extra effort.
Things like Redis and RabbitMQ have a ton of work put into them. They’re highly optimized, patched for security issues, etc.
Many people have worked with Redis/RabbitMQ, but only your team has worked with your custom solution. This can actually reduce the overhead of learning a new system. There are also plenty of docs/examples for them.
Some of the bigger hosts offer easily deployed and managed versions of Redis, which allows for easier deployment and scaling. Amazon has ElastiCache and Rackspace has ObjectRocket, to name two examples.

Maybe it’s because I’m still new to the Erlang world, but the idea of tying any sort of in-memory persisted data to the thing I’m deploying multiple times a day seems much worse than tying it to the thing that I upgrade once a quarter or so.

Qqwy · August 5, 2016, 6:29am

Let me stress that using tools that exist inside the Erlang Virtual Machine does not at all mean that you’re limited to storing everything in memory. Nor does it mean that you need to (re)build everything from the ground up.
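One small example of BEAM storage that is not memory-only is DETS, the disk-based sibling of ETS that ships with Erlang/OTP. The sketch below writes a record, closes the table, and reads it back from the file. The table name `:demo_jobs`, the file location, and the job text are assumptions made for the illustration.

```elixir
# Keep the table in a file under the system temp directory (hypothetical path).
path = Path.join(System.tmp_dir!(), "demo_jobs.dets")
file = String.to_charlist(path)

# Open (or create) the disk-backed table and store one job.
{:ok, table} = :dets.open_file(:demo_jobs, file: file, type: :set)
:ok = :dets.insert(table, {1, "send welcome email"})
:ok = :dets.close(table)

# Reopen the file, as a restarted application would, and read the job back.
{:ok, table} = :dets.open_file(:demo_jobs, file: file, type: :set)
[{1, job}] = :dets.lookup(table, 1)
:ok = :dets.close(table)

IO.inspect(job)

# Clean up the demo file.
File.rm(path)
```

DETS is local to one machine, so it survives a restart but not a destroyed disk; Mnesia builds replication across nodes on top of the same ideas.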

hubertlepicki · August 5, 2016, 6:35am

There are also security concerns to take into consideration. Sure, you can build your system as a large cluster, but that has some serious implications for security. If your nodes are able to communicate directly and execute code on each other, then a single point of failure gives an attacker full access to the system.

Having something like RabbitMQ, Riak, or Redis in between (depending on the context) allows you to separate the nodes and set up various security precautions that you can’t otherwise.

Of course you can do it all yourself, but why when there are good and proven existing solutions to the problem?

Donovan · October 13, 2016, 3:48pm

Thanks @Qqwy for touching on this. Can you elaborate a bit more on the suitability of Elixir for math/analytics-oriented applications?

I’m just learning Elixir and considering using Elixir/Phoenix for the backend of a Mint-like web app. The app will principally track budget/actual entries, perform Monte Carlo / scenario analysis and produce datasets for front-end rendered charts.

I’m excited to learn and apply Elixir to this task, but would like to qualify the suitability of Elixir/Phoenix for this type of app.

Thoughts?

brightball · October 13, 2016, 4:41pm

Wrote a blog post semi-covering it. http://www.brightball.com/articles/elixir-ets-and-mnesia-vs-redis