I am working with a company that has a Rails SaaS that automates the sending of invoices and payment reminders.
The system works well and has gained a lot of traction lately. As a result, the importing part, which retrieves up-to-date invoice data from a wide range of accountancy solutions, has become too slow.
Therefore, we are going to rewrite it in Elixir! (Wooh! Elixir in production!) Partly for plain efficiency, and partly because there are a lot of things that could be fetched and transformed concurrently but currently are not.
One of the things we are wondering about, is what would be the best way to have this new Elixir application talk to the existing Rails application, both to accept import requests and to return the obtained data.
The main thing we are wondering about right now is how to have them communicate without losing data if one of the systems is (temporarily) down, i.e. not listening. We thought about using something like RabbitMQ, but that adds another single point of failure to our stack, besides being yet another technology with its own quirks that we would need to manage.
As I’m sure other people have faced similar situations, we hope you might have some bright ideas, tips, and maybe gotchas for building such a system/connection.
The main thing we are wondering about right now is how to have them communicate without losing data if one of the systems is (temporarily) down, i.e. not listening.
If you need temporal decoupling between the two applications, then you need a buffer. And if you need persistence, RabbitMQ is a good choice. Otherwise I’d use something like ZeroMQ.
To avoid the SPoF, and assuming the Rails and Elixir apps will run on different hosts, you can run one instance of RabbitMQ on the Ruby host (instance R), and another instance on the Elixir host (instance E). You can then use the RabbitMQ shovel plugin to move messages from a queue in instance R to a queue in instance E and vice versa. The shovel plugin takes care of reconnecting to the other end automatically.
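As a rough sketch of that setup, a dynamic shovel can be declared with `rabbitmqctl set_parameter`. The queue name (`import.requests`) and the host name (`elixir-host`) below are placeholders, not anything from the thread:

```shell
# Enable the shovel plugin (and its management UI) on both brokers.
rabbitmq-plugins enable rabbitmq_shovel rabbitmq_shovel_management

# On instance R (the Rails host): declare a dynamic shovel that moves
# messages from the local "import.requests" queue to the queue of the
# same name on instance E (the Elixir host). The Rails app only ever
# publishes to localhost; the shovel handles delivery and reconnection.
rabbitmqctl set_parameter shovel rails-to-elixir \
  '{"src-uri": "amqp://localhost", "src-queue": "import.requests",
    "dest-uri": "amqp://elixir-host", "dest-queue": "import.requests"}'
```

A mirror-image shovel on instance E would carry the obtained data back the other way.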
In this case RabbitMQ is used as a persistent queue subsystem on both ends, not a broker that acts as a SPoF.
RabbitMQ also supports clustering, where queues are replicated across different broker nodes. Your clients have to be able to reconnect to another node in the RabbitMQ cluster if the connection drops. To upgrade the RabbitMQ cluster, though, you still have to bring the entire cluster down, since you can’t run different versions across nodes. With shoveling, the two instances are independent, so they can run different versions.
We use RabbitMQ for this very use case. Yes, it is a third single point of failure. But it behaves pretty well when one client is down and not listening, at least for a short period of time (which is good enough for us, for restarts etc.).
In our case it’s been an alternative to building internal HTTP API, and we use it as such.
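To illustrate that HTTP-API-replacement style, here is a minimal sketch of what an import-request message published by the Rails side might look like. All names (`import.request`, the field names, the provider string) are hypothetical; the `message_id` is there so the consuming side can deduplicate if a message is redelivered after a reconnect:

```ruby
require "json"
require "securerandom"
require "time"

# Hypothetical payload the Rails app would publish to its local queue.
# A unique message_id makes redeliveries safe to detect on the Elixir side.
def build_import_request(customer_id, provider)
  {
    message_id:   SecureRandom.uuid,
    type:         "import.request",
    customer_id:  customer_id,
    provider:     provider,  # which accountancy solution to pull from
    requested_at: Time.now.utc.iso8601
  }.to_json
end

msg = JSON.parse(build_import_request(42, "exact_online"))
msg["type"]        # => "import.request"
msg["customer_id"] # => 42
```

Publishing the resulting JSON string with `persistent: true` to a durable queue is what actually gets you the no-data-loss behaviour discussed above.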
It seems that I severely underestimated the amount of work RabbitMQ handles for you (persistent queues, automatic reconnecting, data synchronisation) and the abstractions it provides. It also seems that if we need to horizontally scale one of the two systems in the future, RabbitMQ will not require much extra configuration. Thank you very much for your help, everyone!
RabbitMQ is back on the table, and we will indeed probably run two RabbitMQ daemons, one on the Elixir/import server and one on the Rails server, and then synchronize them using the Shovel plugin.