Recommended way to deploy a LiveView+Queue based system to multiple servers

Hi,

I’m looking at how to deploy in production using multiple servers to ensure that I have resilience against a single server failure.

I’m starting to understand that Elixir/Phoenix tends to be deployed quite differently from apps in other languages I’ve worked with. My understanding so far is that a number of smaller apps are bundled up into a single monolithic release, which is then deployed and scaled as a unit. For something like LiveView, I’m happy that you can deploy a few instances behind a load balancer and use PubSub/Channels to pass state between servers. Something I’m less sure of is how systems that include queue-driven processes might work.

As a simple example, maybe we have a queue for sending email/SMS/etc. to a customer. In the web app, we would push a message onto a queue, and it would then be picked up by a consumer somewhere else in the stack. Is the idea with Elixir/Phoenix that these kinds of consumers scale with the web app? So if I have some business logic, a LiveView app, and a Kafka consumer, would these all be deployed together and scaled together as one?

Note: I know that for email I could perhaps spawn an async task in the background, but I’m more interested in the general pattern of how to incorporate a queue-driven system.
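Stripped of any particular library, the general pattern is just a producer that pushes and a consumer that pulls. Here is a minimal in-memory sketch using only a GenServer (module and function names are my own invention; in production the queue would live in Postgres via Oban, or in a broker like Kafka, so that it survives a node failure):

```elixir
defmodule JobQueue do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :queue.new(), opts)
  end

  # Web-app side: push a job and return immediately.
  def push(pid, job), do: GenServer.cast(pid, {:push, job})

  # Worker side: pull the next job, or :empty if there is none.
  def pull(pid), do: GenServer.call(pid, :pull)

  @impl true
  def init(q), do: {:ok, q}

  @impl true
  def handle_cast({:push, job}, q), do: {:noreply, :queue.in(job, q)}

  @impl true
  def handle_call(:pull, _from, q) do
    case :queue.out(q) do
      {{:value, job}, rest} -> {:reply, {:ok, job}, rest}
      {:empty, rest} -> {:reply, :empty, rest}
    end
  end
end
```

With a durable queue in the middle, the producers (web nodes) and consumers (worker nodes) can be scaled and deployed independently, even if they ship in the same release.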

thanks


Hi!
It’s a broad topic, but for starters:

  • indeed, Elixir releases bundle multiple apps, but you can still compare them to software modules in other frameworks.

  • high availability requires all software layers to be highly available, which means:
    1. web app redundancy;
    2. DB redundancy (e.g. PostgreSQL in a primary/secondary or cluster setup);
    3. queue-system redundancy (RabbitMQ in cluster mode, or use PostgreSQL for this, as the Oban library does);
    4. worker redundancy (the least critical, as the work is done asynchronously anyway).
    The minimal setup would be Elixir apps (web apps and workers) running on storage-less nodes, with a DB cluster that doubles as the queue system and provides redundant storage.

  • I don’t know much about LiveView state sharing. In theory it would work to share assigns per view+user, so that a LiveView could be restarted on a different node.
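For the “PostgreSQL doubling as the queue” option, a hedged sketch of what the runtime config might look like (the app and module names `:my_app` / `MyApp.Repo` are assumptions, not from this thread):

```elixir
# config/runtime.exs -- stateless BEAM nodes; Postgres is both the DB and the queue
import Config

config :my_app, MyApp.Repo,
  url: System.get_env("DATABASE_URL"),
  pool_size: 10

# Oban stores jobs in the same Postgres cluster, so the DB's
# redundancy story covers the queue layer as well.
config :my_app, Oban,
  repo: MyApp.Repo,
  queues: [mailers: 10, events: 20]
```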


If you choose Oban, I think there isn’t that much that’s different about how you deploy. Even with a non-Postgres-backed option like Broadway, I don’t expect much difference.

If you are aiming for high throughput, then maybe you need to factor your worker-queue nodes out from the general-purpose web-request-handling nodes. In that case I would expect dedicated nodes for subsets of workers.
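One way to get that split while still shipping a single release is to gate the Oban queues behind an environment variable, so the same artifact runs as either a web node or a worker node. A sketch, assuming Oban and a `ROLE` variable of my own invention:

```elixir
# config/runtime.exs
import Config

# Worker nodes run the queues; web nodes only enqueue jobs.
# Oban accepts `queues: false` to disable job processing entirely.
queues =
  if System.get_env("ROLE") == "worker" do
    [mailers: 20, reports: 5]
  else
    false
  end

config :my_app, Oban, repo: MyApp.Repo, queues: queues
```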

At work, we run a moderately large app (no particular availability or throughput demands) on a few nodes with ECS, using Oban for async work with some minimal Oban config.

The main thing that is different is that people seem to ignore AWS serverless as an option and focus on running everything on self-managed BEAM nodes, because serverless does not currently treat Elixir as a first-class citizen.
