Production Phoenix LiveView on cloud-agnostic Kubernetes

For context, my quest was and remains to design, implement and self-host a new LiveView app in such a way that, whether it grows slowly or rapidly, I can scale both horizontally and vertically using whatever cloud or bare-metal resources the app needs and the revenue it generates can pay for.

I’ll happily elaborate on my journey on request, but it’s a long and complicated tale of many mistakes and aha-moments. With some notable exceptions there are ample tutorials, how-to’s and developer documentation out there to cover the details of what I’ve tried and failed with, as well as what I eventually succeeded with.

In this discussion I’m hoping to maintain a mile-high, inch-deep perspective on the viability of the various approaches, platforms and packages required for a fully operational Phoenix app exposed to the world, and on the mesh of interrelated challenges, solutions, opportunities and risks that come with the territory.

In short, let’s discuss what are the right things to do rather than how to get them done.

Though I started out chasing self-hosted OpenStack, then Charmed Kubernetes using Juju and MAAS with Ceph for storage management, I recently found that all of the elements of Charmed Kubernetes that I figured on actually using were available to me by running microk8s in clustered mode. My pfSense firewalls allowed me to set up the Calico CNI with MetalLB in BGP mode. The database is a “production grade” PostgreSQL HA cluster run by Percona’s version of CrunchyData’s Postgres Operator, and the entire cluster is monitored using Percona Monitoring and Management (PMM) running in an off-cluster Docker container.

It’s all working very nicely, including (eventually) getting cert-manager to work properly to issue and update Let’s Encrypt certificates for the site. But all the how-to’s and documentation on cert-manager had focussed on using it in conjunction with an Ingress controller such as nginx-ingress.

I really struggled to get cert-manager to work for me, and in the end I discovered that the version of it that ships with microk8s is quite old and buggy, so I had to upgrade it myself. But that leg of the journey, going through many different guides, examples and how-to’s, made me realise that the generic Ingress-controller concept in Kubernetes has a massive functional overlap with what a standard Phoenix app running on Cowboy does anyway. Especially since I was already using a load balancer such as MetalLB (or whatever the cloud provider offers), I started to suspect that I could cut out nginx-ingress entirely. That’s what I did, and with very pleasing results I daresay. Perhaps someone, or some event, will make me regret that choice, but so far so good. To get that done, though, I had to dig into parts of cert-manager the how-to guides don’t cover: deploy a DNS01 solver with a delegated zone hosted at AWS Route53, and mount the certificate’s secret as a volume in the manifest so that the certificate and key files become available to the pod.
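
To make the endpoint side of that concrete, here’s a rough sketch (not my exact config) of what terminating TLS directly in the Phoenix endpoint can look like, assuming the cert-manager secret is mounted at /etc/tls and using placeholder app and host names:

```elixir
# config/runtime.exs - hypothetical sketch: no nginx-ingress in front, the
# endpoint terminates TLS itself using the files that cert-manager's
# Certificate secret provides once it is mounted into the pod as a volume.
# The mount path and module names are placeholders.
import Config

if config_env() == :prod do
  config :my_app, MyAppWeb.Endpoint,
    url: [host: System.get_env("PHX_HOST") || "example.com", port: 443, scheme: "https"],
    http: [ip: {0, 0, 0, 0}, port: 4000],
    https: [
      ip: {0, 0, 0, 0},
      port: 4001,
      certfile: "/etc/tls/tls.crt",   # assumed volume mount path
      keyfile: "/etc/tls/tls.key"
    ],
    server: true
end
```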


My recommendation is to take a step back from the hype, and from how things are done with technologies that have inferior concurrency stories like Python and JS, and to get back to basics.

I’m assuming this discussion is about small to medium size products with a max of 100k concurrent users, so my assumptions will be based on that.

Elixir/Erlang scales insanely well on a single instance, the reasons being:

  1. All the data passed between concurrent constructs is immutable, so there is almost zero chance of memory corruption or of the hard-to-track shared-memory errors;
  2. The VM is fault tolerant: isolated parts can break without taking the entire VM down, so you don’t have to create physically isolated services for every small feature (see the sketch after this list);
  3. The VM has some great observability tools to track its state at runtime, so you don’t have to depend on third-party tools like k8s;
  4. The way concurrency is designed in OTP, you are guaranteed that work will be distributed across all cores without separate subsystems throttling each other.
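
To illustrate point 2 with a minimal, purely illustrative sketch (module names are made up): two features supervised side by side, where one crashing is restarted on its own and never disturbs the other or the VM.

```elixir
defmodule Demo.StableFeature do
  use GenServer
  def start_link(_), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  def init(:ok), do: {:ok, %{}}
end

defmodule Demo.FlakyFeature do
  use GenServer
  def start_link(_), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  def init(:ok), do: {:ok, %{}}
  # Simulate a bug: this crashes only this process.
  def handle_cast(:boom, _state), do: raise("simulated failure")
end

defmodule Demo.Supervisor do
  use Supervisor
  def start_link(_), do: Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    # :one_for_one - a crashed child is restarted on its own.
    Supervisor.init([Demo.StableFeature, Demo.FlakyFeature], strategy: :one_for_one)
  end
end

# Demo.Supervisor.start_link([])
# GenServer.cast(Demo.FlakyFeature, :boom)  # FlakyFeature crashes and is restarted;
#                                           # StableFeature and the VM keep running.
```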

I have never used k8s on a production project; however, from what I see, what it tries to achieve is the following:

  1. Fix the issue of technologies that scale poorly, like JS and Python, by creating multiple instances of the server/product;
  2. Dynamic scaling when demand rises - the value of this feature is very questionable taking into consideration the pricing of an orchestrator vs bare metal, and it is very product dependent;
  3. Provide some limited fault tolerance through the orchestrator, which will log and restart containers should they die; needless to say, this is far inferior to OTP;
  4. Offer redundancy across multiple locations should something go wrong - say a datacenter goes down and your product is mission critical. This is, in my opinion, the feature you actually need an orchestrator for, and it is tricky to achieve on bare metal by default.

If we look at all the points above, it is clear that there is little to gain from using an orchestrator like k8s with Elixir, and on small to medium projects I would use it only for redundancy, which is rarely a hard requirement.


My principal consideration had been to avoid reliance on single-node performance, i.e. vertical scaling, by designing and implementing a distributed application I can deploy in local points of presence rented or bought anywhere in the world where the traffic and local regulations require me to do so. If all goes well, the audience would grow many orders of magnitude beyond a few hundred thousand concurrent users, and I’m hoping that I won’t have to redesign the entire architecture if and when that ramps up. So I’m starting with a tiny cluster of old and slow machines rather than one fancy server that will eventually run out of steam anyway.


My opinion is that this is premature optimisation, but once again it is unclear what kind of product we are talking about, and I always opt for not using cloud services whenever possible.


That’s valid (the premature-optimisation perspective is one I’ve often discussed with teams over the years), so I can assure you that the decision wasn’t made lightly or because of irrelevant hype. I had to find a way to scale the product down to a ridiculous minimum while allowing demand to potentially escalate at the most inopportune time (when I’m supposed to be focussed on value and content as opposed to redesigning the system or trying to keep it from being overrun).

I also share your take on avoiding cloud services because cloud service providers are bent on locking people into their particular cloud. But cloud tech has intrinsic value and opportunity if you can harness it without getting locked in.

I’ve been a long-time fan of first Erlang/OTP and now Elixir and Phoenix for all the reasons you’ve quoted and more. Without those contributions to the technology landscape my life’s work would still have demanded many lifetimes’ worth of hard slogging to implement. But we’re not talking about a small to medium application here. It’s small now, but its scope is to grow into a mission-critical system used daily if not constantly, even by billions of people. I’m well aware of the adage that one shouldn’t tackle Google-sized problems unless you’re Google, but I had no choice in the matter really - it had to be done, and now it is. Well, almost.

As someone deeply invested in crafting a robust, scalable live-view app, this discussion resonates with my own journey. Striving for cloud-agnostic solutions led me through a maze of trials and triumphs. From exploring OpenStack to embracing Kubernetes on microk8s, every step taught valuable lessons. It’s refreshing to see how choices align with the concept of cloud agnostic applications, as highlighted in this insightful discussion. Cutting out nginx-ingress in favor of a more streamlined approach speaks volumes about the adaptability and versatility of modern architectures. It’s a testament to the evolving landscape of tech solutions. For those intrigued by the concept of cloud agnostic applications, this discussion offers a wealth of perspective. Link to cloud agnostic concept

It means I’m not alone after all which is wonderful news. It seems we’ve independently (and at some cost) made very similar pragmatic choices motivated by similar objectives.

I suggest we semi-formally pool our objectives, work product, rationale and findings so that we’d jointly decide what to use and what to steer clear of as we respectively meet our evolving requirements.

There’s no shortage of vested interests and divergent agendas in this space, most of it clouding the waters. I’m sure that if we can define and maintain a specific arrangement of technologies to focus on, with explicit motivations for each choice and clear objectives to be met, and add to that all those little secrets, scripts and yaml file entries which unlock the features we require, then the number of people sharing the load of keeping up to date with the constant change in all the moving pieces would increase.

If you’re keen, I could start things off with a full description of my objectives, what I’ve considered, what I chose and the snippets of yaml etc. required to make it work. You could then indicate where your perspective differs and why, and we can work towards one common toolset we can help each other stay current on.

I’d be in favour of making this a little more binding than what is typical in the open source community, in the sense that once we’ve agreed on a shared arrangement of products and technologies, we’d consider ourselves obliged to consider the other’s evolving requirements as much as our own when proposing future changes to the agreed set of components and techniques. It being open source, others would benefit from it as well, as long as their objectives align with ours. But if someone with slightly different objectives wishes to join and assert some influence, they’d have to negotiate with us to have their objectives incorporated into ours, at the price of their commitment to also consider our objectives and share the load of keeping the emerging standard updated.

How keen would you be on something like that?

Hey @MarthinL, I’m on the same journey you were, trying to avoid cloud services and performing some local tests similar to yours, so far with really poor results and horrible bottlenecks with NodePorts, session affinity problems, etc. I’m still working on this, but in the meantime I would like to know your conclusions, recommendations and thoughts so far… Did you drop nginx? Did you get some (approximate) measurements of how many nodes, VPSs and resources you needed per hundred thousand users? I know that every app is different, but just to have an idea.

I don’t want to self-promote too much, but this is part of what we have been building at Batteries Included. Using Elixir, Phoenix, Kubernetes and others, we’re building an automated solution that will simplify running everything on Kubernetes with open-source solutions as the base (Postgres, Redis, Istio, etc.). Everything from database to networking to serverless web hosting is fully integrated, so you don’t have to learn how to do that. Single sign-on is a few button clicks away for every part (Keycloak, Istio Ingress, OAuth2 Proxy, etc.).

You bring your computers or your cloud account, and we run the whole platform. Full automation means you are the only one with access to your data. There are no third-party data leaks since we don’t get keys to any OAuth or database.

This also means that you can run it alongside your existing infrastructure. No unneeded processes or data move around since it runs on your hardware or cloud account.

For web hosting, you build a docker image, and then we take care of SSL, a default DNS, network routing, resources, and scaling instances with a slider. One caveat with Phoenix LiveView applications: you need sticky-session HTTP routing to ensure that the initial cold HTTP request and the WebSocket hit the same VM. I haven’t found a good cloud-agnostic way to do that yet, so we’ll have to build that into Istio/Envoy. For now, we can use fixed routing (a fixed number of hosts) or scale from 0 to 1 with serverless (KNative Serving). For most other web service architectures, this doesn’t matter.

We are building all this as source-available with time-bound open source, or “fair source” as Sentry and Codecov have called it. Below a certain production size/scale, our license acts like Apache 2. Above that, we charge a small hourly rate per pod we run, monitor, and maintain. Additionally, after two years the license grant becomes Apache 2. This means you can use it at home as if it’s Apache 2, but businesses will have to pay for the most recent versions, knowing that older versions will become open source.

We haven’t launched yet, so there are many to-dos on the public-facing site and product demos. However, we should launch in October. You can sign up, and we’ll update you, or watch GitHub or email me for a demo.


I don’t think that is the case; Phoenix LiveView was designed to be distributed from its inception. Can you explain more clearly what the actual issue is?

The problem we encountered was with K8s limitations when it comes to “session affinity”. Sticky sessions may solve single-endpoint issues with LiveView when the user hits refresh or the back button, but it doesn’t cut it when you have more endpoints that should preferably connect to the same VM instance (e.g. a LiveView connection + a webhook).

Providing support for an arbitrary identifier (a token_id or a user_id in our case) for stickiness would solve this issue, but K8s doesn’t seem to care much about stateful apps.

Apologies, I found this reply apparently unsent when I visited the forum again. I don’t know whether it was sent and then cancelled because it came from me, or I simply never pushed the button.

Hey @julismz,

It certainly is a journey, not one event or decision. I’m sorry to hear of your performance hurdles and poor results, but I’d need to dig quite a bit deeper into what you’ve really been experimenting with.

There never was and still isn’t any nginx in my solution. How/why did nginx enter the mix for you?

As for the customers-per-node question, you’re right on the money identifying that as the key question. It’s a lot like engineering a race-car’s drag coefficient. You can have an idea of the Cd for simple shapes, but when many shapes overlap and interact it becomes a rabbit hole which, even if you do make it out of it, will leave you with unreliable predictions anyway.

The only application that performs without bugs is the null application. From there, everything your app has to do in order to add value will lower its performance. It’s up to you as the designer to avoid giving up large chunks of performance for no or questionable value to the end-user. You cannot increase performance, only decrease it less by making smarter choices which avoid unnecessary work. In most cases, unnecessary work equates to duplicated work, so that’s usually a good place to start.

Perhaps your application is simple enough for you to know in advance what every user will be doing, which means that for you the only parameter becomes how many users are active. If that’s the case we can discuss strategies for that, but there are probably many others who can help you more than I can.

Each of my users uses my application in their own unique way, which I can neither predict nor directly control, and it varies day by day. That variance impacts my “optimal” ratio between nodes and users far more than my code does, so as long as I don’t waste computing cycles I’m doing as well as I can.

That then engages the third element, which we can call pre-scaling if you want. As you might have seen above in this thread, I’ve been criticised rather heavily for it as premature optimisation. The essence of it is to design your application assuming it will need to scale bigger than you could have imagined, and then implement it at the smallest possible scale you can pull off. The rationale is simple. Even if you cannot predict how many live customers a single node will service, you can measure how it’s doing, and if it starts to struggle you can add additional nodes without delay. You can only do that if, even for your initial version, you’ve made provision for the application to run on multiple nodes and in multiple regions. If you don’t do that from the start you will get trapped. Your users will demand better performance, but to give them that you need to redesign your app or find a way to give your server more resources (vertical scaling), which will in the end put you exactly where the cloud providers want you - dependent on them for scaling.
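
As a small illustration of what “making provision from the start” can look like in Phoenix (module names here are placeholders, not from my codebase): code written against Phoenix.PubSub reads exactly the same on one node as on twenty, so nothing needs redesigning when more nodes join the cluster later.

```elixir
# Hypothetical sketch: publishing a domain event through Phoenix.PubSub.
# Assumes a pubsub named MyApp.PubSub is started in the supervision tree.
# On a single node only local subscribers receive it; once nodes are
# clustered, subscribers on every node receive it - same code either way.
defmodule MyApp.Events do
  @pubsub MyApp.PubSub

  def subscribe(topic), do: Phoenix.PubSub.subscribe(@pubsub, topic)

  def order_placed(order_id),
    do: Phoenix.PubSub.broadcast(@pubsub, "orders", {:order_placed, order_id})
end

# In a LiveView (on whichever node serves the socket):
#   if connected?(socket), do: MyApp.Events.subscribe("orders")
```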

I’ve seen too many designers and managerial types with unrealistic expectations of agile and “first make it work, then make it fast” not to say anything about it when asked. I’ve even worked with a company selling an “Agile product” that declared war against Architects as the enemy of Agile. But as an Architect using Agile I did things with their product they couldn’t believe, so I developed this analogy: when tasked with building a dam, it’s easy to find a hole in the ground, fill it with water and say that wasn’t hard at all, now we just need to evolve it into a dam at the desired scale. If you cannot see the issues with evolving a filled dam, or the simple matter of where the water will come from if your initial hole in the ground is not in an opportune geographical location, you deserve the disappointment that will follow. If you need a dam, your starting point is to find the right location, then to design the whole dam, then the way to divert the water while you build the dam, and only then to start construction of the dam, even if you build it in phases on top of the full foundation started in the right spot.

Design your application’s value to your customers and let that guide you as to how to deliver that value at a lower cost than what your customers pay you for it.

A lot of Kubernetes’ facilities are there to accommodate legacy applications such as the monolithic web servers of yesteryear and “micro service” stuff written in C++, Java and Go. Don’t get distracted by all that noise. Erlang, Elixir and Phoenix already provide a tonne of facilities which those applications never had, and so had to either implement at application level or get from a container environment when they needed to head for the clouds. As a result, a lot of the concepts like nginx (as Ingress controller) and NodePorts have wide-ranging capabilities in order to adapt to the enormous variance with which application designers had gone about their business before they involved Kubernetes. Your Phoenix app needs surprisingly little mangling to do well in Kubernetes.

Away from the public cloud providers, on your own clusters, MetalLB is your friend and you have a few good options for how to create your Kubernetes cluster. Once you are using a public cloud, use their Kubernetes stacks and load balancers directly. It’s silly to implement your own stack on cloud-based servers.

You also need to choose a database strategy and a storage strategy to suit your cloud-agnostic ideals. It largely depends on your application’s actual needs and the capabilities of your own hardware. For my purposes I chose a “standalone” PostgreSQL deployment in Kubernetes using Percona’s Operator for PostgreSQL, local to each regional Kubernetes cluster, and share data between them at the application layer only. As for storage I’ve gone and stuck with Ceph because that’s what best suits my hardware, but because Kubernetes abstracts that so well through storage classes and providers I’d use whatever the cloud provider offers where I need to use a public cloud’s Kubernetes stack.
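
On the application side that choice amounts to little more than pointing each regional deployment’s Repo at the in-cluster PostgreSQL service. A rough sketch, with the service name and environment variables as placeholders:

```elixir
# config/runtime.exs - hypothetical sketch: each regional cluster talks only
# to its own in-cluster PostgreSQL (exposed by the operator as a Kubernetes
# Service); any cross-region sharing happens at the application layer.
config :my_app, MyApp.Repo,
  username: System.get_env("PGUSER") || "my_app",
  password: System.fetch_env!("PGPASSWORD"),
  hostname: System.get_env("PGHOST") || "my-app-pg-primary.database.svc.cluster.local",
  database: System.get_env("PGDATABASE") || "my_app_prod",
  pool_size: String.to_integer(System.get_env("POOL_SIZE") || "10")
```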

I don’t know if I’m really answering your questions, and I don’t even know if all your questions have the kind of answers you’d like them to have. Both cloud-native and distributed computing (very different things, but related anyway) can overwhelm you exactly like the monsters they’re reported to be. They certainly are powerful, but they’re mostly misunderstood, which makes them scary. Once you know enough about them they’re incredibly good friends to have on your side. It just takes a little work and empathy to get to know and understand them.

Good luck with your journey and don’t hesitate to reach out (again) if you feel stuck or uncertain. Just bear in mind that the bulk of the information you will encounter on the interwebs has been written for purposes other than your own. This too. My purpose is the hope of sparking a kinship among fellow travellers on the good ship Phoenix sailing towards large-scale independent federated systems such as my own. At some point we’re going to need others who look at the problems we’ll face through the same colour lenses.

Keeping state in a web app is a taboo from the monolithic web server era where it was deemed simply too resource intensive for anything beyond perhaps a corporate intranet application. It’s no surprise that K8s’ default model is aligned with stateless web services.

By virtue of the BEAM’s extremely efficient lightweight processes, with states as small as you like, Phoenix in Elixir has made a mockery of that old taboo. LiveView exploited it even further, to the point where it does things even corporate intranet applications with limited users wouldn’t have dreamt of, and made it easy and effective for hundreds of thousands of users per server node.

Not everyone in the industry has recognised the paradigm shift and/or adapted to its implications. Kubernetes wasn’t conceived with Erlang/OTP/Cowboy/Phoenix/LiveView applications and backends in mind, but rather for a world where stateless applications are the norm.

Luckily the story does not end there. Kubernetes also provides “Stateful Sets” which, as the name suggests, are specifically for applications that maintain state. In traditional web-service terms it was only the database that was “qualified” to carry state, so in many minds and articles “state” became conflated with persistent data, but it’s not the same thing.

Because Phoenix and LiveView apps maintain state, it is best to host them in Kubernetes as Stateful Sets. You’ll still need to do some work within your application to make it cluster-aware (the easy part, thanks to libcluster) and, for your specific needs, choose and implement a way of handing sessions over between pods in the set and nodes in the cluster.
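
For the libcluster part, here’s a sketch of a topology that discovers the StatefulSet’s sibling pods through its headless Service (names are placeholders, and it assumes the release’s node name is set to my_app@<pod-ip>):

```elixir
# Hypothetical sketch: cluster the pods behind a headless Service using
# libcluster's Kubernetes DNS strategy.
config :libcluster,
  topologies: [
    k8s: [
      strategy: Cluster.Strategy.Kubernetes.DNS,
      config: [
        service: "my-app-headless",   # headless Service in front of the StatefulSet
        application_name: "my_app",   # must match the left half of the node name
        polling_interval: 5_000
      ]
    ]
  ]

# Started under the application's supervision tree:
#   {Cluster.Supervisor,
#    [Application.fetch_env!(:libcluster, :topologies), [name: MyApp.ClusterSupervisor]]}
```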

You could also choose to do no handover at all, because LiveView clients keep a persistent connection (a websocket, or long polling as a fallback) established between the browser and the server. Most if not all load balancers will preserve established connections (confusingly called “state” at the network level), meaning that when the JS code in the browser sends something back on that established connection it will by default go back to the exact pod and process that established the session. That only breaks when the user reloads the page and the LiveView session needs to be established again.

It can be argued that when a user reloads a page it’s a good time to give the load balancer a chance to pick a new node, pod and process to service that session. Having extremely long-running sessions (as in hours or days) might not be what you’re looking for, since it may load some nodes more than others. Here’s why. Unless you’ve implemented a mechanism to feed back how busy each of your nodes is to the load-balancing algorithm, it’s going to come down to something random. Randomness and round-robin techniques are quite OK when you’re dealing with large volumes of similarly sized sessions. But if you’re too eager to insist that each client is served by the same pod on the same node it got allocated to randomly, you’re going against the natural self-levelling order of things, and some clients might end up getting served by a pod that’s very busy while other pods are idle.

The choice your app needs to make is based on the cost of initialising a user’s session. (That was also the case in the old monolithic server use-case, except that most session-based implementations ended up having to rebuild the session for every request coming in from the client, which is what made it so impossibly expensive on resources.) If that only happens occasionally, like at login or reload, it might be cheap enough to fill in the user’s state from the database, but if it’s quite a demanding job it might be better to first see if another pod in the system has state loaded for that user and obtain it from there.

From there you need to decide whether to redirect traffic to the original pod, or transfer the state to the new pod and resume serving that user from there (i.e. kill the old connection and establish a new one from the new pod). For the reasons I’ve explained above, and because redirection can be so tricky depending on your load-balancer environment, which may vary between stacks, my choice was to move the session “behind the scenes” using direct calls to the node that used to serve the client.
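
Purely as a sketch of that “behind the scenes” idea (not my actual code, and MyApp.SessionCache / MyApp.Accounts are stand-ins): the node now serving the client asks the node that served it before for its warm state, and falls back to the database when that fails.

```elixir
defmodule MyApp.SessionHandover do
  @fetch_timeout 1_000

  # previous_node is whatever you recorded when the user was last served
  # (e.g. in Presence, a Registry or the database).
  def fetch_state(user_id, previous_node) do
    if previous_node in [nil, Node.self()] do
      load_from_db(user_id)
    else
      try do
        case :erpc.call(previous_node, MyApp.SessionCache, :take, [user_id], @fetch_timeout) do
          nil -> load_from_db(user_id)      # cache miss on the old node
          state -> state                    # warm state handed over directly
        end
      catch
        # Old node unreachable or the call timed out: rebuild instead.
        _kind, _reason -> load_from_db(user_id)
      end
    end
  end

  defp load_from_db(user_id), do: MyApp.Accounts.build_session(user_id)
end
```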

[Edit]
I should also add, so I will: unless you do so for well-understood reasons unrelated to performance, it’s always advisable to stick to one BEAM instance per physical server.

The BEAM is incredibly good at what it does, which is to “run” massive numbers of very lightweight processes interacting with each other as efficiently as possible. You’re unlikely to improve on that by running multiple instances on the same physical hardware. If you only have one server and need multiple pods and nodes to help with live upgrades and such, sure, do that, but as a rule of thumb only cluster when you can do so across physically separate servers connected by a high-speed LAN.


I’d be rather surprised if you do find a one-size-fits-all solution. I also don’t see why you’d want the initial cold HTTP request to go back to the same VM rather than to whichever VM is best suited to servicing the request when it arrives. The previous serving VM might no longer be in the best position.

The issue is this: how was the VM to serve a specific client session selected in the first place? If it was random or round-robin, then the only possible advantage of that VM over any other would be that it might be cheaper for that VM to resurrect the user session than it would be for a fresh VM to load it from the database. Since that’s so dependent on application-specific conditions, there’s no way to assume anything on behalf of an application. The least you could (or should) do is to allow the load-balancing layer that made the decision in the first place to make the call again, thereby distributing the fresh cold HTTP requests evenly across the available VMs. Then it’s an application decision whether to retrieve the old state from whoever previously served the same customer or rebuild it from persistent storage.

A better approach would be to base the distribution of load on the current load on each VM, or more accurately on which VM would be able to get to responding to the request soonest. But as we’ve seen, that equation for stateful applications is not simple, because it varies depending on what it would require from each VM to have the required state in memory to start processing with. But if we could resolve that complexity somehow and control the load balancer with that data, it would solve the whole problem - for as long as it remains best for a cold request to go back to a specific VM, that is what will happen, but the moment it’s better for the system as a whole, and for the individual client’s request, to be served by another VM, that too will happen seamlessly.

I don’t yet have all the answers as to how to control MetalLB or any of the public cloud providers’ load balancers (and using something like an nginx reverse proxy, or even a load-balancing app in Phoenix, is a step backwards again), so that level of optimisation remains out of reach for me, but that’s the direction in which I believe the sensible answers lie. Getting cold requests to return to some randomly chosen VM every time is not only of little practical value but could work against overall application efficiency and user experience.

Can you explain more clearly what the actual issue is?

Distribution adds latency. The first request in will compute assigns and the rest of the process state. Say the load balancer sent that to host foo, so foo starts that process. Now if the load balancer sends the websocket start request to bar, then with distribution bar will have to ask foo for the response before passing it on. That adds latency and additional points of failure: foo can be overloaded, have a bad NIC or be down for maintenance.

Yes, distribution handles some of this, but no, it’s not the total solution.
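
To make the latency point concrete, a rough measurement sketch from an IEx session on bar (node names are placeholders):

```elixir
# Compare doing a small piece of work locally with asking foo to do the same
# work over distribution; both times are in microseconds.
{local_us, _}  = :timer.tc(Enum, :sum, [[1, 2, 3]])
{remote_us, _} = :timer.tc(:erpc, :call, [:"foo@10.0.0.1", Enum, :sum, [[1, 2, 3]]])

IO.puts("local: #{local_us}us, via foo: #{remote_us}us")
```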

You’re totally correct. A good session-affinity scheme needs access to a user identifier or something similar. That allows blue/green deployments and other fun parts to be integrated into the HTTP routing. Using header/cookie values to determine the server_shard or consistent_hash_ring_location, and then remembering that shard in a signed cookie, will likely be our first pass. If we have a good set of headers and cookies to use as the default it should provide a good baseline.
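
A rough sketch of what that first pass could look like on the app side as a plug, with :erlang.phash2 standing in for a real consistent hash ring; the cookie name, shard count and identity lookup are all illustrative:

```elixir
defmodule MyAppWeb.Plugs.StickyShard do
  @moduledoc "Derives a stable shard from the request identity and remembers it in a signed cookie."
  import Plug.Conn

  @shards 16
  @cookie "_my_app_shard"

  def init(opts), do: opts

  # Assumes it runs after fetch_session in the :browser pipeline.
  def call(conn, _opts) do
    conn = fetch_cookies(conn, signed: [@cookie])

    shard =
      conn.cookies[@cookie] ||
        Integer.to_string(:erlang.phash2(identity(conn), @shards))

    conn
    |> put_resp_cookie(@cookie, shard, sign: true, max_age: 86_400)
    |> assign(:server_shard, shard)
  end

  defp identity(conn) do
    # Prefer a stable user id; fall back to the client address for anonymous traffic.
    get_session(conn, :user_id) || conn.remote_ip
  end
end
```

The routing layer in front (Istio/Envoy, in this case) would then key its upstream selection on that cookie; that part lives outside the app.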

Controlling only the load balancer at layer 4 will not be enough, since it doesn’t know enough about service health or the user who created the request. We will need most of a service mesh for stable routing based on load and the health of downstream services. From previous experience, it takes many systems working together to make reactive long poll work at scale.

At Batteries Included, we’re using Istio for service mesh and mTLS, the Istio ingress gateway for layer-7 HTTP(S) input, cert-manager + an internal CA for SSL, and some custom Envoy/Istio smarts (Istio takes WASM-compiled plugins so it’s very extensible). We also use MetalLB for the lower layers, or the cloud provider’s load balancer. That gives us something pretty close to the state of the art (none of the publicly available cloud-agnostic solutions can do DSR, or direct server return, which is a bummer).


That sounds like the opposite of cloud-agnostic - like you’re being, or trying to be, the cloud yourself and getting a piece of every pie that way?

My own attempts to extract value from the service mesh concept brought me to conclude that there is too much overlap, and too many competing notions, between a service mesh and the baseline facilities everyone on this forum is used to, where remote procedure calls and process identity are integral parts of the environment. All the examples I could find were in Go, because that environment is so bare that a mesh actually adds value. But in the Erlang/Elixir/Phoenix domain all it added was complexity. That’s why I chose to walk away from that world and approach my distributed solution in a much simpler and more effective manner.

It runs on any cloud that runs Kubernetes, on already-running Kubernetes clusters, and on a single machine with a docker-like API. It is cloud agnostic.

You’re assuming ill intent rather than considering that I have a different point of view, which makes for a poor conversation.

I have seen differently in production and at scale. For truly scalable and distributed balancing, there’s a need for layer-4 balancing (MetalLB can do this sometimes, but not in the cloud, since ARP is controlled there, so sometimes you have to use the cloud’s own load balancer for this), a need for layer 7 (session-based routing, internal network outages, etc. all require a deep understanding of the complete Kubernetes cluster), and awareness of application load. All of that adds up to a service mesh being very useful in this case of reactive hosting at any scale.

Then, it’s worth its weight in gold when debugging an outage at cloud scale. These systems at scale aren’t a monolith; they are many different systems used by many other teams. Service mesh allows common ground in metrics and tracing. (Assuming all of this is done well and tuned well and used at appropriate scale)

After that, the cloud or any hosting location with shared fiber or network hardware you didn’t build yourself is a very hostile place these days, and it’s great to have security built in. A well-configured service mesh will use mTLS and SSL cert identities for lots of security goodness (forward secrecy from eavesdropping, crypto attestation, identity validation, etc.). That security, along with the debugging and metrics provided, is worth much when facing true adversaries.

After that, service meshes are a great way to get shadow traffic, meaning they can be used as development tools for testing new versions before deployment, or for debugging why performance is changing.

From there, service meshes are helpful for very advanced distributed-systems paradigms, such as scatter-gather machine learning and data provenance attestation. Since the full chain of requests and responses runs through the same mesh, that stuff gets much more manageable.

Each of those is a complex issue that service meshes make tractable, and they are faced in production at scale today. Not everyone needs to solve every one of them at the same time, and trying to would make your production systems an unstable nightmare. However, when needed and applied in the correct way, service meshes are a great tool.


Every process step, buffer latch and meter of conductor adds latency. But if using distribution results in increased overall latency you’re either doing it wrong or shouldn’t be using distribution. If you send all your traffic through some choke point to orchestrate the distribution you’re going to struggle for sure. But if you’ve correctly utilised distribution principles in your application the possible extra steps to get the session state to the serving VM will be offset by a far greater reduction in latency because you’re responding to the request from a local server rather than one on the other side of the world.

Distributed processing is a strange animal - when used where something inherent in the problem space definitively determines where each piece of processing must happen, the objectives are clearly defined, the metrics are easily understood and applied, and the whole implementation becomes simple and performant.

In general terms, if it’s not patently obvious which cluster or node would be in a position to service the request so much quicker than any other that the extra steps to make that possible are dwarfed by the time saved, then distributed processing does not belong in that solution. It’s often a dead giveaway that you’re on the wrong track with distribution the moment you have to rely on round-robin or random allocation of work load to processors.

Too many have been caught out by the apparent opportunity to improve performance with distribution, only to then have to come up with some determinant by which to distribute the load. That rarely yields the expected results, and usually the additional complexity outweighs the performance gains, if there are any.

Don’t conflate distributed processing with load balancing or parallel processing, and be highly suspicious of anyone offering one-size-fits-all solutions for distributed processing. The proper way to distribute processing and/or data storage is entirely application-specific and therefore cannot be offered as a service by a platform without a means for the application to install its specific domain knowledge into the distribution algorithm.

The service mesh concept is great for its intended audience, but with my single Phoenix app I’m not part of that audience. A lot of service mesh is about service discovery, but once I know how to reach each of my clusters securely I know exactly what services they offer, because they are copies of me. Another big part of service mesh, as you mentioned, is common observability, which is awesome when you have services written in different languages and frameworks, but when your entire suite of applications runs in Elixir the tools already at your disposal are more than enough, and they come without the overhead and complexities of additional layers of libraries and concepts which are not native to Elixir.

Bottom line, I trust your anchor tenant(s) are happy that your product is aligned with their needs, but its value to me would be marginal at best and more likely negative. Everyone’s use case is different, so there might be some on this forum who are involved in multi-technology environments and could therefore find your product of value, but in general the founding tenets of LiveView specifically included the vision of avoiding the cost and complexity of working in different languages by making it viable to write everything, front end and back end, in one language and framework. Those like myself who are using Phoenix and LiveView for that purpose - one cohesive environment for everything - are unlikely to step outside those boundaries any time soon. Your target audience are the unfortunate ones whose working environments made it impossible for them to stay inside the single-language world. So thanks for helping to take care of them. With any luck, I won’t become one of them any time soon.

That’s simply not how LV works. Both the initial static request and the connecting websocket request compute assigns independently. There’s no state shared between them server side. Therefore it doesn’t matter if the websocket connection hits a different node than the initial static connect.
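
A minimal example of that behaviour (module names are illustrative): mount/3 runs for the static render and again when the websocket connects, each time computing its own assigns, and connected?/1 just lets you defer live-only work to the second pass.

```elixir
defmodule MyAppWeb.DashboardLive do
  use MyAppWeb, :live_view

  def mount(_params, _session, socket) do
    if connected?(socket) do
      # Only once the websocket is up; the static render skips this.
      Phoenix.PubSub.subscribe(MyApp.PubSub, "dashboard")
    end

    # Runs on both the static and the websocket mount, possibly on different
    # nodes. MyApp.Stats.summary/0 is a stand-in for your own loading code.
    {:ok, assign(socket, :stats, MyApp.Stats.summary())}
  end

  def render(assigns) do
    ~H"""
    <p>Active users: <%= @stats.active_users %></p>
    """
  end
end
```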

That’s also the reason why you want to use live navigation, because that one doesn’t use an additional http request, but navigates purely over an already established websocket connection.

What you describe does indeed matter, though, if a client falls back to long polling instead of websockets, because each individual poll would get routed by the load balancer. In that case the LV state is indeed transferred within the cluster depending on where the load balancer routed the user’s request to.

More generally I’d wonder if that really is worth optimizing for. Websockets work in many places and that percentage hopefully continues to rise. Also I’d hope your load balancer does not randomly send users to nodes in different regions. Because then it’s not just the LV adding latency, but also your load balancer sending traffic farther away from the user as well. And if the load balancer balances between nodes in the same region then those nodes should be able to talk to each other without much latency.


Well @MarthinL, first of all, thanks for answering, and for answering the way you do, with a lot of insight, experience and dedication. I took your message seriously and read it like a paper :stuck_out_tongue:

So far the issue is that bare-metal servers are expensive just for the sake of future-proofing, and having 2 K8s nodes on a single server is really, I don’t know how to put it, weird and nonsensical.

So I think I’ll run the app in standalone mode until we start having issues and spikes, and then I’ll add another bare-metal server, K8s, MetalLB, etc.

As for nginx, I was using it as a load balancer… I thought it solved the sticky-session issues with LiveView, but maybe the K8s headless service takes care of that. I don’t know how, because it doesn’t know the IP of each pod; I should read a bit more about that.

In the current case, I won’t use it, and I trust that Bandit will handle all requests like a champ. If at some point I want to put an assets subdomain in place that handles caching of JS, CSS, etc., I will have no choice but to forward the traffic from nginx to Bandit, since nginx will be using port 80 (possibly it will be handled with Cloudflare for the moment).

So, will this message be discarded? At the very least, while the product is working I will do some tests based on everything you told me (except MetalLB, which doesn’t work on a VPS) in staging environments with some DO VPSs, so that everything is ready and configured when the day of horizontal scaling comes.

And here is an additional question for all who know and want to answer:

In the described context, is it better to have only one Elixir/Erlang node with full resources, taking advantage of all the CPU (Intel Xeon E-2386G - 6c/12t), or is it better to have at least 2 libcluster nodes?

Could I perhaps hit a limitation at the endpoint with Bandit, no matter how much processing power the application has?

It is a rather difficult adventure at times, especially for those of us who are not devops people, but it brings great satisfaction.

Thanks again.