BEAM Devs: Asking for Feedback on the Software Architecture Draft

In my first two UK roles, software architecture always included a failover system, an independent, exact copy of production running in another cloud or on-premises data center. This differs from redundancy within the same provider.

In this approach, the switch from production to the failover is made by manually changing the server’s IP address in a DNS record that has a very short TTL. If the cloud provider has an outage, or a catastrophic production incident cannot be fixed or rolled back quickly, we switch the DNS and use the failover system. Alternatively, clients can switch to the failover automatically when production doesn’t respond within a certain timeout.
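The automatic client-side switch could look roughly like the sketch below. It is only an illustration under assumptions: `call_with_failover`, `BackendUnavailable`, and the two backend callables are hypothetical names, and each backend is assumed to raise `TimeoutError` or `ConnectionError` when it cannot answer in time.

```python
class BackendUnavailable(Exception):
    """Raised when neither production nor failover answers."""


def call_with_failover(request, production, failover, timeout_s=2.0):
    """Try production first; fall back to failover on timeout or connection error.

    `production` and `failover` are hypothetical callables taking
    (request, timeout_s) and raising TimeoutError or ConnectionError
    when the backend cannot be reached in time.
    """
    for name, backend in (("production", production), ("failover", failover)):
        try:
            # Return which backend answered, together with its response.
            return name, backend(request, timeout_s)
        except (TimeoutError, ConnectionError):
            continue  # this backend is unreachable, try the next one
    raise BackendUnavailable("production and failover both failed")
```

Returning the backend name alongside the response lets the caller log or alert when traffic is being served by the failover.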

:warning: Failover is not the same as a blue-green deployment. While a blue-green deployment gradually replaces an older version, failover runs continuously alongside production. Ideally, both strategies should be used together when possible.

In my second role, we also implemented a request duplicator. This tool allowed stress testing of new releases by amplifying live requests (e.g., x2, x4) to find breaking points. It also helped validate major architecture changes before going live by running them in parallel with production.
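As a rough sketch of the amplification idea (not the tool from that role, whose internals I can't reproduce), the function below serves the caller from production while replaying the same request `factor` times against a candidate release. The names `duplicate_request`, `production`, and `candidate` are hypothetical, and `factor` is assumed to be at least 1.

```python
from concurrent.futures import ThreadPoolExecutor


def duplicate_request(request, production, candidate, factor=2):
    """Serve from production and replay `factor` copies against the candidate.

    Returns the production response plus the number of replayed copies
    that raised an exception. `production` and `candidate` are
    hypothetical callables that take the request and return a response.
    """
    with ThreadPoolExecutor(max_workers=factor) as pool:
        # Fire the amplified copies at the release under test.
        shadow = [pool.submit(candidate, request) for _ in range(factor)]
        # The caller is always answered by production.
        response = production(request)
        # exception() blocks until each copy finishes, then reports failures.
        failures = sum(1 for future in shadow if future.exception() is not None)
    return response, failures
```

This version waits for the replayed copies so it can count failures; a real duplicator would more likely fire and forget, and record the candidate's results out of band.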

The request duplicator relied only on production responses, but in my case it could be coded to use whichever response arrives first, from production or from the failover. For strong consistency guarantees, it could wait for both before returning a response, backed by a TTL and a request failure-handling strategy.
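The two response strategies can be sketched as follows. This is a minimal illustration, assuming hypothetical `production` and `failover` callables; treating "strong consistency" as "both answers must be equal" is my own simplification, and the error handling stands in for a fuller failure-handling strategy.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed, wait


def _submit_both(pool, request, production, failover):
    # Map each future to the name of the backend it belongs to.
    return {
        pool.submit(production, request): "production",
        pool.submit(failover, request): "failover",
    }


def first_response(request, production, failover, timeout_s=1.0):
    """Return (backend_name, response) from whichever backend succeeds first."""
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        futures = _submit_both(pool, request, production, failover)
        # as_completed raises a timeout error if nothing finishes in time.
        for future in as_completed(futures, timeout=timeout_s):
            if future.exception() is None:
                return futures[future], future.result()
        raise RuntimeError("both backends failed")
    finally:
        pool.shutdown(wait=False)


def both_responses(request, production, failover, timeout_s=1.0):
    """Wait for both backends and return the response only if they agree."""
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        futures = _submit_both(pool, request, production, failover)
        done, pending = wait(futures, timeout=timeout_s)
        if pending:
            raise TimeoutError("a backend did not answer in time")
        # result() re-raises any exception raised by a backend.
        results = {futures[future]: future.result() for future in done}
        if results["production"] != results["failover"]:
            raise RuntimeError("production and failover diverged")
        return results["production"]
    finally:
        pool.shutdown(wait=False)
```

Waiting for both doubles as a consistency check on every request, at the cost of the slower backend's latency.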

:wrench: Key Consideration: Applications using this approach must ensure side effects (e.g., emails, billing) only occur in production. A flag-based system is required to enforce this.
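One way such a flag could be wired in is sketched below. `SideEffectGate` and the role names are hypothetical; the assumption is that each deployment is told its role at startup (for example through configuration) and that every side effect is routed through the gate.

```python
from dataclasses import dataclass, field


@dataclass
class SideEffectGate:
    """Runs side effects only when this node's role is "production".

    `role` is an assumed deployment flag, e.g. "production", "failover"
    or "shadow". Any value other than "production" suppresses the action.
    """

    role: str
    suppressed: list = field(default_factory=list)

    def run(self, name, action, *args):
        if self.role == "production":
            return action(*args)
        # Record what would have happened, which helps when comparing systems.
        self.suppressed.append(name)
        return None
```

For example, `gate.run("welcome_email", send_email, address)` sends the email on production and only records the name elsewhere (`send_email` here is a placeholder for the application's own function). Because anything that is not explicitly production is suppressed, a misconfigured node fails closed rather than billing or emailing twice.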

Bear in mind that I wasn’t in the DevOps team, nor did I have input on the architecture. Thus, the diagram is trying to reflect what I was aware of and can recall.

I am thinking of also using this approach for BEAM Devs, as per the diagram image. However, in my case, I have a CRUD application from the user perspective, whereas in my previous roles, they were read-only for external users and CRUD internally based on background jobs or request metadata collection and analytics.

As with everything in software architecture, it’s about trade-offs. This approach has some, such as the added complexity of ensuring that no side effects occur in the non-production systems and of guaranteeing that production and failover are in the same state (strong consistency).

So, my challenge is to be able to use the failover and request duplicator approach in conjunction with blue-green deployments and keep strong consistency guarantees for my CRUD application.

I could start with a non-distributed traditional Phoenix app, but I want to use this project as an opportunity to use distribution for real, and to start with a good base for building a very resilient architecture.

What would you do differently?

Feel free to ask any questions.

If this project resonates with you, don’t forget to subscribe now for updates and/or early access at:

If I may (and I hope this does not come across as harsh; that is not my intent, but English is not my native language and intent is sometimes lost in translation): since you wrote multiple times that you are looking to build your own job with BeamDEVS (and that’s a superb goal), I would focus on finding market fit first before spending too many architecture tokens.

Paying customers / users are vital to keep you working on this platform. You can always start with a single node. The needs of the various parties that your system wants to engage (recruitment agencies, companies, devs looking for work, and maybe others) will influence your software, and maybe how you architect it, both in terms of application code, and in the roles your nodes take and how they talk to each other.

If you have extensive DevOps experience and infra isn’t consuming too many work tokens for you, then by all means start with your infrastructure design. Otherwise, maybe start with customers or users.

I hope you’ll succeed!

No offence taken at all, and I appreciate that you took the time to give me your feedback.

I know the usual approach is to just dive in and write code, but all projects pay a very high price for doing so. Businesses are so used to this approach that they don’t even notice the price and simply accept the consequences as business as usual. However, I understand that sometimes it cannot be done any other way, because time constraints and speed to market are very important factors.

While it is important for me to find paying customers as fast as possible, I also need to ensure that I start with a good architectural base to avoid the mess I have seen a lot of projects drown in.

In software development, you can choose the fast lane and build up tech debt to ship faster, or you can choose the middle lane to balance speed with quality. In the fast lane, new features and bug fixes become increasingly difficult to work on as the project grows.

That being said, I will not spend my initial time trying to have a perfect base solution as per the diagram, but I will spend some time ensuring I can build it progressively. I will start with the skeleton and slowly add all the bits required to fully flesh it out.

I cannot agree more. Debt is something that must be paid back, but it is too often left alone, and then velocity can only decrease…

Often, through user research, we discover a huge gap between what we thought we would be building and what actually gets built as we understand users better. At the start of a project this gap is at its widest and, if everything goes well, our understanding improves and the gap narrows.

I agree completely about user research, and that’s one reason I am sharing my journey of building this project and aiming to collect at least 1,000 subscribers for early access. This will help them understand what I am building for them, and help me align with their feedback and expectations.

When possible, the best approach is to have Event Storming or Event Modelling sessions with users, but this is only practical when building software for a client or for the company we work for.