What is easier to scale, Go with Docker and Kubernetes or Erlang / Elixir + OTP?

pillaiindu · April 14, 2019, 8:08am

And which of the above two options is easier to deploy?

lpil · April 14, 2019, 8:14am

You can’t scale Elixir and OTP in isolation; you will also need to scale the hardware/infrastructure, possibly using Kubernetes or similar.

Both Go and Elixir are very capable languages and which one is easier to scale will depend on which you are more experienced with. Neither has a particular advantage over the other.

I would wager that Go would be easier to deploy for most people as it’s very easy to work with Go’s statically linked binaries.

tty · April 14, 2019, 9:34am

How are you scaling?

Vertical scaling, by adding more CPU/RAM?
Horizontal scaling, by adding more servers/nodes?

There are two aspects to vertical scaling: adding CPU/RAM in production, and changing the code to make use of it. With Elixir, nothing needs to be done to the code until you reach around 48+ cores. Someone else will have to answer how Go scales vertically.

Naturally adding cpu/ram in production would be easier with a container or virtualization.
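On the Go side of the vertical-scaling question, a minimal sketch may help. The Go runtime schedules goroutines across multiple OS threads, and `runtime.GOMAXPROCS(0)` reports how many may run Go code at once without changing the setting. The function `parallelSum` below is a hypothetical illustration, not something from this thread: it splits a slice into one chunk per available processor, so the same code uses more cores when they are added.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelSum is a hypothetical helper: it sums nums using one goroutine
// per processor reported by the runtime.
func parallelSum(nums []int) int {
	// GOMAXPROCS(0) only queries the current limit; it does not change it.
	workers := runtime.GOMAXPROCS(0)
	if workers > len(nums) {
		workers = len(nums)
	}
	if workers == 0 {
		return 0
	}

	partial := make([]int, workers) // each goroutine writes only its own slot
	chunk := (len(nums) + workers - 1) / workers

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		start := w * chunk
		end := start + chunk
		if end > len(nums) {
			end = len(nums)
		}
		if start >= end {
			continue // nothing left for this worker
		}
		wg.Add(1)
		go func(w, start, end int) {
			defer wg.Done()
			for _, n := range nums[start:end] {
				partial[w] += n
			}
		}(w, start, end)
	}
	wg.Wait()

	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	nums := make([]int, 1000)
	for i := range nums {
		nums[i] = i + 1
	}
	fmt.Println(parallelSum(nums)) // 500500
}
```

As with Elixir, this only helps when the work can actually be split; code that funnels everything through one goroutine will not speed up with more cores.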

Horizontal scaling can be trivial if the application does not need to coordinate. In which case either would be just as easy.

If application coordination is needed Elixir would require no changes if you originally wrote it to scale to 2 servers. This is because all the underlying coordination is handled by the BEAM. Not certain about Go.

lpil · April 14, 2019, 12:39pm

I wouldn’t say this is entirely true. It’s possible to write Elixir code that does not scale well vertically, for example by creating a bottleneck with a GenServer that holds state accessed by multiple other processes.
Writing an application that scales well vertically requires the programmer to learn the patterns and limitations of their programming language; neither Go nor Elixir has a silver bullet here.

Elixir does provide excellent tools for identifying and removing bottlenecks in your application, though.
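The same kind of bottleneck can be built in Go. The following is a minimal sketch, assuming a hypothetical counter: one goroutine owns the state and every caller goes through its channel, so updates are handled one at a time no matter how many cores are available. The names `request`, `startCounter` and `addFromMany` are illustrative only.

```go
package main

import (
	"fmt"
	"sync"
)

// request asks the owner goroutine to add delta and reply with the new total.
type request struct {
	delta int
	reply chan int
}

// startCounter launches the single goroutine that owns the counter.
// Every update is serialized through the returned channel, much like
// calls to one GenServer.
func startCounter() chan<- request {
	requests := make(chan request)
	go func() {
		total := 0
		for req := range requests {
			total += req.delta
			req.reply <- total
		}
	}()
	return requests
}

// addFromMany has `callers` goroutines each send one increment through the
// owner, then returns the final total.
func addFromMany(callers int) int {
	requests := startCounter()

	var wg sync.WaitGroup
	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			reply := make(chan int)
			requests <- request{delta: 1, reply: reply}
			<-reply // each caller waits its turn behind all the others
		}()
	}
	wg.Wait()

	// One last request with delta 0 reads the total.
	reply := make(chan int)
	requests <- request{delta: 0, reply: reply}
	total := <-reply
	close(requests)
	return total
}

func main() {
	fmt.Println(addFromMany(100)) // 100
}
```

The result is correct, but the callers queue up behind a single owner, which is the trade-off being discussed: serialized access buys consistency at the cost of parallelism.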

ferd · April 14, 2019, 1:48pm

I’m always confused by these questions. OTP does not replace Docker or Kubernetes, and K8s and Docker do not replace OTP. The same way neither of them replaces handling exceptions.

Kubernetes and Docker can be used to provide isolation and the ability to restart individual nodes when they fail, but they are not a replacement for isolation and fault handling within your own software.

OTP lets you handle specific types of faults through supervision trees, and relies on primitives such as linking and monitoring which Go does not provide in goroutines. As such, using Erlang or Elixir with OTP does give you additional flexibility that is not made available to Go.
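To make the gap concrete, here is a rough sketch of what approximating a monitor and a restart loop by hand might look like in Go. It is not an OTP equivalent, and `monitor` and `supervise` are hypothetical helpers rather than a standard API. An unrecovered panic in any goroutine ends the whole program, so the worker has to recover inside its own goroutine and report the outcome over a channel.

```go
package main

import (
	"errors"
	"fmt"
)

// monitor runs work in a new goroutine and returns a channel that receives
// exactly one value: nil on a normal return, or an error if work panicked.
func monitor(work func()) <-chan error {
	done := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("worker crashed: %v", r)
				return
			}
			done <- nil
		}()
		work()
	}()
	return done
}

// supervise reruns work until it finishes cleanly or maxRestarts is used up.
// It returns the number of restarts performed.
func supervise(work func(), maxRestarts int) (int, error) {
	restarts := 0
	for {
		err := <-monitor(work)
		if err == nil {
			return restarts, nil
		}
		if restarts == maxRestarts {
			return restarts, errors.New("restart limit reached: " + err.Error())
		}
		restarts++
	}
}

func main() {
	attempts := 0
	flaky := func() {
		attempts++
		if attempts < 3 {
			panic("boom")
		}
	}
	restarts, err := supervise(flaky, 5)
	fmt.Println(restarts, err) // 2 <nil>
}
```

Note what the sketch cannot do: the goroutines still share memory, so a crashed worker may leave shared state half-updated, and nothing stops a worker that hangs instead of panicking.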

But any place where you’d want containerisation and a scheduler to make use of all your hardware to run all of your Go nodes and restart them when they fail, well that can and would still be useful to Erlang and Elixir components.

You can plainly use OTP and run your BEAM instances within Docker, and orchestrate that with Kubernetes.

Whoever is trying to tell you that one replaces the other does not properly understand the comparison being made.

peerreynders · April 14, 2019, 3:00pm

While it’s often stated that way I think it’s an oversimplification of the real issue.

As far as I can tell an “architect” employing a Kubernetes type product would tend to drive Pod design towards smaller units of work, capable of clearly advertising failure to the surrounding environment so that K8s can take corrective action.

From that perspective I think there would be little inherent interest to have fault resilience and isolation on a lower level of granularity than a Pod. That type of mindset would see K8s as the “Grand Supervisor” orchestrating all the minion Pods.

tty · April 14, 2019, 3:13pm

This should be an architectural decision to trade temporal data integrity vs speed. When your GenServer holds state in this manner you are ensuring state modification is serialized. The typical design here is to have the GenServer accept the request then spawn a process to fulfill the request and potentially change its state.

The general design pattern in Elixir is to spawn processes, and this behaviour scales beautifully vertically.
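The accept-then-spawn shape described above can be sketched in Go as well. This is a simplified illustration with hypothetical names (`job`, `startDispatcher`, `processAll`): the dispatcher only accepts each request and immediately hands it to a fresh goroutine, so a slow request does not hold up the accept loop. It leaves out the part where the spawned worker may update the server's state.

```go
package main

import "fmt"

// job carries one request and a buffered channel for its reply.
type job struct {
	input int
	reply chan int
}

// startDispatcher accepts jobs on the returned channel and starts one
// goroutine per job, so the accept loop never waits on the work itself.
func startDispatcher(handle func(int) int) chan<- job {
	jobs := make(chan job)
	go func() {
		for j := range jobs {
			go func(j job) {
				j.reply <- handle(j.input)
			}(j)
		}
	}()
	return jobs
}

// processAll submits every input to the dispatcher and collects the
// replies in input order.
func processAll(inputs []int, handle func(int) int) []int {
	jobs := startDispatcher(handle)

	replies := make([]chan int, len(inputs))
	for i, in := range inputs {
		replies[i] = make(chan int, 1)
		jobs <- job{input: in, reply: replies[i]}
	}
	close(jobs)

	results := make([]int, len(inputs))
	for i, r := range replies {
		results[i] = <-r
	}
	return results
}

func main() {
	square := func(n int) int { return n * n }
	fmt.Println(processAll([]int{1, 2, 3, 4}, square)) // [1 4 9 16]
}
```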

ferd · April 14, 2019, 3:42pm

I stand by what I said. There is no reason why those need to replace each other. Different kinds of errors have different scopes. Nobody would get rid of links and monitors for failed HTTP requests by offshoring that to Kubernetes; the primitives still make sense, and do not negate any other level of fault handling.

peerreynders · April 14, 2019, 5:54pm

You’re preaching to the choir.

But to someone unfamiliar (or only vaguely familiar) with the BEAM looking at the options:

A) K8s + Docker + (Java, C#, Go, Python, or JavaScript)

B) K8s + Docker + (Erlang/Elixir with OTP)

will judge option (B) as more unfamiliar and complex and they won’t choose it unless they perceive some extremely compelling value over option (A).

So while the notion that K8s + Docker replaces the BEAM is bogus, there seems to be the persistent perception that there isn’t enough value added by using the (exotic/niche) BEAM in a K8s + Docker environment over other more mainstream alternatives.

I think the misunderstanding is based on K8s emphasizing highly available clusters - leading people to believe that “high availability” is already covered, so the BEAM’s contribution is no longer needed … I would go as far as hypothesising that the emergence of Docker and K8s has made it more difficult to convince anybody to adopt BEAM based systems.

keathley · April 14, 2019, 6:21pm

The problem is that if you’re using a single process to manage state because you want to serialize writes, then you need to do coordination across your other nodes. There are very few times where I’ve seen this lead to a better place than keeping your nodes stateless and scaling horizontally. Once you need to eke out more performance or take load off of your database, then you can start to get fancy with ETS caches and similar strategies.

Personally I think this is a contributing factor to why Elixir scales well. You can (and IMO should) start off by building stateless APIs and applications. That strategy can take you a really long way. Once you need to start decreasing latency or shedding load when your database backs up, you’ll be able to do so within your application code.

As to the actual question, the only real comparison is between Go and Elixir, since you can run either in Docker and K8s or on bare servers or wherever else you want. And in that case it’s almost certainly going to come down to your team and their knowledge of those runtimes and the problems you’re solving. My experience is that Elixir systems tend to scale very well with a limited amount of effort. This means that you can get by with fewer people and spend more time focusing on features and other business tasks. I haven’t worked on a Go project with similar scale so I can’t speak to that.

Laymer · April 14, 2019, 9:11pm

I am not deploying such systems at large scale so I will not discuss the pros and/or cons, but I would very much like to highlight a facet of the extremely compelling value of BEAM-enabled systems.

I believe that concurrency will be a very desirable property in the very near future, and Erlang/OTP’s unmatched capabilities, designed decades ago for telecom purposes, will be a compelling factor. By 2025, at least 50 billion devices are expected to be connected to the Internet, and infrastructures as in (A) will have extremely costly/complex scaling processes with increasing latency. On the other side, Erlang’s implementation of the actor model ensures that both vertical and horizontal scaling can be achieved through process communication.

But I do think that a measure of scalability cannot be made based on a comparison between the application software and the infrastructure that supports it. As @ferd very well explained, these are separate components of an architecture, and should not compete against but complement each other.

Best,

Igor

sribe · April 15, 2019, 3:24am

I’d say that they are sufficiently similar in that regard that you should probably make your choice based on other criteria. I have a huge focus on lines of code required for a task and maintainability, and I think Elixir beats the heck out of Go in those areas. Other teams might need to focus on availability of developers, or breadth of ecosystem, where Go generally wins.

lpil · April 15, 2019, 6:36am

100% with you there but when compared to Go the BEAM doesn’t have a clear advantage. Go’s goroutines are actually more lightweight than BEAM processes, using less memory and spawning faster, so Go will be able to get higher concurrency on the same hardware.

Laymer · April 15, 2019, 9:33am

I have never used Go, nor have I benchmarked the two against each other on those aspects, but naturally I imagine each one has its advantages and disadvantages in specific use cases.

sribe · April 15, 2019, 1:56pm

Process isolation is a HUGE advantage. In fact, it’s the reason you can’t implement a true OTP analog in Go, but can only replicate some of the bits and pieces.

On memory I believe you’ve actually got it backward, BEAM processes have less memory overhead. Spawning speed, I’m not clear.

lpil · April 15, 2019, 2:28pm

Ah yes! You are right. I’m not sure where I got that idea from, but upon checking the Golang documentation I am quite wrong. Thanks for the correction.

Totally with you on process isolation, it is the killer feature. If Erlang threads were more expensive I would still prefer them for that aspect alone.

peerreynders · April 15, 2019, 4:31pm

Which is why Go is gaining popularity. And with Go the mainstream doesn’t have to give up mutability or sharing state, maintaining their established habits - for better or worse.

Not being familiar with concepts such as links and monitors, they aren’t missed despite how controversial the use of context.Context may be (Context is for cancelation).
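For readers who have not used it, here is a minimal sketch of what `context.Context` provides. Cancellation is cooperative and flows one way: the parent calls `cancel`, and the worker has to check `ctx.Done()` itself. The `worker` and `runAndCancel` functions are hypothetical names for illustration, and the `done` channel is hand-rolled because the context carries no signal back to the parent if the worker fails.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker loops until ctx is cancelled, then reports why it stopped.
func worker(ctx context.Context, done chan<- error) {
	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			// The worker must notice the cancellation itself.
			done <- ctx.Err()
			return
		case <-ticker.C:
			// Placeholder for one unit of real work.
		}
	}
}

// runAndCancel starts a worker, cancels it shortly afterwards, and returns
// the error the worker observed.
func runAndCancel() error {
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan error, 1)
	go worker(ctx, done)

	time.Sleep(5 * time.Millisecond)
	cancel()
	return <-done
}

func main() {
	fmt.Println(runAndCancel()) // context canceled
}
```

A worker that never checks `ctx.Done()` simply keeps running, which is one reason this is not a substitute for links and monitors.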

On the other side, Erlang’s implementation of the actor model ensures that both vertical and horizontal scaling can be achieved through process communication.

I’m skeptical that using distributed Erlang across pod boundaries as a general practice is a good idea. There are likely certain scenarios where it makes sense to do so but it likely should not be the default approach. If that is the case then scaling for the most part is limited to the inside of a pod.

And architecturally speaking, communication between pods should be system communication, which may be implemented with process communication. Permitting any process to communicate with any other process, regardless of whether it is located inside the same pod or not, seems like a bad idea (moving towards a BBOM).

Likely the same forces at work as with Y Store now C++.

Process isolation is a HUGE advantage.

Sadly others view that as a hurdle.

The other matter is that the Erlang/OTP ecosystem was historically designed to operate on a largely static hardware infrastructure. K8s and cloud computing in general have moved, at least from a “consumer” perspective, towards a dynamic hardware infrastructure. This has created an environment that favours ephemeral, impermanent computing, i.e. a great deal of development is moving towards code that needs to spin up quickly, accomplish its singular objective and then exit.

While that sounds perfect for a short lived BEAM process, it doesn’t work that well in a provider/consumer scenario.

The Erlang/OTP ecosystem was developed for long running, resilient systems with high availability requirements. The current technology trend seems to drive towards “hit and run” transaction scripts that enable the “pay only for what you use” (rather than the “always on”) model.

dimitarvp · April 15, 2019, 4:53pm

Let me first clarify that I understand you are simply playing the devil’s advocate here.

I understand their aspiration but I have never seen that work in practice. I know many more sysadmins / devops people than programmers, and every single one I asked in the last year tells me that working with K8s to do dynamic hardware configuration is much harder and more error-prone than it should be. They all laugh and say this gives them long-term employment, but they are not sure how this is not better fixed by simply spinning up a new server and a load balancer on burst demand.

To me, the aspiration of “spin up as many hardware resources as you need, when you need them” is still mostly an aspiration with some reality behind it – a reality that is still trying to mature and is not an uncontested and factual reality just yet.

I will be happy to be proven wrong.

If their projects are of the microservices/lambdas kind and they are OK with the extra hosting cost then more power to them. But so many people have proven that AWS is making crazy money from everybody trying to use these arcane new cloud offerings. They are definitely not made with the buyer’s interest in mind so it’s kind of baffling that people persist in using them.

Only good reason would be – microservices / lambdas are much easier to maintain from programmers’ perspective. And that’s a very valid point of view. (Especially having in mind the eternal truth of “hardware is cheap, programmers are expensive”.)

Also valid, though I have to point out that this seems like companies who harvest a ton of personally identifying information and then cash out – but that might be my cynicism.

Producer/consumer scenarios can definitely be served very well by Erlang/Elixir though. Just not in the format of dynamically spawned containers.

Cynically speaking, I’d say that’s the top reason.

peerreynders · April 15, 2019, 5:44pm

Actually it’s more about me going WTHIGO.

Personally some time ago I had independently arrived at a conclusion similar to this:

But these days I have to admit that at the time I had a blindspot regarding the impact of Go/Docker/K8s/cloud-computing largely because I hate wasting my time on learning/knowing “products” (OSS or not) given how quickly that information goes stale.

a reality that is still trying to mature and is not an uncontested and factual reality just yet.

Agreed but the problem is that while businesses have to be economical in the long run they rarely need to be frugal. They’re often willing to spend money up front with just a promise of a greater return - because conceptually in their mind it makes sense. Often it’s a separate process to gather and analyse the evidence as to whether or not a practice is pulling its weight. Before long a practice is established and stays in place until it becomes necessary to cut costs.

it’s kind of baffling that people persist in using them.

See above.

but that might be my cynicism.

I was actually thinking more along the lines of the Firebase model (Call functions via HTTP requests) of building a web application.

As developers we often malign PHP but that doesn’t seem to impair its market share. FaaS is to computation what PHP is to web pages - so it could succeed and perhaps even dominate. I simply don’t know where Erlang/Elixir fits in a FaaS world - the likely answer is “elsewhere”.

dimitarvp · April 15, 2019, 6:29pm

Or you had a blind spot for hype-driven tech? That’s a good quality to have!

I’ve been struggling to understand this mindset my entire life and I still cannot; everything in my life has shown me that it NEVER is as simple as “spend XYZ money only once and reap the benefits for N years” – it simply never happens! Buy an expensive phone, it craps out 3 months later, you get a free replacement but still have to go a few days without a phone; get a new desktop machine, turns out its PSU needs extra watts to even start and so you can never turn it on and gotta go buy a UPS; buy business software for $1000 and then spend $100,000 to add customizations. Examples are everywhere and yet businessmen are just adamant in their persistence in believing that you only spend money on software once.

But that might be related to them viewing IT as a cost center and not as a profit center. Which is kind of weird because well-made software definitely enables profits… but what percentage of software is well-made? Not a whole lot. So that might explain their skepticism and the illusion that money on software must be spent only upfront. I am still not convinced I know the answers to this day.

PHP and WordPress boomed during exactly the right time in history – no other good options (a combo of a backend language + SQL + HTML/CSS templates) existed at the time so the tech choices made during the initial period of huge interest in “everybody has their own site and every business has a site” were bound to linger for decades. And that’s exactly what happened.

I don’t think PHP’s success should give us pause, or even humble us. PHP’s success was more like the story of many rich people – they mostly had luck, but their survivorship bias and filter bubble prevented them from seeing the truth, so they were convinced they did something right. Which oftentimes isn’t the only factor. Getting rich – or a technology becoming hugely popular – is much more correlated with being at the right place, at the right time.