Elixir/Erlang and the Cloud Native Ecosystem

yordisprieto · June 15, 2020, 12:02am

Hey folks,

What are your thoughts on using Elixir/Erlang in Cloud Native systems? (Whatever that means; let’s use cncf.io as the definition for now. I hope you get what I am trying to say.)

Golang (predominantly), Java, and C++ dominate the ecosystem of k8s, cloud-native, and all those buzzwords of today (or honestly most software development nowadays).

I only found one project from cncf.io written in Elixir, and it wasn’t that well-maintained compared to the other ones.

It seems that the combination of Go CLIs for everything, a fast language, small binaries, and gRPC for interoperability makes a strong case for choosing such a toolkit for most projects, like the CNCF projects.

From my experience, Elixir projects are more focused on specialized solutions than on embracing interoperability and reusable components.

I would like to know your thoughts about why Elixir has not become widely adopted in those areas or ecosystems.

Thoughts?

gregvaughn · June 15, 2020, 12:54am

I want to yell BINGO! because my buzzword bingo card just got a full row. No, I’m not sure what you’re trying to say and I’m guessing at what “Cloud Native” implies.

Go, Java, C++ all share the status of being OO/imperative mutable languages. You’ve got a “Cloud Native” environment optimized for that type of language, yet you’re surprised that languages that do not fit that category do not fit in well? Do you also ask why Haskell, OCaml, Prolog, Forth, etc. are not well represented in your environment? Interoperability and reuse all require some common denominator of assumptions to work smoothly - the lowest common denominator. They discriminate against languages whose strength is something they never considered.

The Erlang VM and programming model, which Elixir adopts, was first developed in the late 80’s, and is effectively a progenitor to “Cloud Native”. They did it 30 years ago. Each GenServer is a microservice. Instead of JSON, they speak in the shared vocabulary of Erlang terms. They can be distributed among multiple physical servers or live within a single server with no change to the programming model. Instead of coding in YAML (via k8s) to manage restart policies, you use Elixir/Erlang – the same language as your primary business logic.

Elixir/Erlang represents a “cloud” within a single OS-level process. To wrap a single GenServer as a “microservice” managed by others is a step backward, IMHO.
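
For readers who have not seen OTP supervision, the restart-policy idea mentioned above can be sketched roughly as follows. This is a Python analogy only, not OTP's API: `supervise`, `make_flaky_worker`, and the `max_restarts` parameter are hypothetical names chosen for illustration. A real Elixir Supervisor also bounds restarts within a time window, which this sketch leaves out.

```python
from typing import Callable, Tuple


def supervise(worker: Callable[[], str], max_restarts: int = 3) -> Tuple[str, int]:
    """Run `worker`, restarting it after a crash, up to `max_restarts` times.

    Returns (result, restarts_used). Once the restart budget is exhausted,
    the last error is re-raised, loosely like a supervisor giving up.
    """
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            restarts += 1


def make_flaky_worker(failures: int) -> Callable[[], str]:
    """Hypothetical worker that crashes `failures` times, then succeeds."""
    remaining = {"n": failures}

    def worker() -> str:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise RuntimeError("worker crashed")
        return "ok"

    return worker


# The worker crashes twice and is restarted each time before succeeding.
result, restarts = supervise(make_flaky_worker(2), max_restarts=3)
```

The point of the comparison is that the restart policy lives in the same language as the business logic, rather than in a separate manifest.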

mpraski · June 15, 2020, 8:15am

Both Kubernetes and Elixir/Erlang facilitate building a service-based architecture. In the Kubernetes world, the “microservices” are separate OS processes communicating over REST/gRPC; in the Erlang VM, they are lightweight processes living in the same memory space and exchanging messages.

Both have their advantages and disadvantages. As @gregvaughn noted, Erlang precedes current “cloud native” solutions and implements some of their ideas, like process separation, supervision policies and rolling updates, out of the box. The BEAM also handles fault tolerance and distributed execution among multiple nodes, so we could say its functionality is a subset of that of Kubernetes, where you write your services in a BEAM language.

Kubernetes, on the other hand, supports more complex solutions like service meshes, simplified metrics import and cronjobs, but then again you may not need those if you run a homogeneous BEAM setup.

Kubernetes is language agnostic, so there is no reason why you wouldn’t run a dockerised Elixir application (I do this myself). You can send/receive gRPC traffic in Elixir; see https://github.com/elixir-grpc/grpc. Idle memory consumption is surely higher than Go’s due to VM memory allocation, but it’s probably not a deal breaker. Elixir in my mind is much more expressive than Go and lends itself perfectly to writing APIs or working with data transformations - common use cases of “microservices”.

yordisprieto · June 15, 2020, 1:32pm

I am more focused on why the adoption is low for tackling the problems that they are trying to tackle: distributed systems.
Since:

Right, so why would those big tech companies invest so much money in something that has been solved since the 80s?

Why wouldn’t they develop on top of Erlang VM if so?

Maybe a way to easily deploy tiny sections of the program, since it seems that most people just follow the Cloud Native way of dockerizing, shutting down the VM, and replacing it, instead of bothering with Erlang clustering, hot code replacement and so on.

Even the recommendation from leaders is to avoid the Erlang cluster and go with what people do today: docker, shut down, replace.

Right right, I completely understand, but since those protocols are Erlang protocols, you lose some interoperability, as I mentioned.

I don’t buy into services (microservices for others) for everything, but it feels really good to docker run keycloak and have a system that does really complex things and deals with the networking, in any language, with no lock-in to a language (I am looking at you, Java).

Or at least they are not developed in an “either you code it in Erlang or it’s the highway” kind of way.

Right right, totally understandable.

Right, that is what most people propose to do, and sometimes they even avoid Erlang clustering altogether unless they are dealing with things like Phoenix Channels and stuff like that.

I saw that package. I have never used it, but it is definitely something I would like to use, since the gRPC tooling is really useful.

Personally, I can’t take Go’s imperative style and complexity compared to Erlang/Elixir, but I love that the ecosystem embraces interoperability.

LostKobrakai · June 15, 2020, 1:42pm

There was a similar discussion here on the forums, where I tldr’ed the likely reason:

dimitarvp · June 15, 2020, 2:54pm

To unite the reasons @LostKobrakai enumerated before and my own thoughts:

Corporate inertia. It’s unbelievably hard to push for new technology like Elixir in bigger organisations – I experienced that first-hand many times during the last 5 months while casually looking for a new job.
People know tools X and Y, they don’t want to learn Z. It’s quite normal and natural. I am not judging them and I can see where they are coming from but they aren’t helping matters either.
Some scenarios require injecting custom code in an already running runtime and the BEAM isn’t really good at that (for which I am grateful to be honest; we have enough security holes everywhere as it is!).
The BEAM is meant for long-running server services. It’s really bad for one-off “start and exit” tasks. There are other languages that handle CLI / one-off programs much better. There’s no universally good language / runtime / framework.

Resume-driven development is a thing: people invest in a skill that’s more likely to get them hired next year. For these career-driven individuals, using old and proven tech is not viable in terms of future employability. Can’t blame them really but don’t put so much weight on their thought process; it’s entirely egotistic, hype-driven and is thus not objective.

This is an ideal theory that I’ve never seen work in practice. Practically everyone who knows Docker and K8s well is cursing them for using YAML as a programming language and not something like [a subset of] Lua. Maybe you can show me that I am wrong and that K8s clusters are the end-all be-all of distributed automatically scaling software? I’d love to be proven wrong, I am not sarcastic.

Don’t tackle Google-scale problems unless you are Google. Those “tiny sections of the program” eventually grow to 500_000 LoC monstrosities that have to work together with others like them, communicate with each other, recover from errors in any of the others that inevitably crash, have to deal with temporary network outages, have to deal with temporary storage volume unmounting, not to mention the general temporary loss of capacity while upgrading the cluster.

It feels nice to pretend that all those problems don’t exist – but they do, and trying to tackle them yourself (even if it is with K8s) – explodes your complexity and likelihood of bugs and errors.

I happen to believe that people use K8s many times where they really should use AWS Lambda, or AWS Batch, or Google Cloud Run, but that’s for another forum thread.

gregvaughn · June 15, 2020, 2:56pm

Simple. Because technical decisions are rarely made for purely technical reasons. Erlang “looks weird” and managers were/are afraid of finding developers to work in it. OTOH, some companies have invested very deeply in it and consider that a competitive advantage, so they don’t publicize it much.

The “most people” and “leaders” you are listening to are apparently different than the ones I have been listening to.

hauleth · June 15, 2020, 3:36pm

For the same reason we are still using coal-based energy plants instead of nuclear-based.

Not really. There is the concept of C Nodes in Erlang. Despite the name, such a node can be written in any language, so you have a defined protocol which you can use instead of, for example, gRPC, with service discovery (via EPMD).

But there is a problem - it hides the complexity of everything. It is nice as long as it works; when it stops working, well, good luck, you are mostly on your own. And gods forbid you have some special needs and your environment is at least slightly “non-standard” (i.e. not what the application developers are using). Then you really feel what the “law of leaky abstractions” is about.

yordisprieto · June 15, 2020, 4:52pm

What exactly is your argument? I don’t think I disagree with you, but at the same time that is not the point. Of course, it is a give-and-take situation; I don’t assume that everything works just fine.

Which companies are you referring to? Would you mind sharing a list of the companies that you know of?

I would like to know who you listen to since this recommendation comes from the very people that create Phoenix or Elixir.

Yeah, like I said before, Erlang/Elixir applications tend to be monolithic applications, which is totally fine, but I can see where this could become a problem in some organizations, and for those, we don’t have an answer; or we don’t have the necessary tools to mitigate the complexity.

I totally agree with your response.

mpraski · June 15, 2020, 4:59pm

YAML templates surely are tedious. You can use Helm to template those using Go’s template language. While it’s not ideal because you still have to write the templates first, it lets you leave the complexity at the template level and extract the “important stuff” to a more readable form.

Otherwise I agree with your points. Elixir/BEAM doesn’t get the recognition it deserves, at least as a tool for building distributed systems.

frumos · June 15, 2020, 7:23pm

This is exactly the question I have asked myself since I started to learn Elixir 1 year ago. And I really don’t understand why Elixir/Erlang/OTP are so drastically undervalued, given that it is so easy to build cloud apps on a bare cloud environment where you just need computation capacity and a distributed FS, with no need for any other services (mostly). I am working hard to build a couple of POCs to show the uniqueness of this gem to my leadership. When they are ready, I will try to push it as much as possible. I do hope it will get some attention and involvement in our org.

ityonemo · June 15, 2020, 10:42pm

Right, so why would those big tech companies invest so much money in something that has been solved since the 80s?

Well, also don’t forget that Sun literally spent billions of dollars marketing Java, which overshadowed any technical advantage that Erlang could have brought to the scene in the 90s. Google was originally a Java shop, and Kubernetes, while written in Go, is built on top of Java technology (https://youtu.be/4VNDjwzzKPo), and even if it weren’t, cargo culting Google is totally a thing in tech.

hauleth · June 16, 2020, 1:23pm

People really forgot about that. That is the reason why we have JavaScript instead of ECMAScript.

LostKobrakai · June 16, 2020, 1:34pm

Only if they were already involved with programming at that time. I guess there are many people who started their careers well after that time.

patrickdm · June 16, 2020, 3:15pm

I’ve heard from different sources that half the programmers today have less than 5 years of experience. So yes, that’s highly probable.

hauleth · June 16, 2020, 3:28pm

I wasn’t, and I still know and remember how it went.

lawik · June 20, 2020, 5:33pm

For things like cloud functions and such I think Lumen might give us some faster-to-start binaries and lighter code sizes. I’m fairly hopeful that Lumen will be a way to use BEAM languages where the BEAM is not the appropriate tool.

shanesveller · June 22, 2020, 2:41pm

Sure, but if you’ve already accepted WASM into your heart, why limit oneself to BEAM languages? Functions-as-a-service approaches in particular generally don’t need things like supervisor hierarchies or strong concurrency that are the differentiators of the languages/runtime - they just need to be fast, correct, and productive.

lawik · June 22, 2020, 3:08pm

You are probably right. Especially with current solutions.

So mostly because I enjoy working with BEAM languages. I’m also not certain that cloud functions don’t benefit from concurrency. And I don’t think we necessarily know what we want out on the “edge” yet. And I’m curious to see what Lumen can achieve there.

shanesveller · June 22, 2020, 5:25pm

In the most common implementations like AWS Lambda that are shared-nothing, where a single message/request means a single invocation, the implied concurrency for a given amount of message volume happens in another layer altogether, not in the function itself. That is what I meant originally. That said, some problem domains definitely still imply a certain amount of fan-out in the meaningful work, so that’s a point well-taken.
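
To make the distinction above concrete, here is a minimal Python sketch of a function-as-a-service style handler where one invocation handles one message but still fans out internally. The `handler(event, context)` shape follows the common Lambda-style convention; `fetch_part` and the `part_ids` event field are hypothetical stand-ins for real I/O-bound work, not anything from an actual service.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def fetch_part(part_id: int) -> int:
    """Hypothetical unit of work; stands in for an I/O-bound call."""
    return part_id * part_id


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Handle a single message, fanning the work out inside the invocation.

    Concurrency across messages is the platform's job (one invocation per
    message); only the fan-out within one message happens here.
    """
    part_ids: List[int] = event["part_ids"]
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(fetch_part, part_ids))
    return {"total": sum(results), "parts": results}


response = handler({"part_ids": [1, 2, 3]})
```

Whether this kind of in-function fan-out justifies a runtime with strong concurrency primitives is exactly the open question in the exchange above.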