Onor.io

Joe Armstrong's "Systems that run forever self-heal and scale" (and my review of it)

My Review

A Review Of:

“Systems that run forever self-heal and scale” by Joe Armstrong (Lambda Jam 2013)

I am a fan of Joe Armstrong. I wish he were still here to share his wisdom (he passed away in 2019). Luckily, he left us a great book and several great talks that help explain how he built such excellent, industrial-strength software.

It’s not an exaggeration to say that Erlang and the BEAM (the purpose-built VM that Erlang runs on) have been used to create real-world systems that have run for literally years without downtime. Read that again. YEARS. Anyone who’s ever gotten the “holy crud, the system’s down” call at 2 am will appreciate the wonder of having systems that run for years without tinkering and without failing.

So how did Erlang achieve this miracle? Did they write code that they burned into an EPROM, and all it does is print “Hello world” trillions of times in an infinite loop? Nope. Erlang was created to run phone switches. If Erlang fails, you can’t make phone calls. And Erlang is often credited with running a large share, by some accounts more than 50%, of the phone equipment in the world.

So how did they pull off this seeming software development miracle? That’s what Joe Armstrong discusses in this talk. The thing that really appeals to me (and why I was and am a fan of his work) is that he didn’t approach the problem from a theoretical, abstract CS angle. He (and Robert Virding and Mike Williams, the co-creators of Erlang) had a practical problem to solve, and they created Erlang in order to solve it. They didn’t set out to build some world-beating technology; they set out to solve a difficult engineering problem and just happened to build one of the best-engineered solutions I’ve ever seen.

So what are some of the use cases that would lead us to use distributed systems? There are a couple that come to mind without thinking long and hard:

LLMs: Training LLMs is hardware intensive. If I can deploy 100 machines to train an LLM, then I can have the trained model that much faster. If I can deploy 1000 machines, it could be up to 10 times faster again, at least in the ideal case where the work divides cleanly across machines.
Uptime: As Armstrong points out, there’s never been a time when Google was taken down for a software upgrade. Since 1998 there’s never been a time when I went to look for something on Google and saw the message “Google is down for maintenance; come back later.” Distributed computing lets you keep things up and running because you don’t have to take the whole system offline in order to upgrade machines.

In sum, this talk isn’t a simple step-by-step “How to build your own distributed system” talk, but it is a great introduction to the subject: why it’s tough and what we can do to make it easier. Plus, Joe was a great speaker who was very good at explaining a pretty tough concept.

“Systems that run forever self-heal and scale” by Joe Armstrong

Author’s note: This review was published in my January “Impractical Engineer” newsletter. If you’d like to see the whole newsletter, it’s here: The Impractical Engineer. I hope I can be forgiven for plugging my newsletter, but this does seem like a talk that Elixir folks might be interested in seeing.

#erlang #scaling #self-healing #joe-armstrong

2026-01-02 19:30:05 UTC

Most Liked

rvirding

Creator of Erlang

Yes, we were out to solve a problem! What we found later was that the properties of the problem we were trying to solve are much more general than we had realised. I still remember when I heard that someone had programmed a webserver in Erlang, thinking: why would anyone want to program a webserver in Erlang? Well, of course the answer is that they have many very similar requirements!

Post #2

Onor.io

I have a very deep-seated admiration for the work the three of you did! It’s a major feat of engineering, and it was, and still is, impressive.

Post #3
