Flame – rethinking serverless

Seems like this one is not here yet:

Imagine if we could auto scale simply by wrapping any existing app code in a function and have that block of code run in a temporary copy of the app.

Enter the FLAME pattern.

FLAME - Fleeting Lambda Application for Modular Execution

With FLAME, you treat your entire application as a lambda, where modular parts can be executed on short-lived infrastructure.
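For a concrete feel, here is a minimal sketch of the pattern as the FLAME docs describe it: a pool is started in the application's supervision tree, and `FLAME.call/2` runs a closure on a short-lived runner. The pool name `MyApp.SamplePool` and the sizing values here are illustrative, not from the announcement:

```elixir
# In your application's supervision tree (lib/my_app/application.ex):
children = [
  {FLAME.Pool,
   name: MyApp.SamplePool,  # illustrative pool name
   min: 0,                  # scale to zero when idle
   max: 10,                 # cap on concurrent runners
   max_concurrency: 5}      # concurrent calls per runner
]

# Anywhere in your app: the closure and the variables it captures are
# shipped to a temporary copy of the app and executed there.
FLAME.call(MyApp.SamplePool, fn ->
  :heavy_work_result
end)
```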

Check the screencast to see it in action:


Shout out to @chrismccord @jeregrine @mcrumm @seanmor5 for this awesome work, and their ability to build hype! :smirk:


HN: Rethinking serverless with FLAME | Hacker News

Lobsters: Rethinking Serverless with FLAME | Lobsters


Please don’t thank me, I just submitted a quick doc fix this morning and didn’t do any of the hard stuff :smiley: This was all @chrismccord @jeregrine and @mcrumm


Great work, absolutely love the fact that infrastructure is abstracted away!

I used Erlang's erpc on a few projects; however, the biggest problem with that was always having to maintain two separate codebases and their contracts.
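For comparison, a bare `:erpc` call looks like this. Unlike FLAME, the target node (the hypothetical `:"worker@host"`) must already be running code that exposes a compatible `MyApp.Worker.process/1`, which is exactly the two-codebase contract problem mentioned above:

```elixir
# Plain Erlang RPC: run MyApp.Worker.process/1 on another node.
# The remote node must already have this module loaded, so its
# deployment and API have to be kept in sync with the caller's.
result = :erpc.call(:"worker@host", MyApp.Worker, :process, [job_args])
```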


This looks lit! :fire: I am sure this simplifies certain use cases.

Regarding CPU-intensive workloads, how does it compare against vertical auto-scaling? Unlike horizontal auto-scaling, where you end up running multiple unneeded instances of your web servers, with vertical auto-scaling resource allocation is more granular, and the allotted CPU is used by the process that needs it. And unlike FLAME, we don't have to run another instance of our application or set up distribution. Theoretically, we should be able to provide enough CPU based on actual usage (up to the machine limit).

On a minor note, if the workload spawns an OS process (like ffmpeg in the example), then we can go even more granular and cover the safety aspect by limiting the CPU using cgroups so the web server always stays up.
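As a rough illustration of that idea (assuming a Linux host with systemd available; the 50% quota is arbitrary), the ffmpeg invocation could be wrapped in a transient cgroup scope so a runaway encode cannot starve the web server:

```elixir
# Hypothetical: run ffmpeg under a transient systemd scope with a CPU cap,
# so the BEAM (and the web server it hosts) keeps getting scheduled.
args = ["-i", input_path, "-vf", "fps=1/5", "out/%02d.png"]
System.cmd("systemd-run", ["--user", "--scope", "-p", "CPUQuota=50%", "ffmpeg" | args])
```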

That said, I can see that doing auto-scaling and cgroups can be complex, it does not address all the use cases FLAME covers, and the developer experience and tooling will probably be much better with FLAME. I just want to hear others' thoughts on such approaches, in case I am missing some details.

Thinking about cgroups and FLAME, I guess they can work together too. We could create a FLAME backend adapter that starts applications locally with safety guarantees (CPU/memory limits) set using cgroups. And it could be made to work with or without Erlang distribution.

It may be cool to see an Oban integration, where you could configure job queues to run in FLAME pools?
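Nothing stops doing this by hand today; a sketch (the worker module and the pool name `MyApp.MediaPool` are hypothetical, not an existing integration) would just wrap the job body in `FLAME.call/2`:

```elixir
defmodule MyApp.ThumbnailWorker do
  use Oban.Worker, queue: :media

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"video_id" => id}}) do
    # The job is picked up on a regular node, but the heavy lifting
    # runs on a short-lived FLAME runner (hypothetical pool name).
    FLAME.call(MyApp.MediaPool, fn ->
      MyApp.Media.generate_thumbnails(id)
    end)
  end
end
```

A real integration could presumably map queues to pools in configuration, but the manual wrapper above already gets the ephemeral-worker behaviour.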


I appreciate the tag, but if you check the contributions you'll note that FLAME is almost entirely Chris; as of this morning the rest of us all have one commit each :smiley:


Firstly, this is cool as hell, congrats to everyone that worked on it.

Secondly I have a problem that this kind of looks like it might help with however I have a stumbling block.

I have an application where I want to call bits of it remotely, much as FLAME seems designed for, but the method is orchestrated via an Elixir script file, which acts like a DSL for the whole thing.

I'd also need to customise the image that actually runs remotely; say I wanted one FLAME job to run on Alpine and another on Ubuntu.

I need to think about this.

def generate_thumbnails(%Video{} = vid, interval) do
  parent_stream = File.stream!(vid.filepath, [], 2048)
  FLAME.call(MyApp.FFMpegRunner, fn ->
    # Copy the video from the parent node to the runner, 2048 bytes at a time
    tmp_file = Path.join(System.tmp_dir!(), Ecto.UUID.generate())
    flame_stream = File.stream!(tmp_file)
    Enum.into(parent_stream, flame_stream)

    # Extract one thumbnail every `interval` seconds into a scratch dir
    tmp = Path.join(System.tmp_dir!(), Ecto.UUID.generate())
    File.mkdir!(tmp)
    args =
      ["-i", tmp_file, "-vf", "fps=1/#{interval}", "#{tmp}/%02d.png"]
    System.cmd("ffmpeg", args)
    urls = VidStore.put_thumbnails(vid, Path.wildcard(tmp <> "/*.png"))
    Repo.insert_all(Thumb, Enum.map(urls, &%{vid_id: vid.id, url: &1}))
  end)
end

If File.stream! returns a struct with a path key containing the path to the file on the local machine, how can Enum.into(parent_stream, flame_stream) start a stream and read from that file from the remote machine?

Otherwise, this looks really nice! Amazing even.

Not to be confused with a small project I've had for seven years already called "Flames" on Hex, which is a sort of simplistic version of an error-aggregation service.

This project reminded me that it was time to publish my liveview rewrite I’ve been running off a branch for the past year.


File I/O on the BEAM is process based. Streaming here will Just Work™ and chunk 2048 bytes at a time. It's just part of the BEAM and Elixir's File interface, so you'll need to go spelunking if you want the implementation details :slight_smile:

It really blows your mind just how many ridiculous things we get for free like this


I’d be interested to see a Membrane video + AI + Fly + FLAME demo app

if I am reading this correctly, I'm also interested in how existing monoliths could FLAME out expensive services like video or AI

some folks with bandwidth and commercial skin in the game could benefit the Elixir/Phoenix ecosystem by doing a side-by-side $ comparison of FLAME on Fly vs Lambda on AWS etc

with the tech landscape facing budget constraints right now, a focus on ROI would serve FLAME well

I really like the just-in-time or on demand or drop in Lambda, the ability to make your monolith fragment into ephemeral microservices

certainly another genuine game changer from @chrismccord


Cost benefits will come back to the flagfall metering model for on-demand workloads vs a commitment tier.

At the very low end it may be beneficial, but at some point you will be better off with a commitment tier, as that always provides the largest discount because cloud service providers are guaranteed ROI when you commit. There may be some benefit to using demand-based capacity between increments in the number of nodes on a commitment tier, and to handle spikes that exceed any reserve capacity in your commitments.

Where FLAME is different is that you can delegate specialised work to a pool that scales to zero without a lot of complexity so your primary nodes continue serving the non-specialised work.

This also affords targeting work to run on specialised VMs, with very little change to the code to achieve a very different architecture.
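That scale-to-zero delegation is just pool configuration; for example (all option values here are illustrative), a dedicated pool can sit at zero runners until specialised work arrives and shut back down when idle:

```elixir
# A specialised pool that costs nothing while idle: runners are booted
# on demand and terminated after a period of inactivity.
{FLAME.Pool,
 name: MyApp.EncoderPool,                 # illustrative name
 min: 0,                                  # scales to zero
 max: 8,
 max_concurrency: 2,
 idle_shutdown_after: :timer.minutes(1)}
```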

  @doc false
  def __open__(%File.Stream{path: path, node: node}, modes) when node == node() do
    :file.open(path, modes)
  end

  @doc false
  def __open__(%File.Stream{path: path, node: node}, modes) do
    :erpc.call(node, :file_io_server, :start, [self(), path, List.delete(modes, :raw)])
  end
Alright, the node is stored in the stream struct and the file stream implementation will just ask the right node to do the work.

Indeed this is so simple and Just Works™ :smiley:


This is great. It reminds me of the Python decorator function created by Fal.ai that runs functions on different infrastructure with GPUs attached.

You can also do that with Fly GPU machines, where your app runs on a regular machine and FLAME runs on a GPU machine. For example:

{FLAME.Pool,
 name: BBRunner,
 backend: {FLAME.FlyBackend,
   gpu_kind: "a100-pcie-40gb", cpu_kind: "performance", cpus: 8, memory_mb: 20480}},

You’ll need to consider model load time on cold start, which could be 10–20s to get things loaded into memory for sizable models. Dynamically provisioning any kind of ML setup pays this price, but it's something to keep in mind. You’ll also want to bake the model/XLA/Bumblebee artifacts into your build step for as long as is feasible, so all the cached artifacts are in the Docker container without needing to pull them from elsewhere.


According to @seanmor5 JAX has a way now to cache compilations, so we may be able to optimize the cold load time as well :slight_smile:

Just out of curiosity: did you manage to run a 7B model (like Mistral) via Bumblebee right within a Phoenix application and chat with it? I mean, "just like" a dependency? I am still a bit unsure whether this stuff will "just work" when deploying to Fly as usual.

Yes. This is not using FLAME, but here’s llama2-13b running on Elixir/Bumblebee on a Fly GPU: https://gpubee.fly.dev/