Piggybacking a bit on @dvcrn’s topic “BEAM optimization for functions with static return type?”, I’ve been trying to understand in more depth how Elixir works internally to generate BEAM bytecode. After reading way too many blog posts, I’ve found a few things:
- People either think that Elixir compiles directly to Erlang source code (.erl)
- Or people think that Elixir compiles directly to BEAM bytecode (.beam)
Both of these assumptions seem to be wrong. From the Elixir/Erlang Crash Course from Elixir’s official webpage, we can see that:
Elixir compiles into BEAM byte code (via Erlang Abstract Format).
Steps from Elixir source code to BEAM bytecode
So it’s not compiled directly to Erlang source code, but it’s also not compiled directly to BEAM bytecode. It is first transformed into Erlang Abstract Format (EAF). Digging further into this topic, I found a couple of blog posts, in particular BEAM by Example, where the author tells us the following:
Intermediate representations:
Erlang source code → Abstract Syntax Tree (‘P’) → expanded AST (‘E’) → Core Erlang (‘to_core’) → BEAM byte-code
So Elixir is first transformed into one of these intermediate representations (the Abstract Syntax Tree or the expanded AST). The whole pipeline should look something like this:
Elixir → Erlang Abstract Format → Core Erlang → BEAM bytecode
Note
I’ve also seen one or two posts online talking about Elixir being transformed into Erlang Forms. I’ve got no idea if these “Erlang Forms” are the same as one of the steps above or if they are an entirely different thing.
Now we’ve got a few different cases:
- Elixir → EAF
This can be achieved through the :elixir Erlang module, which can be found here, like so:
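# :elixir is a private module, so names and return shapes may differ between Elixir versions
expr = quote do: 1 + 2
env = :elixir.env_for_eval([])
eaf = :elixir.quoted_to_erl(expr, env)
# => a tuple whose first element is the Erlang Abstract Format of the quoted expression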
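Since :elixir is private, another way to look at the EAF of real Elixir code is to read it back out of a compiled module. This is only a minimal sketch, and it assumes the module was compiled with debug info enabled (which I believe is the default). The module name EafDemo is made up for the example.

```elixir
# defmodule returns {:module, name, beam_binary, result_of_the_body}.
{:module, mod, beam, _} =
  defmodule EafDemo do
    def add, do: 1 + 2
  end

# The debug_info chunk holds a backend module plus data only that backend understands.
{:ok, {^mod, [debug_info: {:debug_info_v1, backend, data}]}} =
  :beam_lib.chunks(beam, [:debug_info])

# Ask the backend (:elixir_erl for Elixir modules) to produce Erlang abstract forms.
{:ok, forms} = backend.debug_info(:erlang_v1, mod, data, [])

IO.inspect(forms, limit: :infinity)
```

The result is a list of tuples such as {:attribute, ...} and {:function, ...}, which is the same shape the Erlang compiler works with after parsing an .erl file.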
- EAF → Core Erlang
- Core Erlang → BEAM bytecode
I haven’t found a way to achieve these two steps. The furthest I’ve got is that Erlang’s c/2 shell function can be used to produce the various formats:
c(<file_name>, <format>)
c("file.erl", 'P')
c("file.erl", 'E')
c("file.erl", to_core)
c("file.erl", to
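The same compiler options can apparently be used from Elixir on abstract forms instead of a file. Below is a rough sketch of the EAF → Core Erlang step. The forms are written by hand, the module name eaf_core_demo is made up, and :core_pp is a compiler-internal pretty printer, so treat this as an experiment rather than a supported API.

```elixir
# Hand-written Erlang Abstract Format for:
#   -module(eaf_core_demo).
#   -export([add/0]).
#   add() -> 1 + 2.
forms = [
  {:attribute, 1, :module, :eaf_core_demo},
  {:attribute, 1, :export, [add: 0]},
  {:function, 1, :add, 0,
   [{:clause, 1, [], [], [{:op, 1, :+, {:integer, 1, 1}, {:integer, 1, 2}}]}]}
]

# :to_core stops the compiler after the Core Erlang pass;
# :binary returns the result instead of writing a file.
{:ok, :eaf_core_demo, core} = :compile.forms(forms, [:to_core, :binary])

# Pretty-print the Core Erlang data structure as text.
core |> :core_pp.format() |> IO.puts()
```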
- BEAM bytecode → Disassemble
This can be done either with c("file.erl", 'S') or with :beam_disasm.file/1, which I believe produce the same thing, as far as I could find.
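To try the disassembly step without any files on disk, something like the following sketch seems to work, since :beam_disasm.file/1 also accepts a .beam binary. The module name eaf_disasm_demo and the hand-written forms are only for illustration, and :beam_disasm is not an officially documented module, so the tuple layout is an assumption based on its beam_file record.

```elixir
# Hand-written abstract forms for a module with a single add/0 function.
forms = [
  {:attribute, 1, :module, :eaf_disasm_demo},
  {:attribute, 1, :export, [add: 0]},
  {:function, 1, :add, 0,
   [{:clause, 1, [], [], [{:op, 1, :+, {:integer, 1, 1}, {:integer, 1, 2}}]}]}
]

# Compile the forms all the way to a .beam binary kept in memory.
{:ok, :eaf_disasm_demo, beam} = :compile.forms(forms, [])

# Disassemble the binary; the result is a beam_file record (a tagged tuple).
{:beam_file, mod, exports, _attributes, _compile_info, code} = :beam_disasm.file(beam)

IO.inspect(mod)
IO.inspect(exports)
IO.inspect(code, limit: :infinity)
```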
Example Gist
I’ve built this small Gist to better show the steps from an Erlang source code all the way to the disassembled bytecode.
Note
James Fish also spoke to me on Slack and told me to check out the :compile.forms/1 Erlang function. I don’t fully understand what this function actually does or returns. It seems to receive Erlang Abstract Format as an argument.
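From what I can tell from the compile module docs, :compile.forms/1 takes a list of abstract forms and returns the compiled module as a binary. Here is a small sketch to poke at it, using hand-written forms and a made-up module name (eaf_forms_demo):

```elixir
# Erlang Abstract Format for:
#   -module(eaf_forms_demo).
#   -export([add/0]).
#   add() -> 1 + 2.
forms = [
  {:attribute, 1, :module, :eaf_forms_demo},
  {:attribute, 1, :export, [add: 0]},
  {:function, 1, :add, 0,
   [{:clause, 1, [], [], [{:op, 1, :+, {:integer, 1, 1}, {:integer, 1, 2}}]}]}
]

# Returns {:ok, module_name, beam_binary}; nothing is written to disk.
{:ok, mod, beam} = :compile.forms(forms)

# Load the binary into the running VM; the file name is only used for reporting.
{:module, ^mod} = :code.load_binary(mod, ~c"eaf_forms_demo.erl", beam)

IO.inspect(mod.add())
# => 3
```

So it looks like this function covers everything from EAF down to BEAM bytecode in one call.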
Erlang docs are sparse and usually scattered all around. I’ve only managed to gather some info about this topic from several sources, but I’d like to better understand this process of Elixir → BEAM. I’ve watched dozens of Elixir talks, but I don’t recall ever seeing this explained.
I’m hoping someone around here has some further knowledge on this.