Version 7.3.0
Thanks
- For reporting `StreamEx` is missing `nonNull` on some IDEs/versions
Changelog
v7.3.0
Enhancements
- #1015 - @KronicDeth
  - Add supplemental file editor for `.beam` files: “BEAM Chunks”. The decompiled binary file will continue to be shown on the default “Text” editor tab.
- #1021 - @KronicDeth
  - Support for elixir-lang/elixir@23c7542 (“Remove incorrect wording regarding test cases”)

    | Version | Struct | Started Event | Finished Event | `%ExUnit.Test{}` field |
    | --- | --- | --- | --- | --- |
    | `< 1.6.0` | `%ExUnit.TestCase{}` | `:case_started` | `:case_finished` | `case` |
    | `>= 1.6.0` | `%ExUnit.TestModule{}` | `:module_started` | `:module_finished` | `module` |

    Because Elixir 1.6.0 could not introduce a breaking change, the `< 1.6.0` events are still fired, but `resources/exunit/1.6.0/team_city_ex_unit_formatting.ex` will ignore them and only convert the `>= 1.6.0` events to the TeamCity events used in the JetBrains Test Runner UI.
- #1018 - Expose Declarations > Functions and Declarations > Macros in Color Settings - @dimcha
Bug Fixes
- #1019 - Don’t use `StreamEx` because support is inconsistent across IDEs - @KronicDeth
README.md Updates
`.beam` Files
`.beam` files are the compiled version of modules on the BEAM virtual machine used by Elixir and Erlang. They are the equivalent of `.class` files in Java.
`.beam` files are not detected purely by their file extension: the BEAM file format starts with a magic number, `FOR1`, that is checked for before decompiling.
`.beam` files have 2 editors registered: decompiled Text and BEAM Chunks.
Decompression
If the `.beam` module was compiled with the `compressed` compiler directive, which in Erlang looks like
-compile([compressed])
and in Elixir looks like
@compile [:compressed]
then the outer file format is GZip (which is detected by checking for the gzip magic number, `1f 8b`, at the start of the file) and the `.beam` will be (stream) decompressed before the `.beam` header is checked and the chunks decoded.
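The two magic-number checks described above can be sketched in Elixir. This is only an illustration, not the plugin's implementation: the module and function names (`BeamSniff`, `unwrap/1`) are hypothetical, and it decompresses the whole binary in memory with `:zlib.gunzip/1` rather than streaming.

```elixir
defmodule BeamSniff do
  @moduledoc false
  # Hypothetical helper; names are illustrative and not taken from the plugin.

  # gzip streams begin with the magic bytes 0x1F 0x8B.
  def gzip?(<<0x1F, 0x8B, _rest::binary>>), do: true
  def gzip?(_other), do: false

  # An uncompressed .beam begins with the magic number "FOR1".
  def beam?(<<"FOR1", _rest::binary>>), do: true
  def beam?(_other), do: false

  # Gunzip when the outer format is gzip, then check the BEAM magic number.
  def unwrap(bytes) when is_binary(bytes) do
    inner = if gzip?(bytes), do: :zlib.gunzip(bytes), else: bytes
    if beam?(inner), do: {:ok, inner}, else: {:error, :not_beam}
  end
end
```

Calling `BeamSniff.unwrap/1` on the contents of a compressed or uncompressed `.beam` file should return `{:ok, bytes}` with the uncompressed form in both cases.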
BEAM Chunks
`.beam` files are composed of binary chunks. Each chunk is formatted as follows:

| Offset | +0 | +1 | +2 | +3 |
| --- | --- | --- | --- | --- |
| 0 | Name (ASCII Characters) | | | |
| 4 | Length (`unsigned-big-integer`) | | | |
| 8+ | Chunk-Specific | | | |

This format is generically referred to as Type-Length-Value.
The BEAM Chunks editor tab is subdivided into further tabs, one for each chunk in the `.beam` file.
The tabs are listed in the order that the chunks occur in the .beam file.
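As a rough sketch of how the Type-Length-Value layout can be walked, the Elixir below splits an uncompressed `.beam` binary into `{name, data}` pairs in file order. Two details are not stated in the text above and are assumptions of this sketch: the file header is `FOR1`, a 4-byte size, then `BEAM`, and each chunk's data is padded to a 4-byte boundary. `BeamChunks` is a hypothetical module name.

```elixir
defmodule BeamChunks do
  @moduledoc false
  # Hypothetical helper, not the plugin's parser.

  # Assumed file header: "FOR1", a 4-byte big-endian size, then "BEAM".
  def parse(<<"FOR1", _size::unsigned-big-integer-size(32), "BEAM", rest::binary>>) do
    {:ok, split(rest, [])}
  end

  def parse(_other), do: {:error, :not_beam}

  defp split(<<>>, acc), do: Enum.reverse(acc)

  # Each chunk: 4 ASCII name bytes, a 4-byte big-endian length, then the data.
  defp split(<<name::binary-size(4), len::unsigned-big-integer-size(32), rest::binary>>, acc) do
    # Assumption: data is padded so the next chunk starts on a 4-byte boundary.
    padding = rem(4 - rem(len, 4), 4)
    <<data::binary-size(len), _pad::binary-size(padding), rest::binary>> = rest
    split(rest, [{name, data} | acc])
  end
end
```

The resulting list keeps the chunks in the order they occur in the file, which is the same order the tabs use.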
`Atom` / `AtU8`
The `Atom` chunk holds LATIN-1 encoded atoms while the `AtU8` chunk holds UTF-8 atoms. There will only be one of these atom-related chunks in any given `.beam` file. `AtU8` was introduced in OTP 20 and is used in newer versions of OTP that support UTF-8 atoms.
Format
| Offset | +0 | +1 | +2 | +3 |
| --- | --- | --- | --- | --- |
| 0 | atom count (`unsigned-big-integer`) | | | |
| 4 | length1 (`unsigned-byte`) | bytes (for length1) | | |
| 4+length1+...+lengthn-1 | lengthn (`unsigned-byte`) | bytes (for lengthn) | | |
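The layout in the table can be decoded with a short recursive match. This is a minimal sketch that assumes exactly the format shown (a 4-byte count followed by length-prefixed names); other OTP releases may encode the chunk differently. `AtomChunk` is a hypothetical name, and the names are returned as strings rather than atoms so that no atoms are created while decoding.

```elixir
defmodule AtomChunk do
  @moduledoc false
  # Hypothetical decoder for the layout in the table above.

  # encoding is :utf8 for AtU8 data or :latin1 for Atom data.
  def parse(<<count::unsigned-big-integer-size(32), rest::binary>>, encoding \\ :utf8) do
    entries(rest, count, encoding, [])
  end

  defp entries(_rest, 0, _encoding, acc), do: Enum.reverse(acc)

  # One unsigned length byte, then that many bytes of atom name.
  defp entries(<<len::unsigned-integer-size(8), bytes::binary-size(len), rest::binary>>, remaining, encoding, acc) do
    text = :unicode.characters_to_binary(bytes, encoding, :utf8)
    entries(rest, remaining - 1, encoding, [text | acc])
  end
end
```

Pairing the result with `Enum.with_index(names, 1)` gives 1-based indexes.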
Tab
The `Atom`/`AtU8` tab shows a table with the columns

| Column | Description | Source |
| --- | --- | --- |
| Index | 1-based to match Erlang convention. In the `Code` chunk, `atom(0)` is reserved to always translate to `nil` | Derived |
| Byte Count | The byte count for the atom’s bytes | Raw |
| Characters | From decoding the bytes as LATIN-1 for the `Atom` chunk or UTF-8 for the `AtU8` chunk | Derived |
Attr
The `Attr` chunk holds the module attributes, but only those that are persisted. Erlang module attributes are persisted by default, but in Elixir module attributes need to be marked as persisted with `Module.register_attribute/3`.
Format
The `Attr` chunk uses External Term Format (`term_to_binary`’s output) to encode a proplist, which is similar to, but not quite the same as, an Elixir Keyword list.
All modules will have a `:vsn` attribute that is either set explicitly or defaults to the MD5 of the module.
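To illustrate the difference between persisted and non-persisted attributes, the sketch below defines a hypothetical module (`PersistDemo`, with made-up attribute names) and reads the attributes back through `module_info/1`, which is a runtime view of the module's attributes rather than a direct read of the chunk.

```elixir
defmodule PersistDemo do
  # Only attributes registered with persist: true are kept in the compiled module.
  Module.register_attribute(__MODULE__, :owner, persist: true)
  @owner :docs_team

  # A plain attribute is only available at compile time.
  @scratch :not_persisted
  def scratch, do: @scratch
end

attributes = PersistDemo.module_info(:attributes)

# Persisted values come back wrapped in a list.
IO.inspect(Keyword.get(attributes, :owner))
# The non-persisted attribute is absent.
IO.inspect(Keyword.get(attributes, :scratch))
# :vsn is present even though it was never set explicitly.
IO.inspect(Keyword.has_key?(attributes, :vsn))
```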
Tab
The `Attr` tab shows a table with the columns

| Column | Description | Source |
| --- | --- | --- |
| Key | Attribute name | Raw |
| Value | Attribute value. Note: The value always appears as a list as read from the binary format. I don’t know why. | Raw |
CInf
The `CInf` chunk is the Compilation Information for the Erlang or Core Erlang compiler. Even Elixir modules have it because Elixir code passes through this part of the Core Erlang compiler.
Format
The `CInf` chunk uses External Term Format (`term_to_binary`’s output) to encode a proplist, which is similar to, but not quite the same as, an Elixir Keyword list.
Tab
The `CInf` tab shows a table with the columns

| Column | Description | Source |
| --- | --- | --- |
| Key | Option name | Raw |
| Value | Inspected value | Raw |
Code
The `Code` chunk contains the byte code for the module.
Format
It is encoded in BEAM Compact Term Encoding, which differs from the binary format produced by `term_to_binary`.
Tab
The `Code` tab shows a read-only editor with one byte code operation on each line. For ease of reading, operations are grouped by function and then label block, with indentation indicating scope.
By default, as many references to other chunks and to other parts of the `Code` chunk as possible are inlined to ease understanding. If you want to see the raw byte code operations, you can turn off the various inliners.
####### Controls
| Control | On | Off |
| --- | --- | --- |
| Inline Atoms | `atom(0)` is inlined as `nil`; `atom(n)` looks up index `n` in the `Atom`/`AtU8` chunk and inlines its `inspect`ed version | `atom(N)` if "Inline Integers" is Off; `N` if "Inline Integers" is On and the argument supports "Inline Integers" |
| Inline Functions | `literal(n)` looks up index `n` in the `FunT` chunk and inlines the name if the argument supports "Inline Functions" | `literal(n)` if "Inline Integers" is Off; `n` if "Inline Integers" is On and the argument supports "Inline Integers" |
| Inline Imports | `literal(n)` looks up index `n` in `ImpT` and inlines it as a function reference, `&module.name/arity`, if the argument supports "Inline Functions" | `literal(n)` if "Inline Integers" is Off; `n` if "Inline Integers" is On and the argument supports "Inline Integers" |
| Inline Integers | `atom(n)` and `literal(n)` inline as `n` if the argument supports "Inline Integers"; `integer(n)` inlines as `n` | `atom(n)`, `integer(n)`, and `literal(n)` |
| Inline Labels | `label(n)` inlines as `n` if the argument supports "Inline Labels" | `label(n)` |
| Inline Lines | `line(literal(n))` looks up index `n` in the "Line Reference" table in the `Lines` chunk. The Line Reference contains a file name index and line. The file name index is looked up in the "File Name" table in the `Lines` chunk. The line from the Line Reference and the file name from the "File Name" table are inlined as `line(file_name: file_name, line: line)`. | `line` operations are left as is |
| Inline Literals | `literal(n)` looks up index `n` in the `LitT` chunk and inlines its `inspect`ed version if the argument supports "Inline Literals" | `literal(n)` |
| Inline Local Calls | `label(n)` finds `label(n)` in the `Code` chunk, then searches back for the previous `func_info` operation, then inlines it as a function reference, `&module.name/arity`, if the argument supports "Inline Local Calls" | `label(n)` |
| Inline Strings | If the operation supports it, the `bit_length` and `byte_offset` arguments are used to look up the CharList value in the `StrT` chunk, which is inlined as the value of the string argument | `bit_length` and `byte_offset` arguments are left as is |
| Show Argument Names | Adds keyword argument names before each argument value | Leaves values as positional arguments |
If any of the inliners are incorrect or you have an argument name that makes more sense, please open an issue.
Dbgi
The `Dbgi` chunk contains Debug Info. It was introduced in OTP 20 as a replacement for the `Abst` chunk. While the `Abst` chunk was required to contain the Erlang AST, the `Dbgi` format can contain the debug info for other languages, such as Elixir `quoted` form AST.
Format
Because the format is language neutral, it is a set of nested, versioned formats. The outermost layer is
{:debug_info_v1, backend, metadata | :none}
For `:debug_info_v1`, Elixir’s `backend` is `:elixir_erl`. The `metadata` for `:elixir_erl` is further versioned: `{:elixir_v1, map, specs}`.
`map` contains the bulk of the data.
| Key | Value |
| --- | --- |
| `:attributes` | Attributes similar to the `Attr` chunk, but at the Elixir, instead of Core Erlang, level. Usually they match, with the exception that `:attributes` doesn’t contain `vsn` when `Attr` contains the MD5 version. |
| `:compile_opts` | Compilation options similar to the `CInf` chunk’s `options` key, but at the Elixir, instead of Core Erlang, level. |
| `:definitions` | The Elixir quoted AST for each function clause. |
| `:file` | The name of the file the module was generated from. |
| `:line` | The line in `:file` where the module was defined, such as the line where `defmodule` occurred. |
| `:module` | The name of the module as an atom. |
| `:unreachable` | Unreachable functions. |
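The nested, versioned shape can be explored from Elixir with Erlang's `:beam_lib.chunks/2`, which accepts either a path or the `.beam` binary. The following is a sketch under assumptions: `DbgiReader` and `DbgiDemo` are hypothetical names, the module is compiled in memory with `Code.compile_string/1`, and debug info is assumed to be enabled (otherwise the metadata is `:none` or missing). Only the three single-value keys are read, since the rest of the map may vary between Elixir versions.

```elixir
defmodule DbgiReader do
  @moduledoc false
  # Hypothetical helper: summarizes the Dbgi chunk of a .beam path or binary.
  def summary(beam) do
    {:ok, {_module, [debug_info: debug_info]}} = :beam_lib.chunks(beam, [:debug_info])

    case debug_info do
      # Elixir's backend with its own versioned metadata.
      {:debug_info_v1, backend, {:elixir_v1, map, _specs}} ->
        {:ok, backend, Map.take(map, [:module, :file, :line])}

      # Another backend, or metadata of :none.
      {:debug_info_v1, backend, _other_or_none} ->
        {:no_elixir_metadata, backend}

      # No debug info chunk at all.
      other ->
        {:unexpected, other}
    end
  end
end

[{_module, beam}] =
  Code.compile_string("""
  defmodule DbgiDemo do
    def greet(name), do: "Hello, " <> name
  end
  """)

IO.inspect(DbgiReader.summary(beam))
```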
Tab
The `Dbgi` tab shows the single-value map entries: `:file`, `:line`, and `:module`.
For the multi-value keys, `:attributes`, `:compile_opts`, and `:definitions`, there are individual tabs.
####### Attributes
The Attributes tab has the same format as the `Attr` chunk.
####### Compile Options
The Compile Options tab is usually empty, much like the `CInf` `options` key for Erlang.
####### Definitions
The Definitions tab is split between a tree of Module, Function/Arity and clauses.
Clicking on a clause will show only that clause, but clicking on a higher level in the tree will show all clauses in the function or the entire Module.
The AST stored in `:definitions` and the process of converting it back to code are not format preserving, so the rendered code will not look precisely like the source code: the AST has undergone some macro expansion before it is put in the `Dbgi` chunk. As common idioms are understood, reversals will be added to the renderer.
ExDc
The `ExDc` chunk stores ExDoc. Not the rendered HTML from the `ex_doc` package, but the `@doc`, `@moduledoc`, and `@typedoc` attribute values that work even without `ex_doc` installed. This chunk is what is consulted when the `h` helper is used in `iex`.
Format
The `ExDc` chunk is encoded with `term_to_binary`. The term format is versioned as `{version, versioned_format}`. The current `version` tag is `:elixir_docs_v1` and the `versioned_format` is a `Keyword.t` with keys matching the `Code.get_docs/2` tags `:callback_docs`, `:docs`, `:moduledoc`, and `:type_docs`.
Tab
Like `Dbgi`, the `ExDc` tab is split between a tree to navigate and an editor to show the decompiled value.
Clicking on a node in the tree will show all docs at that level and any descendants.

| Node | Description |
| --- | --- |
| Root | All docs |
| Module | `@moduledoc` |
| Types | All `@typedoc`s |
| Types child | A specific `@typedoc` |
| Callbacks | All `@callback` `@doc`s |
| Callbacks child | A specific `@callback`’s `@doc` |
| Functions/Macros | All `@doc`s for functions/macros |
| Functions/Macros child | A specific function/macro’s `@doc` |
ExpT
The `ExpT` chunk is the Export Table. The name “Export” derives from the Erlang module attribute `-export`, which is used to “export” functions from a module. It is the equivalent of making a function or macro public with `def` and `defmacro` as opposed to making it private with `defp` and `defmacrop` in Elixir.
Format
The BEAM format and the `ExpT` chunk, being made for Erlang, have no concept of macros. They only understand functions, so Elixir macros, like `__using__/1` called by `use`, are compiled to plain Erlang functions with `MACRO-` prefixed to their name and an extra argument (the `__CALLER__` environment) as the first argument, which increases the arity, yielding a full MFA of `MACRO-__using__/2`.
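The `MACRO-` naming can be observed at runtime without opening the chunk. In this small sketch, `MacroExportDemo` is a hypothetical module; `module_info(:exports)` reports the Erlang-level view, while `__info__(:macros)` reports the Elixir-level view of the same macro.

```elixir
defmodule MacroExportDemo do
  defmacro double(expr) do
    quote do: unquote(expr) * 2
  end

  def triple(value), do: value * 3
end

# Erlang-level exports: the macro is a plain function with one extra argument.
IO.inspect({:"MACRO-double", 2} in MacroExportDemo.module_info(:exports))

# Elixir-level view: the macro keeps its declared name and arity.
IO.inspect(MacroExportDemo.__info__(:macros))
```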
Tab
The `ExpT` tab shows a table with the columns

| Column | Description | Source |
| --- | --- | --- |
| Atom Index | Index into the `Atom` or `AtU8` chunk for the function’s name | Raw |
| Name | The atom referenced by “Atom Index” | Derived |
| Arity | The arity (argument count) of the function | Raw |
| Label | Label index in the `Code` chunk where the function is defined. This label is usually immediately after the `func_info` operation and before the first pattern match or guard operation. | Raw |
ImpT
The `ImpT` chunk is the Import Table. It DOES NOT encode just the Erlang `-import` attributes or Elixir `import` macro calls: it tracks any external function or macro called from another module. `call_ext_*` operations in the `Code` chunk don’t store the Module and Function (MF) of the function they will call directly in the bytecode; instead, one of the arguments is an index into the `ImpT` chunk. This way, all external calls are normalized into the `ImpT` chunk instead of being denormalized to the call site. The arity still appears at the call site to help with checking the argument count.
Format
Even BIFs such as `erlang.byte_size/1` are included in the table. This is because BIFs are referenced by MFA and not by a pre-assigned number, as would be the case for system calls in operating systems like Linux. BEAM is like an Operating System, but not in all ways.
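One way to see the import table for a module is Erlang's `:beam_lib.chunks/2` with the `:imports` chunk name, which returns `{module, function, arity}` tuples. The sketch below compiles a hypothetical `ImportDemo` module in memory; neither function uses `import`, yet both the BIF and the remote call should show up.

```elixir
[{_module, beam}] =
  Code.compile_string("""
  defmodule ImportDemo do
    # A BIF call and an ordinary remote call, neither brought in with import.
    def size(binary), do: byte_size(binary)
    def shout(text), do: String.upcase(text)
  end
  """)

{:ok, {ImportDemo, [imports: imports]}} = :beam_lib.chunks(beam, [:imports])

# Each entry is an MFA tuple.
IO.inspect({:erlang, :byte_size, 1} in imports)
IO.inspect({String, :upcase, 1} in imports)
```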
Tab
The `ImpT` tab shows a table with the columns

| Column | Description | Source |
| --- | --- | --- |
| Index | 0-based index used by references in the `Code` chunk. | Derived |
| Module Atom Index | Index into the `Atom` or `AtU8` chunk for the Module’s name | Raw |
| Module Atom | The atom referenced by “Module Atom Index”. | Derived |
| Function Atom Index | Index into the `Atom` or `AtU8` chunk for the function’s name | Raw |
| Function Atom | The atom referenced by “Function Atom Index”. | Derived |
LitT
The `LitT` chunk contains literals loaded as arguments in the `Code` chunk.
Format
Confusingly, in the `Code` chunk the `literal(N)` term is sometimes used to encode the integer `N`, sometimes an index into another chunk, and sometimes an actual index into `LitT`. How `literal` terms are handled is completely dependent on the specific operation, so without outside knowledge about the bytecode operation arguments for BEAM, the best way to figure out whether a `literal` term is an integer or an index is to toggle the various controls in the `Code` tab and see if the `literal` with no inlining turns into a `LitT` literal, `FunT` function reference, `ImpT` function reference, or integer.
Tab
The `LitT` tab shows a table with the columns

| Column | Description | Source |
| --- | --- | --- |
| # | 0-based index used by references in the `Code` chunk. | Derived |
| Term | The equivalent of `raw \|> binary_to_term()` | |
Line
The `Line` chunk encodes both the file name and line number for each `line(literal(n))` operation in the `Code` chunk. The `n` in `line(literal(n))` is an index into the Line References table in the `Line` chunk. This is used in Phoenix view modules to show where code from templates comes from.
Format
The `Line` chunk is composed of 2 subsections: (1) Line References and (2) File Names. First there is a header setting up the number of each entry to expect.

| Offset | +0 | +1 | +2 | +3 |
| --- | --- | --- | --- | --- |
| 0 | emulator version (`unsigned-big-integer`) | | | |
| 4 | flags (`unsigned-big-integer`) | | | |
| 8 | Line Instruction Count (`unsigned-big-integer`) | | | |
| 12 | Line Reference Count (`unsigned-big-integer`) | | | |
| 16 | File Name Count (`unsigned-big-integer`) | | | |
####### Line References
This uses the Compact Term Format used for the `Code` chunk. The format ends up producing `{file_name_index, line}` pairs using the following algorithm:

| Term | Interpretation |
| --- | --- |
| `atom(n)` | Change `file_name_index` to `n` |
| `integer(n)` | Add `{file_name_index, n}` to end of Line References |
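The algorithm in the table amounts to a fold over the decoded terms. The sketch below assumes the compact terms have already been decoded into `{:atom, n}` and `{:integer, n}` tuples (a hypothetical shape chosen for illustration) and that `file_name_index` starts at 0, which the table does not state. `LineReferences` is a hypothetical name.

```elixir
defmodule LineReferences do
  @moduledoc false
  # Hypothetical helper: builds {file_name_index, line} pairs from decoded terms.
  def build(terms) do
    {_last_index, references} =
      Enum.reduce(terms, {0, []}, fn
        # atom(n): switch the current file name index.
        {:atom, n}, {_current, refs} -> {n, refs}
        # integer(n): record a line under the current file name index.
        {:integer, line}, {current, refs} -> {current, [{current, line} | refs]}
      end)

    Enum.reverse(references)
  end
end
```

For example, the terms `integer(3)`, `integer(7)`, `atom(1)`, `integer(2)` would produce two references for file name index 0 followed by one for file name index 1.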
####### File Names
| Offset | +0 | +1 | +2 | +3 |
| --- | --- | --- | --- | --- |
| 0 | Byte Count (`unsigned-big-integer`) | | Bytes | |
Tab
The `Line` tab has one subtab for each subsection in the chunk. Each subsection has its own table.
LocT
The `LocT` chunk is the dual to the `ExpT` chunk: it contains all private functions and macros.
Format
Entries like `-__struct__/1-fun-0-` start with `-` and have a `/` suffix with `fun` in it. This naming scheme is used for anonymous functions such as those defined with `fn` or the capture operator (`&`) in Elixir. Much like how macros don’t really exist and use a `MACRO-` prefix, anonymous functions/lambdas don’t exist, and instead use a distinct naming scheme `-<PARENT_FUNCTION>/*fun*`. Unlike `MACRO-`, which is an Elixir invention, compiling anonymous functions/lambdas to local named functions with derived names is also done in pure Erlang modules. Erlang’s anonymous functions are defined with `fun`, which is where the `fun` part of the naming scheme comes from.
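The derived names can be listed with Erlang's `:beam_lib.chunks/2` and the `:locals` chunk name. This sketch compiles a hypothetical `LocalDemo` module in memory; the exact suffix of the generated name is chosen by the compiler, so the check below only looks at the prefix derived from the parent function.

```elixir
[{_module, beam}] =
  Code.compile_string("""
  defmodule LocalDemo do
    def scale(list, factor), do: Enum.map(list, fn x -> bump(x) * factor end)
    defp bump(x), do: x + 1
  end
  """)

{:ok, {LocalDemo, [locals: locals]}} = :beam_lib.chunks(beam, [:locals])

# The private function appears under its own name.
IO.inspect({:bump, 1} in locals)

# The fn inside scale/2 appears as a local function named after its parent.
lambda_names =
  for {name, _arity} <- locals,
      String.starts_with?(Atom.to_string(name), "-scale/2-fun-"),
      do: name

IO.inspect(lambda_names)
```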
Tab
The `LocT` tab shows a table with the columns

| Column | Description | Source |
| --- | --- | --- |
| Atom Index | Index into the `Atom` or `AtU8` chunk for the function’s name | Raw |
| Name | The atom referenced by “Atom Index” | Derived |
| Arity | The arity (argument count) of the function | Raw |
| Label | Label index in the `Code` chunk where the function is defined. This label is usually immediately after the `func_info` operation and before the first pattern match or guard operation. | Raw |
StrT
The `StrT` chunk contains all Erlang strings (that is, Elixir charlists) used in the `Code` chunk.
Format
The `StrT` chunk contains a single contiguous pool. These strings are used for byte code operations like `bs_put_string`. Not all strings appear in `StrT`. Some strings, including most Elixir strings (Erlang binaries), appear in the `LitT` chunk that holds literals. I’m not sure how the compiler determines whether to use `StrT` or `LitT`. I think it all depends on the byte code operation.
Instead of encoding the start and length of each string in the chunk itself, the start and length for any given string is passed as arguments to the byte code operations in the `Code` chunk. By doing this, shared substrings can be efficiently encoded in `StrT`.
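The idea of a pool addressed from the call site can be shown with a tiny sketch. `StringPool` and the sample pool are hypothetical, and the length is given in bytes for simplicity; the point is only that two operations can refer to overlapping slices of the same pool without the pool storing any boundaries.

```elixir
defmodule StringPool do
  @moduledoc false
  # The offset and length come from the operation's arguments, not from the pool.
  def fetch(pool, byte_offset, byte_length) do
    binary_part(pool, byte_offset, byte_length)
  end
end

pool = "hello world"

# Two different "strings" sharing the same bytes in the pool.
IO.inspect(StringPool.fetch(pool, 0, 11))
IO.inspect(StringPool.fetch(pool, 0, 5))
IO.inspect(StringPool.fetch(pool, 6, 5))
```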
Tab
Installation Instructions