Compilation and how to check the final executable code - beam file (inline, dead code, optimization)

trashyEx · September 30, 2022, 2:58pm

Hello!

I would like to inspect the final code that is executed, and I don’t know how to do it.

Question 1: Coming from C, what could be the nearest thing to inspect the assembly code generated by -O0 or -O2, for instance, in Elixir/Erlang?

It is ok if it is some sort of assembly or other erlang representation, but I would like to see the final optimizations performed (or the last thing we can get before jumping to the JIT compiler, etc).

Question 2: Does Mix perform any code optimization when doing MIX_ENV=prod mix compile compared with mix compile? The former builds in _build/prod and the latter in _build/dev, but when I decompile the final beam files I don’t see any differences. Could this be a problem with my decompilation tools, or are the optimizations not applied at the beam level?

Question 3: My purpose is to know if a code like this is inlined by default:

defmodule HwPhoenix.Optimizable do
  def public_single_call(keys) do
    nested_single_call(keys)
  end

  defp nested_single_call(keys) do
    hash_keys(keys)
  end

  defp hash_keys(keys) do
    keys |> Enum.map(&key_hash/1)
  end

  defp key_hash(key) do
    :erlang.phash2(key)
  end
end

If it is compiled with mix compile (I assume MIX_ENV=dev mix compile is exactly the same), I expect that these functions are probably not inlined, but with MIX_ENV=prod mix compile I would like some of them to be inlined, producing final code close to this representation:

defmodule HwPhoenix.Optimizable do
  def public_single_call(keys) do
    Enum.map(keys, &:erlang.phash2/1)
  end
  
  # Eg. if the next 2 were also public, also inlined here:
  #
  # def nested_single_call(keys) do
  #   Enum.map(keys, &:erlang.phash2/1)
  # end
  # 
  # def hash_keys(keys) do
  #   Enum.map(keys, &:erlang.phash2/1)
  # end
end

Is any function inlined by default under certain conditions?

Question 4: Can the compiler detect and eliminate dead code?

For instance, in the previous example, if we have 3 public functions def, but we only use one of them from other modules, the compiler may remove the rest. I suspect it is not done, but do we have any tool, compiler option or way to do that? Maybe the only option is to use a tool like dialyzer, providing suggestions, and we have to manually remove them.

Furthermore, does the Erlang runtime gain anything if we remove code? I assume less RAM consumption, but is there any gain in speed? I suspect it is a really minor optimization, and that we benefit only if we remove functions that have the same signature (name/arity), which makes the pattern matching operation cheaper. Is that correct?

Question 5: How can I check that inline is doing what I expect?

I have marked key_hash as inlined (and also added the other compiler option in case it has an effect), but I don’t know how to check that key_hash is actually inlined in the beam file.

defmodule HwPhoenix.Optimizable do
  def single_call(keys) do
    nested_single_call(keys)
  end

  def nested_single_call(keys) do
    hash_keys(keys)
  end

  def hash_keys(keys) do
    keys |> Enum.map(&key_hash/1)
  end

  @compile :inline_list_funcs
  @compile {:inline, key_hash: 1}
  defp key_hash(key) do
    :erlang.phash2(key)
  end
end

I tried three different ways to inspect the final code, and in all of them I can still see the key_hash function (which should be inlined). What am I doing wrong? What is the correct way to inspect the final code to be executed?

So, I created two projects with the same modules and code: one without the @compile flags, called normal, and the other, called inline, with the flags shown above. I needed two projects because I suspect some IDE extension has problems detecting that the code has changed (maybe related to debug_info), so I want to be sure each .beam file “points” to its proper .ex file (one normal, the other with compiler flags).

Via Visual Studio Code BEAMdasm

MIX_ENV=prod and MIX_ENV=dev generate the same output, but for this case I still compared dev, prod, and prod with inline (screenshots not preserved).

In all three cases the output looks the same, apart from the Compilation Info.

Via Jetbrains Elixir plugin

In any of the three ways (dev, prod, prod inline flags):

# Source code recreated from a .beam file by IntelliJ Elixir
defmodule HwPhoenix.Optimizable do

  # Functions

  def __info__(p0) do
    # body not decompiled
  end

  def module_info() do
    # body not decompiled
  end

  def module_info(p0) do
    # body not decompiled
  end

  def public_single_call(keys) do
    nested_single_call(keys)
  end

  # Private Functions

  defp unquote(:"-fun.key_hash/1-")(p0) do
    # body not decompiled
  end

  defp hash_keys(keys) do
    Enum.map(keys, &:key_hash/1)
  end

  defp key_hash(key) do
    :erlang.phash2(key)
  end

  defp nested_single_call(keys) do
    hash_keys(keys)
  end
end

Via decompile tool

Using decompile.

In the normal project (MIX_ENV=prod or MIX_ENV=dev generates the same):

%% MIX_ENV=prod mix decompile Elixir.HwPhoenix.Optimizable --to asm
Retrieving code for Elixir.HwPhoenix.Optimizable
-file("lib/hw_phoenix/optimizable.ex", 1).

-module('Elixir.HwPhoenix.Optimizable').

-compile([no_auto_import]).

-export(['__info__'/1, public_single_call/1]).

-spec '__info__'(attributes |
                 compile |
                 functions |
                 macros |
                 md5 |
                 exports_md5 |
                 module |
                 deprecated) -> any().

'__info__'(module) -> 'Elixir.HwPhoenix.Optimizable';
'__info__'(functions) -> [{public_single_call, 1}];
'__info__'(macros) -> [];
'__info__'(exports_md5) ->
    <<"gq\223p¸fGygo\d\222HÛò">>;
'__info__'(Key = attributes) ->
    erlang:get_module_info('Elixir.HwPhoenix.Optimizable',
                           Key);
'__info__'(Key = compile) ->
    erlang:get_module_info('Elixir.HwPhoenix.Optimizable',
                           Key);
'__info__'(Key = md5) ->
    erlang:get_module_info('Elixir.HwPhoenix.Optimizable',
                           Key);
'__info__'(deprecated) -> [].

hash_keys(_keys@1) ->
    'Elixir.Enum':map(_keys@1, fun key_hash/1).

key_hash(_key@1) -> erlang:phash2(_key@1).

nested_single_call(_keys@1) -> hash_keys(_keys@1).

public_single_call(_keys@1) ->
    nested_single_call(_keys@1).

In prod with inline:

%% MIX_ENV=prod mix decompile Elixir.HwPhoenix.Optimizable --to asm
Retrieving code for Elixir.HwPhoenix.Optimizable
-file("lib/hw_phoenix/optimizable.ex", 1).

-module('Elixir.HwPhoenix.Optimizable').

-compile([no_auto_import,
          inline_list_funcs,
          {inline, [{key_hash, 1}]}]).

-export(['__info__'/1, public_single_call/1]).

-spec '__info__'(attributes |
                 compile |
                 functions |
                 macros |
                 md5 |
                 exports_md5 |
                 module |
                 deprecated) -> any().

'__info__'(module) -> 'Elixir.HwPhoenix.Optimizable';
'__info__'(functions) -> [{public_single_call, 1}];
'__info__'(macros) -> [];
'__info__'(exports_md5) ->
    <<"gq\223p¸fGygo\d\222HÛò">>;
'__info__'(Key = attributes) ->
    erlang:get_module_info('Elixir.HwPhoenix.Optimizable',
                           Key);
'__info__'(Key = compile) ->
    erlang:get_module_info('Elixir.HwPhoenix.Optimizable',
                           Key);
'__info__'(Key = md5) ->
    erlang:get_module_info('Elixir.HwPhoenix.Optimizable',
                           Key);
'__info__'(deprecated) -> [].

hash_keys(_keys@1) ->
    'Elixir.Enum':map(_keys@1, fun key_hash/1).

key_hash(_key@1) -> erlang:phash2(_key@1).

nested_single_call(_keys@1) -> hash_keys(_keys@1).

public_single_call(_keys@1) ->
    nested_single_call(_keys@1).

Only differences:

$ diff -s prod.asm ../hw_phoenix2/prod.asm
6,8c6
< -compile([no_auto_import,
<           inline_list_funcs,
<           {inline, [{key_hash, 1}]}]).
---
> -compile([no_auto_import]).

The tool decompilerl produces exactly the same output as this tool.

Other tools: beam_disassemble:

I could not decompile since beam_disassemble does not work with recent versions of OTP (issue).

Other tools: decompiler:

Extracted from this gist.

#!/usr/bin/env escript
%% -*- mode: erlang -*-

main([BeamFile]) ->
    {ok,{_,[{abstract_code,{_,AC}}]}} = beam_lib:chunks(BeamFile,[abstract_code]),
        io:fwrite("~s~n", [erl_prettypr:format(erl_syntax:form_list(AC))]).

./decompiler.erl _build/dev/lib/hw_phoenix/ebin/Elixir.HwPhoenix.Optimizable.beam
escript: exception error: no match of right hand side value {error,beam_lib,
{missing_backend,
"_build/dev/lib/hw_phoenix/ebin/Elixir.HwPhoenix.Optimizable.beam",
elixir_erl}}
in function  erl_eval:expr/5 (erl_eval.erl, line 450)
in call from escript:eval_exprs/5 (escript.erl, line 869)
in call from erl_eval:local_func/6 (erl_eval.erl, line 572)
in call from escript:interpret/4 (escript.erl, line 780)
in call from escript:start/1 (escript.erl, line 277)
in call from init:start_em/1
in call from init:do_boot/3

Therefore, what is the correct way to inspect the final code to be executed, with the purpose of reviewing and understanding how different code optimizations are applied (inlining, code transformations, etc)? What is the de-facto methodology?

Judging by these examples, it seems that inlining is not reflected in the EAF, the expanded AST… or the beam files. But of course, I could be completely wrong.

Thank you in advance.

hst337 · September 30, 2022, 4:28pm

Question 1: Coming from C, what could be the nearest thing to inspect the assembly code generated by -O0 or -O2, for instance, in Elixir/Erlang?

Take a look at the erlc options in this script: eplaypen/priv/scripts/compile.sh at master · seriyps/eplaypen · GitHub. You can see that to compile something into one of the IRs, you pass an erlc option. To do this with Elixir code, you can use something like

ERL_COMPILER_OPTIONS="dkern" elixir x.ex

For more information about IRs you can refer to: The Erlang Runtime System
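
As a complement to the compiler-option approach above, the BEAM instructions stored in a compiled module can also be read back with :beam_disasm from OTP's compiler application. The sketch below is only an illustration: BeamInspect and Demo are hypothetical names, and the instruction listing it prints varies between OTP releases. Note that this shows the generic BEAM code stored in the .beam file, not the native code that the JIT produces at load time.

```elixir
defmodule BeamInspect do
  # Hypothetical helper: maps {name, arity} to the list of BEAM
  # instructions found in the given BEAM contents (a binary).
  def functions(beam_binary) when is_binary(beam_binary) do
    {:beam_file, _module, _exports, _attrs, _compile_info, code} =
      :beam_disasm.file(beam_binary)

    for {:function, name, arity, _entry, instructions} <- code, into: %{} do
      {{name, arity}, instructions}
    end
  end
end

# defmodule returns {:module, name, beam_binary, last_expression}.
{:module, _name, beam, _result} =
  defmodule Demo do
    def add_one(x), do: x + 1
  end

# Print the instructions of Demo.add_one/1, one per line.
beam
|> BeamInspect.functions()
|> Map.fetch!({:add_one, 1})
|> Enum.each(&IO.inspect/1)
```

For a module on disk, pass File.read!(path) to the helper. :beam_disasm.file/1 treats an Elixir string as BEAM contents rather than as a file name, so a path would have to be given as a charlist instead.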

Question 2: Does Mix perform any code optimization when doing MIX_ENV=prod mix compile compared with mix compile?

Only the optimizations that Erlang performs, plus protocol consolidation.

The former builds in _build/prod and the latter in _build/dev, but when I decompile the final beam files I don’t see any differences. Could this be a problem with my decompilation tools, or are the optimizations not applied at the beam level?

Yeah, they’re basically the same. All Erlang optimizations are lightweight, so both dev and prod use the full set of optimizations.
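
One way to check this for a given module is to compare the code checksum of the two builds with :beam_lib.md5/1, which, per its documentation, covers the code of the module but not the compilation date or other attributes. The following is a minimal sketch under assumptions: BeamCompare and Sample are hypothetical names, and compiling the same source twice stands in for the dev and prod builds.

```elixir
defmodule BeamCompare do
  # MD5 of the code in a BEAM binary, as computed by :beam_lib.md5/1.
  def code_md5(beam_binary) when is_binary(beam_binary) do
    {:ok, {_module, md5}} = :beam_lib.md5(beam_binary)
    md5
  end

  # True when both BEAM binaries carry the same code checksum.
  def same_code?(beam_a, beam_b), do: code_md5(beam_a) == code_md5(beam_b)
end

# Silence the "redefining module" warning for this demonstration.
Code.put_compiler_option(:ignore_module_conflict, true)

# Stand-in for two builds: compile the same (hypothetical) source twice.
source = "defmodule Sample do\n  def inc(x), do: x + 1\nend"
[{Sample, first_build}] = Code.compile_string(source)
[{Sample, second_build}] = Code.compile_string(source)

IO.inspect(BeamCompare.same_code?(first_build, second_build), label: "same code?")
```

For real builds, the two arguments would be File.read!/1 of the module's .beam file under _build/dev and under _build/prod.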

Question 3: My purpose is to know if a code like this is inlined by default:

Private functions are inlined in the module when their size is small enough. You can force inlining using the @compile {:inline, function: arity} module attribute. As I said, this works only for private functions.
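
A rough way to see whether a forced inline took effect is to list the functions that exist in the loaded module with module_info(:functions), which includes local functions as well as exported ones. In the sketch below, InlineDemo and double/1 are hypothetical names. If {:double, 1} is missing from the printed list, that suggests the call was inlined and the now-unused local function was dropped; the exact result may depend on the Elixir and OTP versions, so no expected output is given.

```elixir
defmodule InlineDemo do
  # Ask the compiler to inline the private function double/1.
  @compile {:inline, double: 1}

  def run(x), do: double(x) + 1

  defp double(x), do: x * 2
end

# Lists every function present in the loaded module, exported or local.
InlineDemo.module_info(:functions)
|> IO.inspect(label: "functions in loaded module")
```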

Question 4: Can the compiler detect and eliminate dead code?

It can, and it does, but the Erlang and Elixir runtime assumes that modules can be recompiled at runtime, so the compiler can’t perform optimizations across inter-module calls.

LostKobrakai · September 30, 2022, 4:33pm

Modules can also be added and removed at runtime. It’s not just recompiling existing ones.
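
The effect of replacing a module at runtime can be seen in a small sketch. Greeter and Caller are hypothetical modules: Caller is compiled once, yet its remote call picks up whichever version of Greeter is loaded when the call happens, which is why the compiler cannot safely inline or prune across module boundaries.

```elixir
# Silence the "redefining module" warning for this demonstration.
Code.put_compiler_option(:ignore_module_conflict, true)

Code.compile_string("defmodule Greeter do\n  def hello, do: :v1\nend")

defmodule Caller do
  # A remote call: resolved against whatever Greeter is loaded at call time.
  def greet, do: Greeter.hello()
end

before_reload = Caller.greet()

# Replace Greeter at runtime; Caller is not recompiled.
Code.compile_string("defmodule Greeter do\n  def hello, do: :v2\nend")
after_reload = Caller.greet()

IO.inspect({before_reload, after_reload})
# prints {:v1, :v2}
```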

hst337 · September 30, 2022, 4:38pm

Actually, if anyone is interested, I am working on an optimizing transpiler for Elixir right now. I have already hit a 40% performance improvement in my library Pathex and implemented some optimizations like Enum fusion, compile-time pre-evaluation, some peepholes (like case Map.get(map, key, default)), etc.

However, it is currently a work in progress, but I am planning to release it someday anyway.

dimitarvp · September 30, 2022, 6:45pm

Why not open-source it now if it’s already working?

Sounds like it’s very valuable.

hauleth · September 30, 2022, 7:12pm

No, it doesn’t do it for public functions, because it cannot, for a few reasons:

Due to late binding of Erlang functions and the ability to add or replace modules at runtime, it is not possible to be 100% sure that a given function is not, and will not be, used.
In addition to the above, there is no way to detect unused functions even when you do not use runtime-generated modules or IEx: dynamic function calls via apply/3, spawn/3, spawn_link/3, etc. prevent such analysis. For example, there is no way to detect that the start_link/1 function in your GenServers is used, because it is called fully dynamically.
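
The second point can be illustrated with a short sketch. Plugins.Upcase and Dispatcher are hypothetical modules: the module and function names only exist as strings until runtime, so no static analysis of the source could tell that Plugins.Upcase.run/1 is in use.

```elixir
defmodule Plugins.Upcase do
  # Never referenced by name anywhere else in the source.
  def run(text), do: String.upcase(text)
end

defmodule Dispatcher do
  # Module and function names arrive as runtime strings (e.g. from config).
  def call(plugin, function, args) do
    module = Module.concat(Plugins, plugin)
    apply(module, String.to_existing_atom(function), args)
  end
end

IO.inspect(Dispatcher.call("Upcase", "run", ["hello"]))
# prints "HELLO"
```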

hst337 · September 30, 2022, 7:23pm

That makes sense, but a compiler is a relatively big project that is only meaningful when all of its parts work correctly. I’ll make it public only once it compiles all projects correctly (provided those projects meet the requirements). I think I’ll make an official announcement with more details in a couple of months.

I just don’t want to release an alpha which will gain a reputation of some hacky and buggy project.

dimitarvp · September 30, 2022, 7:37pm

For what it’s worth it’s the same for Rust, at least for library crates (not for binaries i.e. commands you can run). If a library declares something public then the compiler doesn’t include it in the dead code analysis at all.

dimitarvp · September 30, 2022, 7:38pm

Fair. I’d go about this by putting only what works in the main branch and warning visitors that the project is under heavy development. However, that requires some extra work from you, so it’d be understandable if you are not willing to put it in.