BEAM optimization for functions with static return type?

Hello fellow Developers!

We had a question come up over in the Elixir Slack about what the BEAM / Elixir is doing under the hood for constructs like this:

def foo do
    %{foo: "bar"}
end

or the short version

def foo, do: %{foo: "bar"}

This is very commonly used to define constants, and I am sure there is some optimization going on under the hood so the map doesn’t have to get created on each function call, but can someone give a little insight into what is actually happening?

3 Likes

The same value - a compiled constant - will be used each time. We can see this is indeed the case by inspecting the BEAM bytecode.

Let’s prepare an example module:

defmodule Test do
  def foo, do: %{foo: "bar"}
end

Save it as test.ex and compile it with elixirc test.ex - the output will be the compiled module Elixir.Test.beam. We can disassemble the compiled module into BEAM assembly with :beam_disasm.file/1.
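For example, from iex started in the directory containing the compiled file, a minimal sketch of the call looks like this:

# Returns a :beam_file tuple; its last element holds the
# disassembled code for every function in the module.
:beam_disasm.file('Elixir.Test.beam')

The output contains the section that interests us the most - the foo/0 function: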

{:function, :foo, 0, 7,
   [{:line, 1}, 
    {:label, 6}, 
    {:func_info, {:atom, Test}, {:atom, :foo}, 0},
    {:label, 7}, 
    {:move, {:literal, %{foo: "bar"}}, {:x, 0}}, 
    :return]
 }

We have a function foo/0 that starts at label 7. The whole body of the function consists of moving a literal value into the X0 register (which is the return register on the BEAM - the calling function expects the result in exactly that location) and returning. So we can see that a literal value is used.

But what does “a literal value” mean? Each .beam file consists of several “chunks” that represent various things - the compiled code itself (the Code chunk), public (exported) functions (ExpT), the atoms used in that module (Atom), etc. One of those chunks is the literals chunk, called LitT. When the module is loaded, the chunk is unpacked and all the terms in it are placed somewhere in memory - they are constants that can be referenced multiple times (and since we know they don’t change and won’t go away, they are skipped during garbage collection).
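A quick way to see which chunks a given .beam file contains is :beam_lib.info/1 - a small sketch (the exact chunk list varies between OTP versions):

# Lists the module name plus the chunks present in the file
# (e.g. 'AtU8'/'Atom', 'Code', 'ExpT', 'LitT', ...) together with
# their offsets and sizes.
:beam_lib.info('Elixir.Test.beam')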

With the help of a function like the one below, we can list all the literals from that chunk:

defmodule Literals do
  def literals(module) when is_atom(module) do
    path = :code.which(module)

    # The LitT chunk starts with a 4-byte uncompressed size,
    # followed by the zlib-compressed literal table.
    {:ok, {^module, [{'LitT', <<_size::4*8, compressed::binary>>}]}} =
      :beam_lib.chunks(path, ['LitT'])

    # The uncompressed table starts with a 4-byte literal count,
    # followed by length-prefixed terms in external term format.
    <<_count::4*8, records::binary>> = :zlib.uncompress(compressed)
    unpack(records)
  end

  defp unpack(<<>>), do: []

  defp unpack(<<len::4*8, record::binary-size(len), rest::binary>>) do
    [:erlang.binary_to_term(record) | unpack(rest)]
  end
end

For our Test module, Literals.literals(Test) gives [[foo: 0], %{foo: "bar"}] - which is the list of exported functions (used in the __info__(:functions) call) and our literal map.

13 Likes

@michalmuskala any idea why I’m getting this error when trying to use :beam_disasm.file/1?

iex(1)> :beam_disasm.file("Elixir.Test.beam")
{:error, :beam_lib, {:not_a_beam_file, "Elixir.Test.beam"}}

I’ve compiled the file with elixirc:

defmodule Test do
  def foo() do
    IO.puts "bar"
  end
end

:beam_disasm.file/1 (like most :beam_* functions) requires the path as a charlist (an Erlang string), not a binary.
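For example, with a charlist the same call goes through - the result (elided here) is a :beam_file tuple holding the disassembled module:

iex(1)> :beam_disasm.file('Elixir.Test.beam')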

1 Like

That is indeed some awesome insight into disassembly and the BEAM. I’m not sure it addresses the full question though - or maybe I’m just missing something. The first part of the question is:

def map_a do
  %{foo: "1"} 
end

And this maps to the constant-literal case covered above. But the question asks not just about the constant literal, but also about the following code:

def map_b do
  Map.merge(map_a(), %{bar: "2"})
end

So will the “literal” optimization extend to this call to Map.merge/2?

I extended @michalmuskala’s example to this:

defmodule Test do
  def foo, do: %{foo: "bar"}
  def baz do
    Map.merge(foo(), %{huh: "wha"})
  end
end

And here is the relevant disassembly:

{:function, :baz, 0, 7,
   [{:line, 1}, {:label, 6},
    {:func_info, {:atom, Test}, {:atom, :baz}, 0}, {:label, 7},
    {:allocate, 0, 0}, {:line, 2}, {:call, 0, {Test, :foo, 0}},
    {:move, {:literal, %{huh: "wha"}}, {:x, 1}}, {:line, 2},
    {:call_ext_last, 2, {:extfunc, :maps, :merge, 2}, 0}]},

So it looks like it’s not doing any optimization to “get rid of” the Map.merge/2 call. Every time baz/0 is called, it will merge the two constant maps and produce a new map.

I’ve never seen the BEAM optimize across module boundaries, because modules can be hot-swapped and thus the implementation could change.

That does not mean the JIT doesn’t optimize those calls (I haven’t checked), but at compile time it certainly would not, because of hot-code swapping.

1 Like

How do you really know what Map.merge/2 does at runtime? The dynamic code handling means you can never safely inline a call to a function in another module, so it is not done. Inlining local functions is done, but it is not always problem-free. IIRC the talk by Lukas Larsson at EUC mentioned this in conjunction with tracing.
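As an aside, local inlining can also be requested explicitly via a compile attribute. A minimal sketch (names purely illustrative) - the trade-off being that an inlined private function no longer exists as a separate entity, so it can no longer be traced on its own:

defmodule Example do
  # Ask the compiler to inline the local helper double/1.
  @compile {:inline, double: 1}

  def call(x), do: double(x) + 1

  defp double(x), do: x * 2
end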

1 Like

Yes, I believe I see what you are both saying. I think this confirms my thoughts on the OP’s question about any optimization of the Map.merge/2 portion. This means the merging code is going to run every time the map_b function is called, yes?

Here is a more precise definition of his use case:

mapping_fields = %{ 
  "id" => :id,
  "user" => %{
    "name" => :user_name,
    "email" => :user_email
  },
  "project" => %{
    "name" => :project_name
  },
  "created_at" => :inserted_at
  # many other fields
}

filterable_fields = Enum.filter(mapping_fields, fn {_key, value} ->
  Enum.member?([:id, :user_email, :inserted_at], value)
end) |> Enum.into(%{})

insertable_fields = Enum.filter(mapping_fields, fn {_key, value} ->
  Enum.member?([:user_email, :user_name, :project_name], value)
end) |> Enum.into(%{})

He is currently generating filterable_fields and insertable_fields on each request and is looking to optimize, but first he’s checking whether optimization is even necessary. With the clarification of the internals, it sounds like those Enum-generated maps are going to be rebuilt every time the code that produces them runs.

I still think that there is a compile-time solution for him with macros, but I am afraid that I’m not proficient enough at macros to provide the answer.

A macro could indeed do it yes. :slight_smile:

1 Like

Unless of course you expand to a function call, which is done at run-time. :wink:

2 Likes

Lol, true, though you could do the Map.merge in the macro and return the output as a quoted value. ^.^
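For instance, a minimal sketch of that idea (module name and map contents are just illustrative):

defmodule Constants do
  # The merge runs at expansion time (i.e. while the caller is being
  # compiled); Macro.escape/1 turns the resulting map into AST so it
  # is injected into the caller as a plain literal.
  defmacro merged_map do
    %{foo: "1"}
    |> Map.merge(%{bar: "2"})
    |> Macro.escape()
  end
end

A caller that does require Constants can then call Constants.merged_map() and it compiles down to the literal %{bar: "2", foo: "1"}.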

1 Like

Lol for sure :laughing:
I was going to try the macro at lunchtime here

1 Like

This is the OP who posted the question in Slack :slight_smile:

I’d appreciate examples on how to write a macro for this case, as well as any best practices on how to store constants generated by arbitrary code. I heard about using ETS, but it seems overkill for a constant which will never change after initial computation.

I’m coming from a Ruby/Rails background, and there, of course, I could just do FILTERABLE_FIELDS = any_arbitrary_code() at the class/module level. I could also use cattr_accessor or a class instance variable, which seems similar to Elixir’s module attributes - but again, in Ruby you can run arbitrary code to compute and assign to a class/module attribute.

1 Like

Something like this might do it (untested):

defmodule Foo do
  @mapping_fields %{
    "id" => :id,
    "user" => %{
      "name" => :user_name,
      "email" => :user_email
    },
    "project" => %{
      "name" => :project_name
    },
    "created_at" => :inserted_at
    # many other fields
  }

  @filterable_fields Enum.filter(@mapping_fields, fn {_key, value} ->
    Enum.member?([:id, :user_email, :inserted_at], value)
  end) |> Enum.into(%{})

  @insertable_fields Enum.filter(@mapping_fields, fn {_key, value} ->
    Enum.member?([:user_email, :user_name, :project_name], value)
  end) |> Enum.into(%{})

  defp filterable_fields(), do: @filterable_fields
  defp insertable_fields(), do: @insertable_fields

  # ... use filterable_fields() and insertable_fields() in other functions in this module
end

2 Likes

Glad you joined here! :wave:

@sasajuric Yes! I had my head fixated on macros from my initial misframing of the problem, and I didn’t realize that you could just evaluate regular function calls in a module attribute. This would be the “simple” aspect I think @eugene is referring to. I also had to adjust the map filtering, since the Enum.member? call doesn’t work for the nested maps.

defmodule MapConstants.Helper do
  # Recursively keep only the entries whose atom values appear in
  # passing_values; nested maps are filtered the same way and dropped
  # when they end up empty.
  def gen_map(src, passing_values) when is_map(src) do
    src
    |> Enum.reduce(%{}, fn({key, value}, acc) ->
         cond do
           is_atom(value) and value in passing_values ->
             Map.put(acc, key, value)

           is_atom(value) ->
             acc

           is_map(value) ->
             nested_map = gen_map(value, passing_values)
             if map_size(nested_map) > 0 do
               Map.put(acc, key, nested_map)
             else
               acc
             end
         end
       end)
  end
end

defmodule MapConstants do
  import MapConstants.Helper

  @mapping_fields %{
    "id" => :id,
    "user" => %{
      "name" => :user_name,
      "email" => :user_email
    },
    "project" => %{
      "name" => :project_name
    },
    "created_at" => :inserted_at
  }

  @filterable_values [:id, :user_email, :inserted_at]
  @insertable_values [:user_email, :user_name, :project_name]

  @filterable_mapping_fields gen_map(@mapping_fields, @filterable_values)
  @insertable_mapping_fields gen_map(@mapping_fields, @insertable_values)

  def get_mapping_fields(:all), do: @mapping_fields
  def get_mapping_fields(:filterable), do: @filterable_mapping_fields
  def get_mapping_fields(:insertable), do: @insertable_mapping_fields
end

and here is the corresponding disassembly:

  {:function, :get_mapping_fields, 1, 7,
   [{:line, 1}, {:label, 6},
    {:func_info, {:atom, MapConstants}, {:atom, :get_mapping_fields}, 1},
    {:label, 7}, {:test, :is_atom, {:f, 6}, [x: 0]},
    {:select_val, {:x, 0}, {:f, 6},
     {:list,
      [atom: :all, f: 8, atom: :filterable, f: 9, atom: :insertable, f: 10]}},
    {:label, 8},
    {:move,
     {:literal,
      %{"created_at" => :inserted_at, "id" => :id,
        "project" => %{"name" => :project_name},
        "user" => %{"email" => :user_email, "name" => :user_name}}}, {:x, 0}},
    :return, {:label, 9},
    {:move,
     {:literal,
      %{"created_at" => :inserted_at, "id" => :id,
        "user" => %{"email" => :user_email}}}, {:x, 0}}, :return, {:label, 10},
    {:move,
     {:literal,
      %{"project" => %{"name" => :project_name},
        "user" => %{"email" => :user_email, "name" => :user_name}}}, {:x, 0}},
    :return]},

And if you’re interested, I’ve uploaded a concept app for it, and here is the test file.

Fantastic insights guys! I’m glad I asked here :slight_smile:

1 Like