Why is a function's external term format smaller in Elixir than in Erlang?

When using term_to_binary to convert a function to an external term format binary, I found that the binary is much smaller in Elixir than in Erlang. I expected the two to be close. Why is that?

Following are the results:

In Elixir iex:

Erlang/OTP 21 [erts-10.0.4] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe] [dtrace]

Interactive Elixir (1.7.2) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> byte_size(:erlang.term_to_binary(fn -> 1 end))
135
iex(2)> byte_size(:erlang.term_to_binary(fn -> 1 end, minor_version: 0))
135
iex(3)> byte_size(:erlang.term_to_binary(fn -> 1 end, minor_version: 1))
135
iex(4)> byte_size(:erlang.term_to_binary(fn -> 1 end, minor_version: 2))
129

In Erlang shell:

Erlang/OTP 21 [erts-10.0.4] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe] [dtrace]

Eshell V10.0.4  (abort with ^G)
1> byte_size(term_to_binary(fun() -> 1 end)).
709
2> byte_size(term_to_binary(fun() -> 1 end, [{minor_version, 0}])).
709
3> byte_size(term_to_binary(fun() -> 1 end, [{minor_version, 1}])).
709
4> byte_size(term_to_binary(fun() -> 1 end, [{minor_version, 2}])).
684

That’s a fascinating question… I had never noticed it.

Based on erlang:fun_info/1, the only significant difference I can find is in the eval and value entries of the env property:

> erlang:fun_info(fun() -> 1 end).
[{pid,<0.77.0>},
 {module,erl_eval},
 {new_index,20},
 {new_uniq,<<189,144,182,154,187,236,207,96,89,18,34,161,
             52,152,16,216>>},
 {index,20},
 {uniq,99386804},
 {name,'-expr/5-fun-3-'},
 {arity,0},
 {env,[{[],
        {eval,#Fun<shell.21.44360414>},
        {value,#Fun<shell.5.44360414>},
        [{clause,6,[],[],[{integer,6,1}]}]}]},
 {type,local}]
> :erlang.fun_info(fn -> 1 end)
[
  pid: #PID<0.89.0>,
  module: :erl_eval,
  new_index: 20,
  new_uniq: <<189, 144, 182, 154, 187, 236, 207, 96, 89, 18, 34, 161, 52, 152,
    16, 216>>,
  index: 20,
  uniq: 99386804,
  name: :"-expr/5-fun-3-",
  arity: 0,
  env: [{[], :none, :none, [{:clause, 7, [], [], [{:integer, 0, 1}]}]}],
  type: :local
]

Looking deeper at those two entries in the Erlang shell gets us:

> {env, [{[], {eval,Eval}, {value,Value}, _}]} = erlang:fun_info(fun() -> 1 end, env).
{env,[{[],
       {eval,#Fun<shell.21.44360414>},
       {value,#Fun<shell.5.44360414>},
       [{clause,6,[],[],[{integer,6,1}]}]}]}
 
> byte_size( term_to_binary(Eval) ).
473

> byte_size( term_to_binary(Value) ).
98

> 135 + 473 + 98.
706

In any case, I have no idea what it means. Is it some sort of environment imported from the shell? Variables bound in the shell, even if the function does not use them? I tried binding variables explicitly, but found no meaningful pattern in the size of the serialized functions.
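One way to probe this is to serialize the env on its own and then each of its parts. Below is a minimal sketch to paste into iex; the names f and size are only illustrative, and the exact shape of the env for shell-defined funs is an assumption that may differ between OTP releases, so the loop handles both tuple and non-tuple entries.

```elixir
# A fun defined in the shell is built by the evaluator, not by the compiler.
f = fn -> 1 end

# fun_info/2 returns {:env, free_variables}. For evaluated funs this list
# appears to hold the evaluator's own state (bindings, handlers, clause AST).
{:env, env} = :erlang.fun_info(f, :env)

# Helper: size in bytes of a term in external term format.
size = fn term -> byte_size(:erlang.term_to_binary(term)) end

IO.puts("whole fun: #{size.(f)} bytes")
IO.puts("env only:  #{size.(env)} bytes")

# Break the env down piece by piece to see which part dominates.
for entry <- env do
  parts = if is_tuple(entry), do: Tuple.to_list(entry), else: [entry]

  for part <- parts do
    IO.puts("  #{inspect(part, limit: 5)} -> #{size.(part)} bytes")
  end
end
```

If the env accounts for most of the bytes, then the gap between the two shells comes from what each shell puts into the env, not from the function body itself.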

The difference goes away once you serialize compiled anonymous functions, rather than shell-defined ones:

foo.erl:

-module(foo).

-export([bar/0]).

bar() ->
    fun () -> 1 end.

foo.ex:

defmodule Foo do
  def bar() do
    fn () -> 1 end
  end
end

> byte_size( term_to_binary( foo:bar() ) ).
70
> byte_size( :erlang.term_to_binary( Foo.bar ) )
77

… which might make sense if what gets serialized is a reference to the function rather than its code. I’m at a loss.
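For what it's worth, the external term format documentation describes a serialized local fun (NEW_FUN_EXT) as a module name, index, uniq value, creator pid and the free variables, with no bytecode, which would be consistent with that guess. Here is a rough sketch that checks the tag byte and the number of free variables; the FunTag module and its describe/1 function are hypothetical helpers, and the tag numbers are taken from that documentation.

```elixir
defmodule FunTag do
  # Tag bytes for funs as listed in the external term format documentation.
  @tags %{112 => :NEW_FUN_EXT, 113 => :EXPORT_EXT, 117 => :FUN_EXT}

  # Returns the tag, the total size and the number of free variables of a fun.
  def describe(fun) when is_function(fun) do
    # 131 is the version byte that starts every external term format binary.
    <<131, tag, _rest::binary>> = bin = :erlang.term_to_binary(fun)
    {:env, env} = :erlang.fun_info(fun, :env)

    %{tag: Map.get(@tags, tag, tag), bytes: byte_size(bin), free_vars: length(env)}
  end
end

# A shell-defined fun: the free variables carry the evaluator's state.
IO.inspect(FunTag.describe(fn -> 1 end))

# A remote capture: only module, function name and arity are stored.
IO.inspect(FunTag.describe(&String.length/1))
```

Calling FunTag.describe(Foo.bar()) on the compiled module above should then show a fun with no free variables, which would explain why the compiled versions are small and nearly the same size in both languages.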