Elixir structs vs Erlang records

tim2CF · December 10, 2018, 4:02pm

Hello!

Does anyone know why Elixir structs are implemented on top of Erlang maps rather than Erlang records (tuples)? To me, Erlang records look like a more natural basis for Elixir structs than maps, and in some cases records seem to perform better (for example in pattern matching). Take a look:

iex(2)> defmodule Hello do
...(2)> defstruct [:foo]
...(2)> end
{:module, Hello,
 <<70, 79, 82, 49, 0, 0, 5, 144, 66, 69, 65, 77, 65, 116, 85, 56, 0, 0, 0, 182,
   0, 0, 0, 18, 12, 69, 108, 105, 120, 105, 114, 46, 72, 101, 108, 108, 111, 8,
   95, 95, 105, 110, 102, 111, 95, 95, 7, ...>>, %Hello{foo: nil}}
iex(3)> struct = %Hello{foo: 123}
%Hello{foo: 123}
iex(4)> fn -> 1..1000000 |> Enum.each(fn(_) -> %Hello{foo: foo} = struct; foo end) end |> :timer.tc  
{13719674, :ok} 
iex(5)> record = {Hello, 123}
{Hello, 123}
iex(6)> fn -> 1..1000000 |> Enum.each(fn(_) -> {Hello, foo} = record; foo end) end |> :timer.tc
{565886, :ok}
iex(7)>

Of course, you could say “If you like Erlang records, just use them”, but the Elixir ecosystem is tightly coupled to structs/maps: they are used everywhere (Ecto, Phoenix, Plug, the Elixir standard library). So if I’m using Elixir, I don’t really have a choice; I have to use structs/maps.
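
For reference, Elixir does ship a thin wrapper over Erlang-style records in the Record module. A minimal sketch of what using it looks like (the PointDemo module and the :point record are made-up names for illustration):

```elixir
defmodule PointDemo do
  require Record

  # Defines the point/0, point/1 and point/2 macros for a tuple tagged :point.
  Record.defrecord(:point, x: 0, y: 0)

  def demo do
    p = point(x: 1)       # create, y takes its default
    x = point(p, :x)      # read one field
    q = point(p, y: 5)    # update one field, returns a new tuple
    {p, x, q}
  end
end
```

Here PointDemo.demo() returns {{:point, 1, 0}, 1, {:point, 1, 5}}: the record is just a tagged tuple, and the field names exist only in the macros.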

kokolegorille · December 10, 2018, 4:09pm

There is a good answer from Jose here about the choice of struct vs record.

https://groups.google.com/forum/#!msg/elixir-lang-talk/6kn7J2XnFg8/I5poTNCEHwAJ

michalmuskala · December 10, 2018, 4:18pm

Please, please, don’t benchmark in the shell. The shell runs an interpreter, not the compiled code. The results will probably be wildly different.
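
A rough sketch of the same comparison with the pattern matches moved into a compiled module, so that the matching itself is not evaluated by the shell. The module names BenchHello and MatchBench are assumptions for illustration, and :timer.tc is only a crude timer; a benchmarking library gives more trustworthy numbers.

```elixir
defmodule BenchHello do
  defstruct [:foo]
end

defmodule MatchBench do
  # Both matches live in a compiled module, so they run as compiled
  # BEAM code rather than through the shell's evaluator.
  def struct_match(%BenchHello{foo: foo}), do: foo
  def tuple_match({BenchHello, foo}), do: foo

  # Calls `fun` on `arg` `n` times and returns the elapsed microseconds.
  def time(fun, arg, n \\ 1_000_000) do
    {micros, :ok} = :timer.tc(fn -> Enum.each(1..n, fn _ -> fun.(arg) end) end)
    micros
  end
end
```

Usage would be along the lines of MatchBench.time(&MatchBench.struct_match/1, struct(BenchHello, foo: 123)) versus MatchBench.time(&MatchBench.tuple_match/1, {BenchHello, 123}).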

OvermindDL1 · December 10, 2018, 7:43pm

Indeed, here it is in Benchee. I tried to prevent certain optimizations from happening by creating the test data in different modules from the one that accesses it, so that things like the record macros don’t get optimized out and so forth (records were a lot faster than structs before I made that change).

Code struct_record_bench.exs:

defmodule AStruct1 do
  defstruct [a: 1]
  def news1(), do: %__MODULE__{}
end

defmodule AStruct9 do
  defstruct [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9]
  def news9(), do: %__MODULE__{}
end

defmodule ARecords do
  import Record
  defrecord :aRecord1, [a: 1]
  defrecord :aRecord9, [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9]
  def newr1(), do: aRecord1()
  def newr9(), do: aRecord9()
end

defmodule StructRecordBench do
  import AStruct1
  import AStruct9
  import ARecords

  def classifiers(), do: [:get, :put]

  def time_mult(_), do: 2

  def inputs(_) do
    nil
  end

  def actions(:get) do
    %{
      "Struct1" => fn -> news1().a end,
      "Struct9-first" => fn -> news9().a end,
      "Struct9-last" => fn -> news9().i end,
      "Record1" => fn -> aRecord1(newr1(), :a) end,
      "Record9-first" => fn -> aRecord9(newr9(), :a) end,
      "Record9-last" => fn -> aRecord9(newr9(), :i) end,
    }
  end

  def actions(:put) do
    %{
      "Struct1" => fn -> %{news1() | a: 42} end,
      "Struct1-opt" => fn -> %AStruct1{news1() | a: 42} end,
      "Struct9-first" => fn -> %{news9() | a: 42} end,
      "Struct9-first-opt" => fn -> %AStruct9{news9() | a: 42} end,
      "Struct9-last" => fn -> %{news9() | i: 42} end,
      "Struct9-last-opt" => fn -> %AStruct9{news9() | i: 42} end,
      "Record1" => fn -> aRecord1(newr1(), a: 42) end,
      "Record9-first" => fn -> aRecord9(newr9(), a: 42) end,
      "Record9-last" => fn -> aRecord9(newr9(), i: 42) end,
    }
  end
end

Results:

╰─➤  mix bench struct_record           

Benchmarking Classifier:  get
=============================

Operating System: Linux"
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.7.4
Erlang 21.1.1

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 36 s

Benchmarking Record1...
Benchmarking Record9-first...
Benchmarking Record9-last...
Benchmarking Struct1...
Benchmarking Struct9-first...
Benchmarking Struct9-last...

Name                    ips        average  deviation         median         99th %
Record9-last        23.29 M      0.0429 μs   ±107.83%      0.0400 μs      0.0700 μs
Record9-first       22.06 M      0.0453 μs     ±5.87%      0.0440 μs      0.0550 μs
Record1             21.61 M      0.0463 μs    ±10.12%      0.0450 μs      0.0640 μs
Struct9-first       18.14 M      0.0551 μs    ±19.80%      0.0560 μs      0.0660 μs
Struct1             17.93 M      0.0558 μs     ±8.76%      0.0560 μs      0.0700 μs
Struct9-last        17.46 M      0.0573 μs     ±6.31%      0.0560 μs      0.0720 μs

Comparison:  
Record9-last        23.29 M
Record9-first       22.06 M - 1.06x slower
Record1             21.61 M - 1.08x slower
Struct9-first       18.14 M - 1.28x slower
Struct1             17.93 M - 1.30x slower
Struct9-last        17.46 M - 1.33x slower

Memory usage statistics:

Name             Memory usage
Record9-last             72 B
Record9-first            72 B - 1.00x memory usage
Record1                  72 B - 1.00x memory usage
Struct9-first            72 B - 1.00x memory usage
Struct1                  72 B - 1.00x memory usage
Struct9-last             72 B - 1.00x memory usage

**All measurements for memory usage were the same**

Benchmarking Classifier:  put
=============================

Operating System: Linux"
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.7.4
Erlang 21.1.1

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 54 s

Benchmarking Record1...
Benchmarking Record9-first...
Benchmarking Record9-last...
Benchmarking Struct1...
Benchmarking Struct1-opt...
Benchmarking Struct9-first...
Benchmarking Struct9-first-opt...
Benchmarking Struct9-last...
Benchmarking Struct9-last-opt...

Name                        ips        average  deviation         median         99th %
Record1                 18.38 M      0.0544 μs   ±691.86%      0.0500 μs       0.120 μs
Record9-first           15.52 M      0.0644 μs   ±497.05%      0.0600 μs       0.140 μs
Record9-last            15.45 M      0.0647 μs   ±539.09%      0.0600 μs       0.150 μs
Struct1                 15.13 M      0.0661 μs   ±647.84%      0.0600 μs       0.130 μs
Struct1-opt             14.48 M      0.0691 μs   ±327.02%      0.0600 μs       0.110 μs
Struct9-first           14.31 M      0.0699 μs   ±309.43%      0.0600 μs       0.160 μs
Struct9-first-opt       12.73 M      0.0786 μs   ±328.39%      0.0700 μs       0.130 μs
Struct9-last            11.66 M      0.0857 μs   ±414.23%      0.0800 μs        0.21 μs
Struct9-last-opt        10.74 M      0.0931 μs   ±322.63%      0.0800 μs        0.21 μs

Comparison:  
Record1                 18.38 M
Record9-first           15.52 M - 1.18x slower
Record9-last            15.45 M - 1.19x slower
Struct1                 15.13 M - 1.21x slower
Struct1-opt             14.48 M - 1.27x slower
Struct9-first           14.31 M - 1.28x slower
Struct9-first-opt       12.73 M - 1.44x slower
Struct9-last            11.66 M - 1.58x slower
Struct9-last-opt        10.74 M - 1.71x slower

Memory usage statistics:

Name                 Memory usage
Record1                      96 B
Record9-first               160 B - 1.67x memory usage
Record9-last                160 B - 1.67x memory usage
Struct1                     112 B - 1.17x memory usage
Struct1-opt                 112 B - 1.17x memory usage
Struct9-first               176 B - 1.83x memory usage
Struct9-first-opt           176 B - 1.83x memory usage
Struct9-last                176 B - 1.83x memory usage
Struct9-last-opt            176 B - 1.83x memory usage

**All measurements for memory usage were the same**

So records are faster than structs in general (in every case tested here, actually), but only marginally so; only the most performance-sensitive code would really care, and most people shouldn’t.

I’m surprised that putting the struct type in the update syntax doesn’t make it faster, as it could infer some existing structure, but I guess all it does is add an extra runtime check or so (hence the ‘-opt’ variants, which name the module, are slower in the end).
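
One way to picture that extra check: an update that first verifies the struct's module, compared with a plain map update. This is a sketch of the idea only, not the code the compiler actually emits for %AStruct1{... | ...}; the UpdS1, UpdS2 and CheckedUpdate names are invented for the example.

```elixir
defmodule UpdS1 do
  defstruct a: 1
end

defmodule UpdS2 do
  defstruct a: 1
end

defmodule CheckedUpdate do
  # Plain map update: only requires that the key :a already exists.
  def unchecked(s), do: %{s | a: 42}

  # Update guarded by a struct match: one more runtime test, on the
  # value stored under the __struct__ key.
  def checked(%UpdS1{} = s), do: %{s | a: 42}
end
```

CheckedUpdate.unchecked/1 accepts either struct, while CheckedUpdate.checked/1 raises FunctionClauseError for an UpdS2, so the typed form buys safety rather than speed.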

EDIT: Personal opinion time: personally I’d prefer records were ubiquitous and used struct syntax (first-class records, in other words). Records in every language I’ve seen are statically sized, so there is no point in them being maps. If they ‘own’ their module definition as structs do now, then all the proper accessors for Access and extra data would be available just as they are for structs, and by using those generated macros you could generate getting/setting code that would be even more efficient than how structs work now. HOWEVER, Elixir is extremely poorly typed and doesn’t know what the type of a given thing is. Erlang works around that by requiring the record name at every use of a record variable; Elixir tries to be a little more succinct, and that succinctness is at odds with efficiency. So the first-class syntax uses the slightly less efficient version for ease of use and relegates the more efficient version to a side set of macros, since those require the names anyway. If Elixir had a decent typing system then you’d be able to have both efficiency and succinctness, but maybe that’s for an Elixir 2.0 or something. ^.^

EDIT: Hmm, a possible workaround for the first-class syntax access would be just dispatching based on the ‘module’ in the type-tag of the record, it would be a ‘remote call’ on the BEAM but might be good… I should test…

Qqwy · December 10, 2018, 7:59pm

Great benchmark!

In practice, wouldn’t the datastructure’s internals usually be read/patternmatched/created/modified from within the same module that defines it? So is preventing these optimizations justified?

OvermindDL1 · December 10, 2018, 8:03pm

In some cases, sure. In those cases, though, records are much faster still than structs (still not enough for ‘most’ code to care, though).

OvermindDL1 · December 10, 2018, 9:43pm

I went ahead and tested a more ‘direct’ record interface: I added a new record interface that emulates what the back end might look like if a front-end syntax were used (the start of it anyway; it could be more fleshed out).

The struct_record_bench.exs file now:

defmodule AStruct1 do
  defstruct [a: 1]
  def news1(), do: %__MODULE__{}
end

defmodule AStruct9 do
  defstruct [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9]
  def news9(), do: %__MODULE__{}
end

defmodule ARecords do
  import Record
  defrecord :aRecord1, [a: 1]
  defrecord :aRecord9, [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9]
  def newr1(), do: aRecord1()
  def newr9(), do: aRecord9()
end

defmodule BRecord do
  defmacro __using__(fields) do
    # Add more helpers and flesh out functions and checks if this should ever be actually 'used'
    mappings = fields|>Enum.map(&elem(&1, 0))|>Enum.with_index(1)
    ast_new = [quote do def new() do {__MODULE__, unquote_splicing(Enum.map(fields, &elem(&1, 1)))} end end]
    ast_fields = [quote do def fields() do unquote(fields) end end]
    ast_field = Enum.map(mappings, fn {k, i} ->
      quote do defmacro field(unquote(k)) do unquote(i) end end
    end) ++ [quote do defmacro field(k) do quote do unquote(__MODULE__).field_idx(unquote(k)) end end end]
    ast_field_idx = Enum.map(mappings, fn {k, i} -> quote do def field_idx(unquote(k)), do: unquote(i) end end)
    ast_get = [quote do defmacro get(r, k) do quote do elem(unquote(r), unquote(__MODULE__).field(unquote(k))) end end end]
    ast_put = [quote do defmacro put(r, k, v) do quote do put_elem(unquote(r), unquote(__MODULE__).field(unquote(k)), unquote(v)) end end end]
    {:__block__, [], ast_new++ast_fields++ast_field++ast_field_idx++ast_get++ast_put}
  end

  defmacro get(r, k) do
    quote do
      r = unquote(r)
      elem(r, elem(r, 0).field_idx(unquote(k)))
    end
  end

  defmacro put(r, k, v) do
    quote do
      r = unquote(r)
      put_elem(r, elem(r, 0).field_idx(unquote(k)), unquote(v))
    end
  end
end

defmodule ARecord1 do
  use BRecord, [a: 1]
end

defmodule ARecord9 do
  use BRecord, [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9]
end

defmodule StructRecordBench do
  import AStruct1
  import AStruct9
  import ARecords
  require BRecord
  require ARecord1
  require ARecord9

  def classifiers(), do: [:get, :put]

  def time_mult(_), do: 2

  def inputs(_) do
    nil
  end

  def actions(:get) do
    %{
      "Struct1" => fn -> news1().a end,
      "Struct9-first" => fn -> news9().a end,
      "Struct9-last" => fn -> news9().i end,
      "Record1-stock" => fn -> aRecord1(newr1(), :a) end,
      "Record1-remote" => fn -> ARecord1.new() |> BRecord.get(:a) end,
      "Record1-direct" => fn -> ARecord1.new() |> ARecord1.get(:a) end,
      "Record9-first-stock" => fn -> aRecord9(newr9(), :a) end,
      "Record9-first-remote" => fn -> ARecord9.new() |> BRecord.get(:a) end,
      "Record9-first-direct" => fn -> ARecord9.new() |> ARecord9.get(:a) end,
      "Record9-last-stock" => fn -> aRecord9(newr9(), :i) end,
      "Record9-last-remote" => fn -> ARecord9.new() |> BRecord.get(:i) end,
      "Record9-last-direct" => fn -> ARecord9.new() |> ARecord9.get(:i) end,
    }
  end

  def actions(:put) do
    %{
      "Struct1" => fn -> %{news1() | a: 42} end,
      "Struct1-opt" => fn -> %AStruct1{news1() | a: 42} end,
      "Struct9-first" => fn -> %{news9() | a: 42} end,
      "Struct9-first-opt" => fn -> %AStruct9{news9() | a: 42} end,
      "Struct9-last" => fn -> %{news9() | i: 42} end,
      "Struct9-last-opt" => fn -> %AStruct9{news9() | i: 42} end,
      "Record1-stock" => fn -> aRecord1(newr1(), a: 42) end,
      "Record1-remote" => fn -> ARecord1.new() |> BRecord.put(:a, 42) end,
      "Record1-direct" => fn -> ARecord1.new() |> ARecord1.put(:a, 42) end,
      "Record9-first-stock" => fn -> aRecord9(newr9(), a: 42) end,
      "Record9-first-remote" => fn -> ARecord9.new() |> BRecord.put(:a, 42) end,
      "Record9-first-direct" => fn -> ARecord9.new() |> ARecord9.put(:a, 42) end,
      "Record9-last-stock" => fn -> aRecord9(newr9(), i: 42) end,
      "Record9-last-remote" => fn -> ARecord9.new() |> BRecord.put(:i, 42) end,
      "Record9-last-direct" => fn -> ARecord9.new() |> ARecord9.put(:i, 42) end,
    }
  end
end

Essentially, ‘stock’ is Elixir’s stock record interface, ‘direct’ is what an optimized type-aware setup would do (à la requiring the record ‘name’ at each usage, as in Erlang), and ‘remote’ is what a dynamic dispatch interface would look like (so something like Elixir’s existing struct syntax with no known type information). And the results:

╰─➤  mix bench struct_record

Benchmarking Classifier: get
============================

Operating System: Linux"
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.7.4
Erlang 21.1.1

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 1.20 min


Benchmarking Record1-direct...
Benchmarking Record1-remote...
Benchmarking Record1-stock...
Benchmarking Record9-first-direct...
Benchmarking Record9-first-remote...
Benchmarking Record9-first-stock...
Benchmarking Record9-last-direct...
Benchmarking Record9-last-remote...
Benchmarking Record9-last-stock...
Benchmarking Struct1...
Benchmarking Struct9-first...
Benchmarking Struct9-last...

Name                           ips        average  deviation         median         99th %
Record1-direct             23.80 M      0.0420 μs    ±81.53%      0.0400 μs      0.0600 μs
Record9-first-stock        23.12 M      0.0433 μs   ±102.29%      0.0400 μs      0.0800 μs
Record9-first-direct       22.25 M      0.0449 μs     ±5.41%      0.0440 μs      0.0530 μs
Record9-last-direct        22.25 M      0.0450 μs     ±4.85%      0.0440 μs      0.0530 μs
Record9-last-stock         22.04 M      0.0454 μs     ±6.16%      0.0440 μs      0.0540 μs
Record1-stock              22.03 M      0.0454 μs     ±6.26%      0.0440 μs      0.0540 μs
Struct9-last               18.52 M      0.0540 μs    ±22.96%      0.0500 μs      0.0800 μs
Struct9-first              18.45 M      0.0542 μs     ±6.64%      0.0550 μs      0.0640 μs
Struct1                    18.16 M      0.0551 μs     ±8.19%      0.0550 μs      0.0680 μs
Record1-remote             12.31 M      0.0812 μs     ±7.58%      0.0800 μs      0.0930 μs
Record9-last-remote        11.83 M      0.0845 μs     ±4.51%      0.0830 μs      0.0970 μs
Record9-first-remote       11.76 M      0.0850 μs     ±4.73%      0.0840 μs      0.0980 μs

Comparison: 
Record1-direct             23.80 M
Record9-first-stock        23.12 M - 1.03x slower
Record9-first-direct       22.25 M - 1.07x slower
Record9-last-direct        22.25 M - 1.07x slower
Record9-last-stock         22.04 M - 1.08x slower
Record1-stock              22.03 M - 1.08x slower
Struct9-last               18.52 M - 1.29x slower
Struct9-first              18.45 M - 1.29x slower
Struct1                    18.16 M - 1.31x slower
Record1-remote             12.31 M - 1.93x slower
Record9-last-remote        11.83 M - 2.01x slower
Record9-first-remote       11.76 M - 2.02x slower

Memory usage statistics:

Name                    Memory usage
Record1-direct                  72 B
Record9-first-stock             72 B - 1.00x memory usage
Record9-first-direct            72 B - 1.00x memory usage
Record9-last-direct             72 B - 1.00x memory usage
Record9-last-stock              72 B - 1.00x memory usage
Record1-stock                   72 B - 1.00x memory usage
Struct9-last                    72 B - 1.00x memory usage
Struct9-first                   72 B - 1.00x memory usage
Struct1                         72 B - 1.00x memory usage
Record1-remote                  72 B - 1.00x memory usage
Record9-last-remote             72 B - 1.00x memory usage
Record9-first-remote            72 B - 1.00x memory usage

**All measurements for memory usage were the same**

Benchmarking Classifier: put
============================

Operating System: Linux"
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.7.4
Erlang 21.1.1

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 1.50 min


Benchmarking Record1-direct...
Benchmarking Record1-remote...
Benchmarking Record1-stock...
Benchmarking Record9-first-direct...
Benchmarking Record9-first-remote...
Benchmarking Record9-first-stock...
Benchmarking Record9-last-direct...
Benchmarking Record9-last-remote...
Benchmarking Record9-last-stock...
Benchmarking Struct1...
Benchmarking Struct1-opt...
Benchmarking Struct9-first...
Benchmarking Struct9-first-opt...
Benchmarking Struct9-last...
Benchmarking Struct9-last-opt...

Name                           ips        average  deviation         median         99th %
Record1-stock              18.63 M      0.0537 μs   ±736.59%      0.0500 μs       0.120 μs
Record1-direct             17.69 M      0.0565 μs   ±759.41%      0.0500 μs       0.120 μs
Record9-first-stock        15.75 M      0.0635 μs   ±391.04%      0.0600 μs       0.140 μs
Record9-last-direct        15.54 M      0.0644 μs   ±554.63%      0.0600 μs       0.140 μs
Record9-first-direct       15.49 M      0.0645 μs   ±566.52%      0.0600 μs       0.150 μs
Record9-last-stock         15.46 M      0.0647 μs   ±512.37%      0.0600 μs       0.150 μs
Struct1                    15.11 M      0.0662 μs   ±603.05%      0.0600 μs       0.130 μs
Struct9-first              13.87 M      0.0721 μs   ±359.95%      0.0600 μs        0.20 μs
Struct1-opt                13.45 M      0.0744 μs   ±292.42%      0.0700 μs       0.140 μs
Struct9-first-opt          12.81 M      0.0781 μs   ±332.89%      0.0700 μs       0.120 μs
Struct9-last               11.68 M      0.0856 μs   ±390.60%      0.0800 μs       0.170 μs
Struct9-last-opt           11.46 M      0.0873 μs   ±261.13%      0.0800 μs       0.130 μs
Record1-remote              9.77 M       0.102 μs   ±232.22%       0.100 μs        0.22 μs
Record9-first-remote        9.28 M       0.108 μs   ±147.97%       0.100 μs       0.170 μs
Record9-last-remote         9.17 M       0.109 μs   ±177.88%       0.100 μs        0.20 μs

Comparison: 
Record1-stock              18.63 M
Record1-direct             17.69 M - 1.05x slower
Record9-first-stock        15.75 M - 1.18x slower
Record9-last-direct        15.54 M - 1.20x slower
Record9-first-direct       15.49 M - 1.20x slower
Record9-last-stock         15.46 M - 1.21x slower
Struct1                    15.11 M - 1.23x slower
Struct9-first              13.87 M - 1.34x slower
Struct1-opt                13.45 M - 1.38x slower
Struct9-first-opt          12.81 M - 1.45x slower
Struct9-last               11.68 M - 1.59x slower
Struct9-last-opt           11.46 M - 1.63x slower
Record1-remote              9.77 M - 1.91x slower
Record9-first-remote        9.28 M - 2.01x slower
Record9-last-remote         9.17 M - 2.03x slower

Memory usage statistics:

Name                    Memory usage
Record1-stock                   96 B
Record1-direct                  96 B - 1.00x memory usage
Record9-first-stock            160 B - 1.67x memory usage
Record9-last-direct            160 B - 1.67x memory usage
Record9-first-direct           160 B - 1.67x memory usage
Record9-last-stock             160 B - 1.67x memory usage
Struct1                        112 B - 1.17x memory usage
Struct9-first                  176 B - 1.83x memory usage
Struct1-opt                    112 B - 1.17x memory usage
Struct9-first-opt              176 B - 1.83x memory usage
Struct9-last                   176 B - 1.83x memory usage
Struct9-last-opt               176 B - 1.83x memory usage
Record1-remote                  96 B - 1.00x memory usage
Record9-first-remote           160 B - 1.67x memory usage
Record9-last-remote            160 B - 1.67x memory usage

**All measurements for memory usage were the same**

So first of all, stock and direct should be identical if I apply all the optimizations to the direct version that stock already has; even without that (a single opcode change from the looks of it, meh right now, close enough) they are still almost identical. As for remote, it is doing a remote (i.e. slow) module function call, so it should be slower in general, and so it is. (Ah, if only we still had tuple calls to save the ugly elem'ing stuff on the call, then we really could have a native-looking syntax “right now”, but tuple calls got broken in the latest OTP version, permanently…) Some of these calls were slower on older BEAMs, so that is probably why maps were picked for ‘structs’ when maps came out. Nowadays, though, I’m not sure there is a reason to pick maps over tagged tuples for records: in the traditional cases you still write the type when ‘put’ing (%StructName{..|..}), and one could easily come up with something similar for an optimized ‘get’ call. You still have unknown-record fallback capabilities as well with this, even if a touch slower.

Still, it was a fun test, I really do wish Elixir Structs were tagged tuples underneath as it would make working in the Erlang ecosystem easier. ^.^

michalmuskala · December 10, 2018, 10:04pm

The primary advantage of structs over records is readability. I can take any struct, print it anytime and it will be readable. With records, you can easily get an unreadable pile of nested tuples - it’s a huge pain when working with Erlang APIs. Another huge disadvantage of records is during upgrades - changing the record structure correctly is extremely hard when you have the “new” modules with the new definition of the records that need to handle the old records. For that reason, I know that some Erlang codebases don’t use records at all, but rather prefer proplists, which are a poor-man’s version of maps.

Qqwy · December 10, 2018, 10:15pm

To be honest, I think the readability of map-based structs is greatly improved because of Elixir’s syntactic sugar that turns %{a: 1, b: 2, __struct__: Foo} into %Foo{a: 1, b: 2}. But of course even without that you are able to see the names of the different fields; is this what you mean?
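
A small sketch of that equivalence, using a made-up SugarFoo struct: the struct is an ordinary map whose __struct__ key holds the module name.

```elixir
defmodule SugarFoo do
  defstruct a: 1, b: 2
end

defmodule StructIsMap do
  def demo do
    foo = struct(SugarFoo)

    %{
      # A struct passes the is_map/1 check.
      is_map: is_map(foo),
      # The tag is stored under the __struct__ key.
      tag: foo.__struct__,
      # It compares equal to a plain map literal with the same keys.
      same_as_plain_map: foo == %{__struct__: SugarFoo, a: 1, b: 2}
    }
  end
end
```

StructIsMap.demo() reports true for both checks, with SugarFoo as the tag, which is why the field names stay visible even without the %SugarFoo{} sugar.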

As for upgrading: This is definitely a place where map-based structs are better than tuple-based records!

OvermindDL1 · December 10, 2018, 10:47pm

That’s why you have an inspect for them. Unlike the older Elixir days pre-consolidation (and my ProtocolEx can handle tagged tuples fine and faster than Elixir’s Protocols), the protocol can know precisely what is supported and what is not, and such a default implementation could even be specified by the defbetterrecord call itself (or via an option in case they want to override it, or by adding an __inspect__ method in the module as a fallback or whatever; lots of options).

With my BRecord module defined above, you could easily specify upgrade functions inside it and then dispatch an upgrade path along them via a protocol (and other things if needed), or just call the functions directly to upgrade. You have to do similar things with maps and structs anyway as data formats change: a string might need to become a list in some field, etc. It is good to reify those changes in one specific area and just pass in the needed upgrade information as always. I never had issues with that at all in Erlang.

I never actually ran across a library that used proplists as their ‘state’ store, records or trivial values I’ve always seen. That sounds like a very bad way to handle the code and would make dialyzer typing it a bit more irritating as well (though with such codebases I wonder if they used dialyzer at all).

You could have identical syntax for records though, that exact same %Foo{a: 1, b: 2} could easily generate a Foo record.

Still unsure about that. With maps it’s easy to keep old useless data polluting them, they are more irritating to Dialyze, take up more space, and take longer to update unless the data is truly huge. And I still think it is a bad BAD idea to upgrade state in-place: you should always decompose an old version and build a new version, both so that dialyzer helps catch issues (which it would not otherwise on upgrades, since it ‘assumes’ the old version’s types are already the new ones unless you explicitly override it) and so that you don’t miss something (with a map you can forget a key pretty easily, not so with records), among other things.
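
A minimal sketch of the "decompose the old version, build the new one" approach for tagged tuples. The state shapes and the StateUpgrade module are hypothetical, chosen only to show the pattern:

```elixir
defmodule StateUpgrade do
  # Hypothetical v1 state: {:state, name}.
  # Hypothetical v2 state: {:state_v2, name, tags}.

  # Take the old tuple apart and construct the new one explicitly, so
  # every field of the new version has to be supplied here.
  def upgrade({:state, name}), do: {:state_v2, name, []}

  # Already current: return it unchanged.
  def upgrade({:state_v2, _name, _tags} = current), do: current
end
```

Because each version is a distinct shape, an unknown or half-migrated value fails to match any clause instead of slipping through with a missing key.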

All of this would of course be far far more sensible and direct if Elixir actually had a half decent type system, dynamic typing is a horror and is the reason why this is an issue at all… >.<

tim2CF · December 11, 2018, 9:36am

Totally agree, and I hope at least one of these projects will be usable someday…

Just off-topic comment because you mentioned type systems

OvermindDL1 · December 11, 2018, 4:44pm

Heh, I’m well known around here as the proponent for good type systems. ^.^;

I made a Gradualixir plugin for mix that lets you use gradualizer like you can use dialyzer if you are curious.

It’s not on hex.pm (use it from GitHub master) since gradualizer is still incomplete and thus not on hex.pm yet, and you can’t put things on hex.pm that depend on something that’s not on hex.pm.

Qqwy · December 12, 2018, 2:01pm

I think in the end the main advantage maps have over records is that they carry the names of their fields right there, rather than the fields being position-dependent.
This has great advantages for introspection, pattern-matching and dispatch.

Some of that could be built on top of a tuple-based system, but not all. Most importantly, knowing the field names of a record is only possible when the record definition itself is in scope.

OvermindDL1 · December 12, 2018, 4:41pm

True, but with a tagged tuple tagged with a module, getting the field names is as simple as something like elem(thing, 0).fields(), or whatever information you want. Plus you could have another encoding scheme that encodes the names into the tuple itself, perhaps in-position like {Blah, {:a, 1}, {:b, 2}}, or as metadata at the end like {Blah, 1, 2, [:a, :b]}; there are tons of other possible formats as well. Hmm, ideas…
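
A sketch of the first idea, looking the field names up on the module stored in the tag. TagBlah and TaggedTuple are illustrative names, and fields/0 is an assumed convention rather than anything Elixir provides:

```elixir
defmodule TagBlah do
  # Assumed convention: the tagging module exposes its field names.
  def fields, do: [:a, :b]
  def new(a, b), do: {__MODULE__, a, b}
end

defmodule TaggedTuple do
  # The tag in position 0 is a module, so it can be called dynamically.
  def field_names(record), do: elem(record, 0).fields()

  # Pairs each field name with the value in the matching position.
  def to_map(record) do
    [_tag | values] = Tuple.to_list(record)
    record |> field_names() |> Enum.zip(values) |> Map.new()
  end
end
```

TaggedTuple.to_map(TagBlah.new(1, 2)) gives %{a: 1, b: 2}, at the cost of one dynamic remote call per lookup.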

michalmuskala · December 12, 2018, 6:17pm

Or maybe something like:

{{:__struct__, :a, :b}, Blah, 1, 2}

Oh, wait. That’s almost exactly how structs on top of small maps are implemented.

OvermindDL1 · December 12, 2018, 7:38pm

Lol, it is exactly, and that would work well though it decomposes into 2 elem calls then, they are quite cheap though. ^.^
There are many styles to represent such a thing, though given how rarely the names are needed I’d still think just putting them on the parent module is best overall: it uses much less memory in the representation and makes it far easier to serialize (though if you kept the names in the representation, and maybe a version key, then you could implement fairly transparent migrations, hmm, structs could do that as well…).