I went ahead and tested a more "direct" record interface: I added a new record interface that emulates what the back end would look like if a front-end syntax were built on top of it (just the start of one, anyway; it could be fleshed out more).
The struct_record_bench.exs file now:
defmodule AStruct1 do
  defstruct [a: 1]
  def news1(), do: %__MODULE__{}
end

defmodule AStruct9 do
  defstruct [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9]
  def news9(), do: %__MODULE__{}
end

defmodule ARecords do
  import Record
  defrecord :aRecord1, [a: 1]
  defrecord :aRecord9, [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9]
  def newr1(), do: aRecord1()
  def newr9(), do: aRecord9()
end

defmodule BRecord do
  defmacro __using__(fields) do
    # Add more helpers and flesh out functions and checks if this should ever be actually 'used'
    mappings = fields |> Enum.map(&elem(&1, 0)) |> Enum.with_index(1)
    ast_new = [quote do def new() do {__MODULE__, unquote_splicing(Enum.map(fields, &elem(&1, 1)))} end end]
    ast_fields = [quote do def fields() do unquote(fields) end end]
    ast_field = Enum.map(mappings, fn {k, i} ->
      quote do defmacro field(unquote(k)) do unquote(i) end end
    end) ++ [quote do defmacro field(k) do quote do unquote(__MODULE__).field_idx(unquote(k)) end end end]
    ast_field_idx = Enum.map(mappings, fn {k, i} -> quote do def field_idx(unquote(k)), do: unquote(i) end end)
    ast_get = [quote do defmacro get(r, k) do quote do elem(unquote(r), unquote(__MODULE__).field(unquote(k))) end end end]
    ast_put = [quote do defmacro put(r, k, v) do quote do put_elem(unquote(r), unquote(__MODULE__).field(unquote(k)), unquote(v)) end end end]
    {:__block__, [], ast_new ++ ast_fields ++ ast_field ++ ast_field_idx ++ ast_get ++ ast_put}
  end

  defmacro get(r, k) do
    quote do
      r = unquote(r)
      elem(r, elem(r, 0).field_idx(unquote(k)))
    end
  end

  defmacro put(r, k, v) do
    quote do
      r = unquote(r)
      put_elem(r, elem(r, 0).field_idx(unquote(k)), unquote(v))
    end
  end
end

defmodule ARecord1 do
  use BRecord, [a: 1]
end

defmodule ARecord9 do
  use BRecord, [a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9]
end

defmodule StructRecordBench do
  import AStruct1
  import AStruct9
  import ARecords
  require BRecord
  require ARecord1
  require ARecord9

  def classifiers(), do: [:get, :put]

  def time_mult(_), do: 2

  def inputs(_) do
    nil
  end

  def actions(:get) do
    %{
      "Struct1" => fn -> news1().a end,
      "Struct9-first" => fn -> news9().a end,
      "Struct9-last" => fn -> news9().i end,
      "Record1-stock" => fn -> aRecord1(newr1(), :a) end,
      "Record1-remote" => fn -> ARecord1.new() |> BRecord.get(:a) end,
      "Record1-direct" => fn -> ARecord1.new() |> ARecord1.get(:a) end,
      "Record9-first-stock" => fn -> aRecord9(newr9(), :a) end,
      "Record9-first-remote" => fn -> ARecord9.new() |> BRecord.get(:a) end,
      "Record9-first-direct" => fn -> ARecord9.new() |> ARecord9.get(:a) end,
      "Record9-last-stock" => fn -> aRecord9(newr9(), :i) end,
      "Record9-last-remote" => fn -> ARecord9.new() |> BRecord.get(:i) end,
      "Record9-last-direct" => fn -> ARecord9.new() |> ARecord9.get(:i) end,
    }
  end

  def actions(:put) do
    %{
      "Struct1" => fn -> %{news1() | a: 42} end,
      "Struct1-opt" => fn -> %AStruct1{news1() | a: 42} end,
      "Struct9-first" => fn -> %{news9() | a: 42} end,
      "Struct9-first-opt" => fn -> %AStruct9{news9() | a: 42} end,
      "Struct9-last" => fn -> %{news9() | i: 42} end,
      "Struct9-last-opt" => fn -> %AStruct9{news9() | i: 42} end,
      "Record1-stock" => fn -> aRecord1(newr1(), a: 42) end,
      "Record1-remote" => fn -> ARecord1.new() |> BRecord.put(:a, 42) end,
      "Record1-direct" => fn -> ARecord1.new() |> ARecord1.put(:a, 42) end,
      "Record9-first-stock" => fn -> aRecord9(newr9(), a: 42) end,
      "Record9-first-remote" => fn -> ARecord9.new() |> BRecord.put(:a, 42) end,
      "Record9-first-direct" => fn -> ARecord9.new() |> ARecord9.put(:a, 42) end,
      "Record9-last-stock" => fn -> aRecord9(newr9(), i: 42) end,
      "Record9-last-remote" => fn -> ARecord9.new() |> BRecord.put(:i, 42) end,
      "Record9-last-direct" => fn -> ARecord9.new() |> ARecord9.put(:i, 42) end,
    }
  end
end
Essentially, "stock" is Elixir's stock record interface, "direct" is what an optimized, type-aware setup would do (i.e. requiring the record name at each use site, as in Erlang), and "remote" is what a dynamic-dispatch interface would look like (something like Elixir's existing struct syntax with no known type information). And the results:
$ mix bench struct_record
Benchmarking Classifier: get
============================
Operating System: Linux"
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.7.4
Erlang 21.1.1
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 1.20 min
Benchmarking Record1-direct...
Benchmarking Record1-remote...
Benchmarking Record1-stock...
Benchmarking Record9-first-direct...
Benchmarking Record9-first-remote...
Benchmarking Record9-first-stock...
Benchmarking Record9-last-direct...
Benchmarking Record9-last-remote...
Benchmarking Record9-last-stock...
Benchmarking Struct1...
Benchmarking Struct9-first...
Benchmarking Struct9-last...
Name                           ips        average  deviation         median         99th %
Record1-direct             23.80 M      0.0420 μs    ±81.53%      0.0400 μs      0.0600 μs
Record9-first-stock        23.12 M      0.0433 μs   ±102.29%      0.0400 μs      0.0800 μs
Record9-first-direct       22.25 M      0.0449 μs     ±5.41%      0.0440 μs      0.0530 μs
Record9-last-direct        22.25 M      0.0450 μs     ±4.85%      0.0440 μs      0.0530 μs
Record9-last-stock         22.04 M      0.0454 μs     ±6.16%      0.0440 μs      0.0540 μs
Record1-stock              22.03 M      0.0454 μs     ±6.26%      0.0440 μs      0.0540 μs
Struct9-last               18.52 M      0.0540 μs    ±22.96%      0.0500 μs      0.0800 μs
Struct9-first              18.45 M      0.0542 μs     ±6.64%      0.0550 μs      0.0640 μs
Struct1                    18.16 M      0.0551 μs     ±8.19%      0.0550 μs      0.0680 μs
Record1-remote             12.31 M      0.0812 μs     ±7.58%      0.0800 μs      0.0930 μs
Record9-last-remote        11.83 M      0.0845 μs     ±4.51%      0.0830 μs      0.0970 μs
Record9-first-remote       11.76 M      0.0850 μs     ±4.73%      0.0840 μs      0.0980 μs
Comparison:
Record1-direct 23.80 M
Record9-first-stock 23.12 M - 1.03x slower
Record9-first-direct 22.25 M - 1.07x slower
Record9-last-direct 22.25 M - 1.07x slower
Record9-last-stock 22.04 M - 1.08x slower
Record1-stock 22.03 M - 1.08x slower
Struct9-last 18.52 M - 1.29x slower
Struct9-first 18.45 M - 1.29x slower
Struct1 18.16 M - 1.31x slower
Record1-remote 12.31 M - 1.93x slower
Record9-last-remote 11.83 M - 2.01x slower
Record9-first-remote 11.76 M - 2.02x slower
Memory usage statistics:
Name Memory usage
Record1-direct 72 B
Record9-first-stock 72 B - 1.00x memory usage
Record9-first-direct 72 B - 1.00x memory usage
Record9-last-direct 72 B - 1.00x memory usage
Record9-last-stock 72 B - 1.00x memory usage
Record1-stock 72 B - 1.00x memory usage
Struct9-last 72 B - 1.00x memory usage
Struct9-first 72 B - 1.00x memory usage
Struct1 72 B - 1.00x memory usage
Record1-remote 72 B - 1.00x memory usage
Record9-last-remote 72 B - 1.00x memory usage
Record9-first-remote 72 B - 1.00x memory usage
**All measurements for memory usage were the same**
Benchmarking Classifier: put
============================
Operating System: Linux"
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.7.4
Erlang 21.1.1
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 1.50 min
Benchmarking Record1-direct...
Benchmarking Record1-remote...
Benchmarking Record1-stock...
Benchmarking Record9-first-direct...
Benchmarking Record9-first-remote...
Benchmarking Record9-first-stock...
Benchmarking Record9-last-direct...
Benchmarking Record9-last-remote...
Benchmarking Record9-last-stock...
Benchmarking Struct1...
Benchmarking Struct1-opt...
Benchmarking Struct9-first...
Benchmarking Struct9-first-opt...
Benchmarking Struct9-last...
Benchmarking Struct9-last-opt...
Name                           ips        average  deviation         median         99th %
Record1-stock              18.63 M      0.0537 μs   ±736.59%      0.0500 μs       0.120 μs
Record1-direct             17.69 M      0.0565 μs   ±759.41%      0.0500 μs       0.120 μs
Record9-first-stock        15.75 M      0.0635 μs   ±391.04%      0.0600 μs       0.140 μs
Record9-last-direct        15.54 M      0.0644 μs   ±554.63%      0.0600 μs       0.140 μs
Record9-first-direct       15.49 M      0.0645 μs   ±566.52%      0.0600 μs       0.150 μs
Record9-last-stock         15.46 M      0.0647 μs   ±512.37%      0.0600 μs       0.150 μs
Struct1                    15.11 M      0.0662 μs   ±603.05%      0.0600 μs       0.130 μs
Struct9-first              13.87 M      0.0721 μs   ±359.95%      0.0600 μs        0.20 μs
Struct1-opt                13.45 M      0.0744 μs   ±292.42%      0.0700 μs       0.140 μs
Struct9-first-opt          12.81 M      0.0781 μs   ±332.89%      0.0700 μs       0.120 μs
Struct9-last               11.68 M      0.0856 μs   ±390.60%      0.0800 μs       0.170 μs
Struct9-last-opt           11.46 M      0.0873 μs   ±261.13%      0.0800 μs       0.130 μs
Record1-remote              9.77 M       0.102 μs   ±232.22%       0.100 μs        0.22 μs
Record9-first-remote        9.28 M       0.108 μs   ±147.97%       0.100 μs       0.170 μs
Record9-last-remote         9.17 M       0.109 μs   ±177.88%       0.100 μs        0.20 μs
Comparison:
Record1-stock 18.63 M
Record1-direct 17.69 M - 1.05x slower
Record9-first-stock 15.75 M - 1.18x slower
Record9-last-direct 15.54 M - 1.20x slower
Record9-first-direct 15.49 M - 1.20x slower
Record9-last-stock 15.46 M - 1.21x slower
Struct1 15.11 M - 1.23x slower
Struct9-first 13.87 M - 1.34x slower
Struct1-opt 13.45 M - 1.38x slower
Struct9-first-opt 12.81 M - 1.45x slower
Struct9-last 11.68 M - 1.59x slower
Struct9-last-opt 11.46 M - 1.63x slower
Record1-remote 9.77 M - 1.91x slower
Record9-first-remote 9.28 M - 2.01x slower
Record9-last-remote 9.17 M - 2.03x slower
Memory usage statistics:
Name Memory usage
Record1-stock 96 B
Record1-direct 96 B - 1.00x memory usage
Record9-first-stock 160 B - 1.67x memory usage
Record9-last-direct 160 B - 1.67x memory usage
Record9-first-direct 160 B - 1.67x memory usage
Record9-last-stock 160 B - 1.67x memory usage
Struct1 112 B - 1.17x memory usage
Struct9-first 176 B - 1.83x memory usage
Struct1-opt 112 B - 1.17x memory usage
Struct9-first-opt 176 B - 1.83x memory usage
Struct9-last 176 B - 1.83x memory usage
Struct9-last-opt 176 B - 1.83x memory usage
Record1-remote 96 B - 1.00x memory usage
Record9-first-remote 160 B - 1.67x memory usage
Record9-last-remote 160 B - 1.67x memory usage
**All measurements for memory usage were the same**
So first of all, stock and direct should be identical if I apply to the direct version all the optimizations that stock already has; even without that (it's a single opcode change from the looks of it, so meh for now, close enough) they're still almost identical. As for remote, it is doing a remote (i.e. slow) module function call, so it should be slower in general, and so it is. (Ah, if only we still had tuple calls to save the ugly elem'ing on the call, then we really could have a native-looking syntax right now, but tuple calls got broken in the latest OTP version, permanently…) Some of these calls were slower on older BEAMs, so that is probably why maps were picked for "structs" when maps came out, but nowadays I'm not sure there is a reason to pick maps over tagged tuples for records: in the traditional cases you still write the type when put'ing (%StructName{..|..}), and you could easily come up with something similar for an optimized "get" call. You still have unknown-record fallback capabilities with this as well, even if a touch slower.
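To make the difference between the two lookups concrete, here is a hand-written sketch of roughly what the "direct" and "remote" calls boil down to once the macros have expanded. LookupSketch and its function names are made up for illustration and are not part of the benchmark file; the real BRecord code generates the equivalent through macros.

```elixir
defmodule LookupSketch do
  # Hypothetical module, for illustration only.
  # A tagged tuple {tag, a, b}; the tag doubles as the module to ask for indexes.
  def new(), do: {__MODULE__, 1, 2}

  # Runtime field-name -> tuple-index mapping (what the "remote" style has to call).
  def field_idx(:a), do: 1
  def field_idx(:b), do: 2

  # "direct" style: the index is already a literal in the compiled code.
  def get_b_direct(r), do: elem(r, 2)

  # "remote" style: read the tag, make a remote call on it, then index.
  def get_remote(r, k), do: elem(r, elem(r, 0).field_idx(k))

  # "remote" style put: same extra call before put_elem.
  def put_remote(r, k, v), do: put_elem(r, elem(r, 0).field_idx(k), v)
end
```

The extra `elem(r, 0).field_idx(k)` step is the only structural difference between the two styles in this sketch, which lines up with the remote variants landing at roughly 2x slower in the tables above.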
Still, it was a fun test. I really do wish Elixir structs were tagged tuples underneath, as it would make working in the Erlang ecosystem easier. ^.^
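For anyone wondering what the Erlang-interop point looks like in practice, here is a small sketch of the two representations side by side. InteropSketch and InteropSketch.Point are hypothetical names used only for this example.

```elixir
defmodule InteropSketch.Point do
  # Hypothetical struct, for illustration only.
  defstruct x: 0, y: 0
end

defmodule InteropSketch do
  require Record

  # An Elixir record is a plain tagged tuple, the same shape an Erlang
  # -record(point, {x = 0, y = 0}) would have at runtime.
  Record.defrecord(:point, x: 0, y: 0)

  def record_origin(), do: point()

  # A struct is a map carrying a :__struct__ key.
  def struct_origin(), do: %InteropSketch.Point{}
end
```

Here `InteropSketch.record_origin()` returns the tuple `{:point, 0, 0}`, which Erlang code can match as a record directly, while `InteropSketch.struct_origin()` is a map with `:__struct__` set to `InteropSketch.Point`, which Erlang code would have to treat as a map.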