Elixir vs. Erlang benchmarks - is one faster than the other?

Is there any significant difference in the runtime speed of programs written in Erlang vs. Elixir? What (if any) advantages are there to writing one’s source code in raw Erlang instead of Elixir?

There are no differences; both compile to the same bytecode. Erlang is not more “raw” or low-level than Elixir.
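As a quick illustration of that point (a minimal sketch; the `Greeter` module is a made-up example, not from the thread): an Elixir module compiles to an ordinary BEAM module whose name is a plain atom, exactly like an Erlang module.

```elixir
defmodule Greeter do
  def hello, do: :world
end

# An Elixir module name is just an atom with an "Elixir." prefix;
# the compiled .beam file is the same kind of artifact Erlang produces.
IO.inspect(Greeter == :"Elixir.Greeter")
IO.inspect(Greeter.hello())
```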

15 Likes
  • `for` comprehensions (over lists) are more efficient in Erlang, I think.
  • Enum functions are slower than their Erlang counterparts since they do more work (protocol dispatch).
  • Functions that are thin wrappers around Erlang functions (without @compile {:inline, ...}) require an additional function call, which increases the reduction count and can affect the execution time of that function.
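The wrapper point can be sketched like this (hypothetical `ListWrap` module; a minimal illustration, not code from the thread):

```elixir
defmodule ListWrap do
  # Inlined within this module, so reverse_twice/1 below pays no
  # extra wrapper call; callers from other modules still do.
  @compile {:inline, reverse: 1}

  # A thin wrapper around an Erlang stdlib function.
  def reverse(list), do: :lists.reverse(list)

  def reverse_twice(list), do: list |> reverse() |> reverse()
end
```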
6 Likes

Yes! Elixir is slightly slower because we dispatch to Enum.reduce and Erlang doesn’t perform a remote call.

We inline the most common types, with the list type coming first. So for the built-in types there should be no difference in performance.

@compile {:inline, ...} only works within the same module, so it is not a general mechanism for wrapping Erlang functions. The Elixir compiler, however, does inline most of its calls to :erlang.

So there are tiny differences in some very specific cases, but they are quite unlikely to matter in general.
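For illustration, the two calls below return the same result; the difference is only in how the call is dispatched (a minimal sketch):

```elixir
# Enum.map/2 goes through the Enumerable machinery (with the list case
# special-cased), while :lists.map/2 is a direct remote call into the
# Erlang stdlib.
doubled_ex = Enum.map([1, 2, 3], &(&1 * 2))
doubled_erl = :lists.map(&(&1 * 2), [1, 2, 3])
IO.inspect(doubled_ex == doubled_erl)
```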

21 Likes

Also, Elixir code might sometimes be less efficient because of the way people write it, not because of the language itself.

  • using string concatenation instead of iolists
  • using structs instead of records
  • accessing map elements via . several times instead of using a single pattern match
  • calling Keyword.fetch for every element of an “opts” list instead of iterating over it once
  • in general, trying to use Elixir “imperatively”, producing subpar code (there are many examples of this on this very forum)
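The repeated `.` access point, for instance, might look like this (hypothetical `user` map; a sketch, not code from the thread):

```elixir
defmodule UserFormat do
  # Less efficient: three separate map lookups at runtime.
  def describe_slow(user) do
    "#{user.name} (#{user.age}) from #{user.city}"
  end

  # A single pattern match extracts all three fields at once.
  def describe_fast(%{name: name, age: age, city: city}) do
    "#{name} (#{age}) from #{city}"
  end
end
```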
12 Likes

In general the differences are tiny. They should also shrink in the next OTP release, which includes several compiler patches optimising some idioms common in Elixir.

There are also some cases where the way Elixir programs are usually structured can be more efficient than Erlang programs:

  • accessing map elements via . instead of maps:get calls
  • using binaries for default string type instead of charlists
  • doing some computation at compile-time via macros
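The compile-time point can be as simple as precomputing a lookup table in a module attribute (a minimal sketch; `Squares` is a made-up example):

```elixir
defmodule Squares do
  # Built once at compile time; runtime calls are just a map lookup.
  @squares Map.new(1..10, fn n -> {n, n * n} end)

  def square(n) when n in 1..10, do: Map.fetch!(@squares, n)
end
```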

And there are probably more - it’s not black and white. Overall I wouldn’t expect major performance differences between Erlang and Elixir code, especially if you write it with attention to performance.

5 Likes

I made a fixed version a half-year ago! ^.^

Mine is actually faster than even Erlang’s in some cases (same speed in the average case) because my design requires you to essentially ‘type’ the arguments so it can generate the most efficient code. It was mostly a proof of concept, though (hence why it’s in my ‘proof-of-concept’ repo), and I never finished it up to put it out as its own package. :slight_smile:

But here is the syntax (from this benchmark: https://github.com/OvermindDL1/ex_core/blob/master/bench/comprehension_bench.exs ):

defmodule Helpers do
  use ExCore.Comprehension

  # map * 2

  def elixir_0(l) do
    for x <- l, do: x * 2
  end

  def ex_core_0(l) do
    comp do
      x <- list l
      x * 2
    end
  end

  # Into map value to value*2 after adding 1

  def elixir_1(l) do
    for x <- l,
        y = x + 1,
        into: %{},
        do: {x, y * 2}
  end

  def ex_core_1(l) do
    comp do
      x <- list l
      y = x + 1
      {x, y * 2} -> %{}
    end
  end
end

I’m actually surprised that it inlines at the compiler step instead of just making those calls defmacro delegations or so… Seems like a duplication of effort? (Though I’m betting the compiler-level inlining was added well before macros were. ^.^)

Even the standard Elixir to_string/1 can take an iolist and put out a binary just fine, so there is usually no reason to concatenate. In some cases binary appending is more efficient, however.
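A minimal sketch of that point - nested iolists are cheap to build and only flatten into a binary at the very end:

```elixir
# Build output as an iolist (cheap appends, no intermediate copies)...
iodata = ["Hello", ?\s, ["world", "!"]]

# ...and flatten to a binary only when one is actually needed.
IO.inspect(to_string(iodata))
IO.inspect(IO.iodata_to_binary(iodata))
```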

This speed difference should be reduced in OTP 21 (I think it was), though there will still be a difference, just not ‘as’ big. :slight_smile:

It would be interesting if Elixir didn’t use . for maps and instead had built-in lens/prism functionality that ended up generating matchers.

Yeah the default access pattern of opts[:blah] is definitely not the most efficient by far… (and yet I do it a lot due to its simplicity) ^.^;

Back in my ol’ Erlang days I had a fairly simple library, similar to the commandline-opt-parse module in Erlang, that did a single pass over the options and spat back out a tuple (based on the requested values) after fixing them up into a normalized output based on what I wanted, or else logged or threw an error, etc. (Erlang-style proplists are so much nicer than Elixir-style keyword lists.) I really should rewrite it to handle Elixir’s more limited keyword lists sometime; it would be a smaller implementation for sure.

+++

Doesn’t . delegate to maps:get, or does it go to some internal implementation to match the value out (and shouldn’t that then be the same speed as maps:get, unless it were inlined into the module)?

Ooo, is the new OTP version going to make iterating over binaries faster than lists? Lists have always been so much faster than binaries for so many things ever since I started Erlang (though significantly larger). :slight_smile:

This, this, this is a big thing. Erlang had its share of macro-like things but they are a bit of hell to write, especially compared to Elixir macros.

+1 They still generate the same code to the same VM. ^.^

2 Likes

Tell me more. I’m new to Elixir and all of the learning materials I’m using focus on structs. Didn’t realize Records were a thing until today. What are the pros/cons and why is there a performance benefit to Records over Structs/Maps?

See

http://erlang.org/doc/reference_manual/records.html
https://groups.google.com/d/msg/elixir-lang-talk/6kn7J2XnFg8/n-r3NUaCHwAJ

I think they are implemented slightly differently in elixir, though.
https://hexdocs.pm/elixir/Record.html

But structs are generally much nicer to work with because they are maps and not tuples. For example, IIRC cowboy 2.0 moved from using records to represent its Req object to maps. But from reading its (as well as ranch and cowlib) source code I got a feeling that cowboy was striving for completeness and ease of use rather than performance.

2 Likes

foo.bar expands to:

case foo do
  %{bar: bar} -> bar
  %{} -> :erlang.raise({:badkey, :bar, foo})
  _ -> apply(foo, :bar, [])
end

I was mostly writing about memory size, which has some less direct effects on applications: triggering less GC, sending binaries between processes without copying, allowing more structural sharing with sub-binaries, etc. The difference is actually not that big on OTP 20 already when iterating over bytes (I think there are some things that should make binaries even faster on OTP 21).

Operating System: macOS
CPU Information: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.7.0-dev
Erlang 20.3
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
parallel: 1
inputs: none specified
Estimated total run time: 21 s

Name                  ips        average  deviation         median         99th %
list             265.45 K        3.77 μs    ±19.34%        3.60 μs        6.90 μs
binary_byte      210.06 K        4.76 μs    ±25.28%        4.40 μs        8.60 μs
binary_utf8       93.43 K       10.70 μs    ±30.39%          10 μs          20 μs

Comparison:
list             265.45 K
binary_byte      210.06 K - 1.26x slower
binary_utf8       93.43 K - 2.84x slower

Benchmark code: https://gist.github.com/michalmuskala/35ba70cec520e704233993cc4c575141
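The three benchmarked iteration styles presumably look roughly like the recursive sketches below (my reconstruction, not the gist’s exact code):

```elixir
defmodule Iterate do
  # Element by element over a (char)list.
  def list([h | t], acc), do: list(t, acc + h)
  def list([], acc), do: acc

  # Byte by byte over a binary.
  def binary_byte(<<b, rest::binary>>, acc), do: binary_byte(rest, acc + b)
  def binary_byte(<<>>, acc), do: acc

  # Codepoint by codepoint (UTF-8 decoding on every step).
  def binary_utf8(<<cp::utf8, rest::binary>>, acc), do: binary_utf8(rest, acc + cp)
  def binary_utf8(<<>>, acc), do: acc
end
```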

2 Likes

The Google Groups thread seems to imply that structs/maps will be more performant when doing the typical pattern match, and have benefits with regard to polymorphism and protocol support. It seems the only thing records excel at is access within tight loops.

If I understand this correctly, it seems that Structs would outperform Records for the vast majority of general state holding problems. Like the state of a Cowboy Req object and nearly every GenServer.

Am I misreading this?

Like the state of a Cowboy Req object and nearly every GenServer.

Changing deeply nested values in tuples is more expensive, I think, than in maps. That’s the only performance benefit maps have over tuples that I know of.

If you only build a tuple once (like it could be done with cowboy req), and then use it just to read values from it, it should be more efficient than a map.

You can benchmark your own use case with benchee both using maps and tuples and see for yourself.
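A quick-and-dirty version of that comparison using :timer.tc (Benchee gives far better statistics with warmup and deviation; this is only a sketch, and the field names are made up):

```elixir
map = %{a: 1, b: 2, c: 3}
tuple = {:rec, 1, 2, 3}

# Time 100k repetitions of a zero-argument function, in microseconds.
time = fn fun ->
  {micros, _result} =
    :timer.tc(fn -> Enum.each(1..100_000, fn _ -> fun.() end) end)

  micros
end

map_us = time.(fn -> Map.fetch!(map, :b) end)
tuple_us = time.(fn -> elem(tuple, 2) end)
IO.puts("map: #{map_us}µs, tuple: #{tuple_us}µs")
```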

According to José (in 2015 anyway)

    defrecord Student, [:first_name, :last_name]

    def name(record) do
      "#{record.first_name} #{record.last_name}"
    end

The function above would also work on any record that contains the fields first_name and last_name. Its implementation relies on the fact we can call a function on a tuple like {Student, "josé", "valim"}. So when you defined a record, we automatically defined functions for all of its fields. This added runtime behaviour to records but, because this relies on a tuple dispatch, it was actually slower than accessing or matching fields in a map!

This added runtime behaviour to records but, because this relies on a tuple dispatch, it was actually slower than accessing or matching fields in a map!

I don’t think it works like this anymore. You can extract the values you need in a single pattern match.

def name(student(first_name: first_name, last_name: last_name)) do
  # ...
end

unless Elixir does something funny, it should be the same as

def name({:student, first_name, last_name}) do
  # ...
end

This kind of pattern matching (first element is an atom) is somewhat optimized (since records are used quite often in Erlang) and, at least in my experience, much faster than pattern matching maps.
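A runnable version of that sketch using Elixir’s Record module (the `Student`/`student` names are illustrative):

```elixir
defmodule Student do
  require Record
  Record.defrecord(:student, first_name: nil, last_name: nil)

  # Extract both fields in a single tuple pattern match.
  def name(student(first_name: first, last_name: last)) do
    "#{first} #{last}"
  end
end

# Underneath, the record is just the tuple {:student, "José", "Valim"}.
require Student
s = Student.student(first_name: "José", last_name: "Valim")
IO.inspect(Student.name(s))
```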

That is why we should get different syntax for different kinds of access - what about using the OCaml-y # for struct access? (Not literally proposing #, since it is a comment character; just an example of using different syntax for different types. If Elixir were actually typed, you could keep the same syntax, since there would be no ambiguity then.) :slight_smile:

OCaml uses M.f to access a module namespace (capitalized initial character, like Elixir), and it uses s.f to access a struct (when the left side does not start with a capitalized character, also like Elixir). It uses a.[0] for array access and o#m for object access, among a few others. You can easily tell the type of something by the operators used, even without type declarations. :slight_smile:

Ah yes, entirely true. :slight_smile:

Would be nice. Though I did some testing not long ago iterating over a charlist versus over a binary, and at the time it was faster for me to :erlang.binary_to_list and operate over the list than it was to iterate over the binary (it was not super-long stuff, like 30 chars or so).

Ah yep, your results are about on par with the differences that I saw too.

Hmm, I wouldn’t think that would be true, but I’d need to benchmark… A record is nothing but a tuple, so a RecordName(blah, blee, blorp) literally is just {RecordName, blah, blee, blorp}, the 4-tuple, and tuple access and matching is FAST, like nothing-faster kind of fast. However, rebuilding and updating it I could see being slower than updating maps for larger records/tuples (though not smaller ones).

Not deeply nested, but rather larger tuples in general.

Yep, Tuples are O(1) access, Maps are O(N).

That was purely because Elixir used a really, really bad access pattern for records. If it had instead copied the OCaml way (which requires types) or the Haskell way (which, interestingly, does not require types), then it would be perfectly efficient. But exposing it as . access was really bad, and I don’t know why that would ever have been chosen for records.

Those would be identical I’d think; that is how Erlang compiles it anyway.

Very.

2 Likes

Is there official Elixir documentation that enumerates the time complexity for Elixir’s container types? I was going through the docs for both Record and Map and neither seems to state the time complexity of their various functions.

Also, does function head matching have the same performance? The info on reddit seems to imply that matches are always O(1).

def hello(%{name: name}), do: IO.puts "Hi #{name}"

Is matching and extracting on the :name key here O(1) or O(n)?

Even if you were just matching and not extracting I’d expect things to be the same.

def hello(%{name: "Jim"}), do: IO.puts "Love long and prosper."

If not slightly worse, since I’d expect there’s an additional comparison check.

Maps actually have two representations depending on their size. Small maps (up to 32 keys) are represented as two sorted arrays - one for keys and one for values. Accessing a key in such a map involves a linear search (and a linear search is much faster than a binary search at such small sizes). Depending on how you look at it, you might say that’s O(n) since it depends on the map size, or O(1) since the worst case is O(32), which is O(1) - big O notation is not very helpful when talking about small data structures; with cache locality and branch prediction effects, you can get significantly different results from what a simple complexity analysis would suggest.

Large maps are represented as a HAMT and access is O(log n) - the branching factor of the maps is 32, so even very big maps are rather shallow and rarely involve more than 3 levels (32k items).

Anyway, maps are superior to me because of one simple thing - I can read them. Records are terrible in that regard - you get a tagged tuple and you have no idea what the fields mean. Debugging with records is a significantly worse experience and takes much longer. I would reach for records in the tightest loops of the program, where I need some “lolspeed”, but for everything else, maps and structs are plenty fast.
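The readability difference is easy to see in the shell (illustrative data, not from the thread):

```elixir
# A record in a crash log is just a bare tagged tuple...
record = {:user, "ann", 30, :active, nil}

# ...while the equivalent map carries its field names with it.
map = %{name: "ann", age: 30, status: :active, avatar: nil}

IO.inspect(record)
IO.inspect(map)
```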

13 Likes

Thank you for clarifying, Michal - I was getting a mild heart attack reading that map access was O(N).

Comparing the speed of deeply nested tuples/records and flat maps/structs seems a bit strange. However you look at it, with records you access a specific field in the tuple, while with maps you have to search for the right field. Check @michalmuskala’s description of the internals.

Yes, using records/tuples for small sizes, currently less than 32, is faster than maps, but maps are definitely faster for large sizes. It all depends on how big your records/structs are. Maps/structs do use more memory than tuples/records as they include the names of the keys, though that definitely does make them more readable at runtime.

4 Likes