@gregvaughn is absolutely right that correctness is the first priority. After that if you’re not satisfied with performance you need to measure, measure, measure.
Thanks to your question I took the opportunity to try the bmark benchmarking tool. Here’s a quick benchmark test I wrote based on the direction in the project’s README and your suggested expressions.
```elixir
defmodule StringContenation do
  use Bmark

  alias IO.ANSI

  bmark :binary_concat_operator, runs: 1000 do
    (ANSI.red <> "red" <> ANSI.reset <> ANSI.green <> "green" <> ANSI.reset)
    |> IO.puts
  end

  bmark :erlang_list_to_binary, runs: 1000 do
    :erlang.list_to_binary([ANSI.red, "red", ANSI.reset, ANSI.green, "green", ANSI.reset])
    |> IO.puts
  end
end
```
I’m piping to `IO.puts` here so we can compare these approaches against IO lists. By default bmark does 20 runs per benchmark block, but I’ve increased that to 1000 to get more statistically significant results.
Next I run `mix bmark`. As expected, I see a bunch of `redgreen` lines printed and colored appropriately.
The benchmark timing data are in the `$PROJ_ROOT/results/` directory. To compare the two implementations I run:

```
mix bmark.cmp results/stringcontenation.binary_concat_operator.results \
              results/stringcontenation.erlang_list_to_binary.results
```
And the results!

```
76.677 -> 142.257 (+85.53%) with p < 0.025
t = 2.2981012392583193, 1998 degrees of freedom
```
In this particular example, `:erlang.list_to_binary`, on average, took 1.8553 times as long as `<>`. Don’t interpret this to mean you should always use `<>`: the inputs in these examples are rather small. The README for the bmark project has a bit more detail on how to interpret these results.
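For context, both expressions build the exact same flat binary; only the construction path differs. A quick sketch to convince yourself of that (using `IO.ANSI` from the standard library, same as the benchmarks):

```elixir
alias IO.ANSI

# Concatenating binaries with <> ...
concat = ANSI.red <> "red" <> ANSI.reset

# ... and collapsing a list of binaries with :erlang.list_to_binary
from_list = :erlang.list_to_binary([ANSI.red, "red", ANSI.reset])

concat == from_list  # => true
```

So the question the benchmark answers is purely about construction cost, not about what gets printed.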
Now let’s try benchmarking a couple more approaches:

```elixir
  bmark :io_list_all_known_data, runs: 1000 do
    [ANSI.red, "red", ANSI.reset, ANSI.green, "green", ANSI.reset]
    |> IO.puts
  end

  bmark :io_list_append_to_end, runs: 1000 do
    # Simulates converting an incoming stream of unknown data
    # into an IO list and printing.
    [[[[[[ANSI.red], "red"], ANSI.reset], ANSI.green], "green"], ANSI.reset]
    |> IO.puts
  end
```
Let’s compare each of these new approaches against the `<>` benchmark:

```
65.195 -> 68.618 (+5.25%) with p < 1
t = 0.48428845990199687, 1998 degrees of freedom
```
`<>` is still faster, although we have little confidence in that result because the p-value is as high as it can get!
```
65.195 -> 125.968 (+93.22%) with p < 0.025
t = 2.1281669445218134, 1998 degrees of freedom
```
`<>` crushes the deeply nested IO list, and we can be pretty confident about that for these particular inputs on my test machine. However, that may not hold for concatenating sequences of strings in general. To check, you’d need to do a lot more benchmarks with different kinds of inputs, ideally on several different kinds of machines.
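One thing worth remembering when interpreting these numbers: the selling point of IO lists is that the runtime never needs to flatten them until output time, if at all. Nesting depth doesn’t change what gets written. A small sketch with made-up data (not the benchmark inputs):

```elixir
flat   = ["a", "b", "c"]
nested = [[["a"], "b"], "c"]

# IO functions accept both shapes; flattening only happens
# if you explicitly convert to a binary.
IO.iodata_to_binary(flat) == IO.iodata_to_binary(nested)  # => true
```

So the cost being measured in `:io_list_append_to_end` is the cost of building and traversing the nested structure, not a difference in output.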
Exercises for the reader
- Determine how long IO lists have to get before they beat `<>`, if at all.
- Does size of the binaries within the lists or the nesting depth affect the results, and if so how?
- Are any of these expressions reduced at compile-time? We’re benchmarking run-time speeds, so any work done by the compiler to optimize our inputs may explain surprising results.
- Do you suspect Elixir, Erlang, or any part of your cache hierarchy is giving you unrealistic results? How can you tell? Could you somehow reduce the influence of caching?
- How well do your benchmarks correspond to the way your code will be exercised in production? Do you have any data about production response times?
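For the first exercise, here’s one rough starting point, not a rigorous benchmark: probe both strategies as the number of fragments grows, using `:timer.tc` rather than bmark (the module and function names here are my own, not part of any library):

```elixir
defmodule GrowthProbe do
  # Rough probe: microseconds to build one binary from `size` fragments,
  # via repeated <> versus a single IO-list flatten.
  def probe(sizes \\ [10, 100, 1_000, 10_000]) do
    for size <- sizes do
      parts = List.duplicate("x", size)

      # Repeated binary concatenation, one fragment at a time.
      {concat_us, _} =
        :timer.tc(fn -> Enum.reduce(parts, "", &(&2 <> &1)) end)

      # Build the IO list's final binary in one pass.
      {iolist_us, _} =
        :timer.tc(fn -> IO.iodata_to_binary(parts) end)

      {size, concat_us, iolist_us}
    end
  end
end
```

Single `:timer.tc` calls are noisy, so treat the output as a hint about where the crossover might be, then confirm with proper bmark runs at those sizes.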