System-specific performance issue I found with Benchee (FreeBSD, i7-9800X), could use some help

I am getting very unusual numbers from Benchee, the Elixir benchmarking package, on specific systems, and I am trying to figure out whether it is an issue with the package or something more pervasive. I am planning to use a free i7-9800X as a development server, but when I ran a simple ETS performance test the results were extremely poor compared to other systems. I’ll show the code at the end, but here are the results first (from my Linux laptop):

CPU Information: Intel(R) Core™ i5-8350U CPU @ 1.70GHz
Number of Available Cores: 8

Name                     ips        average    deviation         median         99th %
term_to_binary        9.96 M      100.41 ns     ±64.64%          99 ns         115 ns
lookup                1.04 M      958.10 ns   ±1330.11%         856 ns        1463 ns
insert                0.71 M     1400.61 ns   ±1077.40%        1225 ns        1931 ns

Now here are the results from my i7-9800X:

CPU Information: Intel(R) Core™ i7-9800X CPU @ 3.80GHz

Name                     ips        average    deviation         median         99th %
term_to_binary      218.09 K        4.59 μs     ±26.93%        4.50 μs        4.54 μs
lookup              215.76 K        4.63 μs     ±79.81%        4.50 μs           9 μs
insert              214.63 K        4.66 μs     ±71.99%        4.50 μs           9 μs

As you can see, term_to_binary is roughly 45x slower, and the ETS inserts and lookups are roughly 3x to 5x slower. This is clearly a platform-specific issue, and today I was able to test it on multiple platforms. Two systems appear to be affected (i.e. have very low results): this i7-9800X and an i9-7920X, both running FreeBSD, with Elixir installed from both packages and ports.

The systems that are not affected are a Ryzen 7 3800X, an i5-9500, an i5-8400T, an i5-8350U, and an i5-5300U. All use the same version of FreeBSD (and in some cases the same exact binaries), except for my laptop (8350U) which runs Linux. I’ve also recreated the app from scratch to make sure it wasn’t a build or dependency issue.

Yes, I have an unusually large number of strange systems LOL, but the Intel X299 / HEDTs are really my favorite and the ones that I most enjoy using. My concern is that I may be seeing an issue with them that makes them unsuitable for running Elixir in development / production, but I know that it’s too soon to say that definitively. Thanks for the help!

defmodule Warp.Test.EtsTest do

  alias Warp.Ets

  @db :ets

  def run_test do
    # Pre-populate the table with one million entries.
    list = Enum.to_list(1..1_000_000)

    for x <- list do
      Ets.put(@db, {x, x * 33})
    end

    # Each job draws random keys, so the cost of Enum.random/1 is included.
    Benchee.run(
      %{
        "insert" => fn -> Ets.put(@db, {Enum.random(0..999_999), Enum.random(0..999_999)}) end,
        "lookup" => fn -> Ets.get(@db, Enum.random(0..999_999)) end,
        "term_to_binary" => fn -> :erlang.term_to_binary("8675309") end
      },
      time: 3,
      memory_time: 1
    )
  end
end

I was able to find and resolve the issue. Here is what’s happening: the function erlang:monotonic_time() is much slower on some platforms than on others, and by default Benchee doesn’t take this into account. Benchmarks can be improved by adding the option “measure_function_call_overhead: true” to the Benchee.run call. With that option, Benchee measures the time of an empty function call and deducts it from the benchmark, which helps because the empty function call also includes a call to monotonic_time.
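
For anyone who wants to try this, here is a minimal sketch of passing the option. It assumes Benchee is already a dependency of your project; the module name OverheadAwareBench and its functions are made up for illustration, and the time / memory_time values are simply the ones from my test above.

```elixir
# Sketch only: assumes Benchee is already a dependency of the project.
# OverheadAwareBench is a hypothetical module name used for illustration.
defmodule OverheadAwareBench do
  # Same options as the original test, plus the overhead measurement.
  @opts [time: 3, memory_time: 1, measure_function_call_overhead: true]

  # Exposes the options so they can be inspected or reused.
  def opts, do: @opts

  # `jobs` is a map of name => zero-arity function, as passed to Benchee.run/2.
  def run(jobs) when is_map(jobs), do: Benchee.run(jobs, @opts)
end

# Usage (requires Benchee to be available):
# OverheadAwareBench.run(%{
#   "term_to_binary" => fn -> :erlang.term_to_binary("8675309") end
# })
```

The only change compared to my original call is the extra key in the options list; the jobs themselves stay the same.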

My suggestion to anyone who uses Benchee to measure fast operations is to consider that you may just be benchmarking the monotonic_time call, and to try using the “measure_function_call_overhead: true” option.
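
If you want a rough idea of how expensive the timer is on a given machine, something like the following sketch may help. It uses only :erlang.monotonic_time/0 and :erlang.convert_time_unit/3, with no dependencies; the module name TimerCost and the default of one million calls are my own choices, and the figure it reports also includes a small amount of loop overhead, so treat it as an estimate.

```elixir
# TimerCost is a hypothetical helper, not part of Benchee or OTP.
defmodule TimerCost do
  # Estimates the average cost, in nanoseconds, of one
  # :erlang.monotonic_time/0 call from `calls` back-to-back calls.
  def ns_per_call(calls \\ 1_000_000) when is_integer(calls) and calls > 0 do
    start = :erlang.monotonic_time()
    burn(calls)
    stop = :erlang.monotonic_time()

    # Timestamps are in native units; convert the difference to nanoseconds.
    elapsed_ns = :erlang.convert_time_unit(stop - start, :native, :nanosecond)
    elapsed_ns / calls
  end

  # Calls the timer `n` times and discards the results.
  defp burn(0), do: :ok

  defp burn(n) do
    _ = :erlang.monotonic_time()
    burn(n - 1)
  end
end

IO.puts("monotonic_time: ~#{Float.round(TimerCost.ns_per_call(), 1)} ns per call")
```

If the number printed is in the microsecond range, results for operations that take well under a microsecond are probably dominated by the timer rather than by the code being measured.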