Performance with Explorer scales linearly with size and then suddenly degrades on large files

betoparcus · January 15, 2024, 10:18am

I was playing around with some real-life solutions for 1BRC using Elixir and external libraries such as Flow and Explorer, and found that Explorer’s performance is great but gets more than 10x worse when going from 500 million lines (or fewer) to 1 billion.

Here’s the code with some comments:

defmodule WithExplorer do
  # Results
  # [
    # 1_000_000_000: 675483.000ms,
    #   500_000_000: 58244.713ms,
    #   100_000_000: 10321.046ms,
    #    50_000_000: 5104.949ms,
  # ]
  require Explorer.DataFrame
  alias Explorer.{DataFrame, Series}

  @filename "./data/measurements.txt"

  def run() do
    # Read the headerless, semicolon-delimited file, then aggregate per station.
    results = @filename
    |> DataFrame.from_csv!(header: false, delimiter: ";", eol_delimiter: "\n")
    |> DataFrame.group_by("column_1")
    |> DataFrame.summarise(min: Series.min(column_2), mean: Series.mean(column_2), max: Series.max(column_2))
    |> DataFrame.arrange(column_1)

    # for idx <- 0..(results["column_1"] |> Series.to_list() |> length() |> Kernel.-(1)) do
    #   "#{results["column_1"][idx]}=#{results["min"][idx]}/#{:erlang.float_to_binary(results["mean"][idx], decimals: 2)}/#{results["max"][idx]}"
    # end
  end
end

What I observe is that the CPUs are still busy but not fully utilized, and suddenly a lot of disk I/O shows up. I have some idea of what might be happening and wonder if there is a way to control this behavior from the high-level API or by compiling Explorer with some Polars-specific options.

josevalim · January 15, 2024, 10:21am

Probably the data no longer fits in memory and then it is using disk swap? If that’s the case, that’s happening at the operating system level, so there isn’t much to control.

However, you can pass the :lazy option to from_csv and then call collect to perform the operation at once. It should go easier on the memory usage.
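
A minimal sketch of what that could look like, adapted from the code in the first post. It assumes Explorer ~> 0.8 and the same headerless "station;temperature" layout; the module name WithExplorerLazy and the path argument are only for illustration.

```elixir
# Sketch only: assumes Explorer ~> 0.8 and a headerless "station;temperature" file.
Mix.install([{:explorer, "~> 0.8"}])

defmodule WithExplorerLazy do
  require Explorer.DataFrame
  alias Explorer.{DataFrame, Series}

  # `path` is a parameter (hypothetical) so the sketch can be tried on a small sample.
  def run(path) do
    path
    # lazy: true returns a lazy data frame: operations are queued, not run immediately
    |> DataFrame.from_csv!(header: false, delimiter: ";", eol_delimiter: "\n", lazy: true)
    |> DataFrame.group_by("column_1")
    |> DataFrame.summarise(
      min: Series.min(column_2),
      mean: Series.mean(column_2),
      max: Series.max(column_2)
    )
    |> DataFrame.arrange(column_1)
    # collect/1 runs the queued operations and returns an eager data frame
    |> DataFrame.collect()
  end
end
```

Calling WithExplorerLazy.run("./data/measurements.txt") should return the same three aggregate columns as the eager version. How much memory it saves depends on the machine and the Explorer version.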

betoparcus · January 15, 2024, 2:24pm

Thank you, José. Enabling :lazy cut the time in half. Your suggestion also made me read the docs with more attention and I found that I could set the floats to f32 instead of using f64, which had been automatically inferred.
This made the computation light enough to fit in memory and go even faster, regardless of lazy mode.
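
For reference, one billion 64-bit floats take about 8 GB for the values alone, and the same column as 32-bit floats about 4 GB. A minimal sketch of setting the dtype at read time follows. It assumes Explorer ~> 0.8, where the 32-bit float dtype is written {:f, 32} (other versions may spell it differently), and the module name WithExplorerF32 and the path argument are only for illustration.

```elixir
# Sketch only: assumes Explorer ~> 0.8 and a headerless "station;temperature" file.
Mix.install([{:explorer, "~> 0.8"}])

defmodule WithExplorerF32 do
  require Explorer.DataFrame
  alias Explorer.{DataFrame, Series}

  # `path` is a parameter (hypothetical) so the sketch can be tried on a small sample.
  def run(path) do
    path
    |> DataFrame.from_csv!(
      header: false,
      delimiter: ";",
      eol_delimiter: "\n",
      # Override the inferred f64 for the temperature column;
      # with header: false the columns are named column_1, column_2, ...
      dtypes: [{"column_2", {:f, 32}}]
    )
    |> DataFrame.group_by("column_1")
    |> DataFrame.summarise(
      min: Series.min(column_2),
      mean: Series.mean(column_2),
      max: Series.max(column_2)
    )
    |> DataFrame.arrange(column_1)
  end
end
```

The trade-off is precision: f32 carries roughly 7 significant decimal digits, which is enough for one-decimal temperatures but worth keeping in mind for other data.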

Results:

Reading and aggregating 1 Billion Lines with Explorer
- Eager (f64): 675483.00ms
- Lazy (f64):  389491.00ms
- Lazy (f32):   53575.23ms
- Eager (f32):  55091.87ms

polypush135 · January 15, 2024, 2:27pm

You got to love a forum that you can just happen to stumble into and read a post about making “checks notes” … a billion row csv “checks notes again”… run faster.

I love this place

Edit: Oh dang also welcome new user @betoparcus

betoparcus · January 15, 2024, 2:38pm

Thanks! Have been a reader for several years but this might indeed be my first post.

“I love this place”

1 billion % agree.

polypush135 · January 15, 2024, 2:42pm

What kind of hardware specs are getting you under a minute?

Me: “slaps hood” this baby can go from 0 to a billion in under a minute…

betoparcus · January 15, 2024, 2:50pm

Good old gaming computer, but I’d love to try on the M1 too.

OS Name: Microsoft Windows 11 Home (WSL)
System Model: X570 AORUS ELITE WIFI
Processor: AMD Ryzen 9 3900X 12-Core Processor, 3801 MHz, 12 Core(s), 24 Logical Processor(s)
Installed Physical Memory (RAM): 32.0 GB

stefanluptak · January 15, 2024, 3:08pm

If you can provide the data and the code, I can benchmark it on M1 Max with 32GB RAM for you.

betoparcus · January 15, 2024, 4:40pm

Thanks, Stefan. I just uploaded the code and the generator, with some comments: GitHub - rparcus/ex_1brc

stefanluptak · January 16, 2024, 9:26am

36130.104

$ elixir -v
Erlang/OTP 26 [erts-14.2.1] [source] [64-bit] [smp:10:10] [ds:10:10:10] [async-threads:1] [jit]

Elixir 1.16.0 (compiled with Erlang/OTP 26)

Hermanverschooten · January 16, 2024, 2:18pm

45592.203 on Mac Mini M2 Pro but with only 16GB of memory.