To clarify, it depends on the .exs file, but for other reasons besides compilation.
Running elixir path/to/exs won’t have any protocol consolidated. Running inside a Mix project with mix run path/to/exs (and also in a release) may be more performant depending on the idioms used. I don’t think it would in this case though.
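To see which situation a given script is in, one option is to ask the runtime directly. The following is a minimal sketch, assuming only the standard library's `Protocol.consolidated?/1`; the module name `ConsolidationCheck` is made up for illustration. Saved as an .exs file, it can be run once with elixir and once with mix run inside a project to compare the two.

```elixir
# Hypothetical helper module (not from the thread): reports whether a few
# built-in protocols are consolidated in the current runtime.
defmodule ConsolidationCheck do
  def report(protocols \\ [Enumerable, String.Chars, Inspect]) do
    # Protocol.consolidated?/1 returns true or false for each protocol.
    Map.new(protocols, fn protocol -> {protocol, Protocol.consolidated?(protocol)} end)
  end
end

IO.inspect(ConsolidationCheck.report(), label: "consolidated?")
```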
Yeah, I can confirm that running mix run vs elixir doesn’t appear to make a difference, at least enough that it’s measurable in this simplistic benchmark.
The single outlier at 24s can pretty safely be ignored.
But again, none of this is a proper benchmark. All we’ve shown is that Elixir can be written to perform decently in a text processing script. The Ruby 2.6 version from the original 2019 article took 37s, meaning we’re quite a lot faster than Ruby at this point.
Nice! That’s a really cool solution, and it probably scales well! It fails the first requirement though: the input should be piped in. Reading from a file is generally faster than reading from stdin (e.g. replace IO.binstream with File.read and it runs faster) and gives you much more control, like you show in your code.
You can speed up your version quite a bit if you switch to iolists, instead of printing line by line.
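For anyone unsure what the iolist suggestion looks like in practice, here is a rough sketch. The module name `IolistOutput`, the sample data, and the tab-separated output format are all assumptions for illustration, not the code under discussion. The idea is to build one nested list of binaries and characters and hand it to a single write call.

```elixir
# Hypothetical module: renders {word, count} pairs as iodata.
defmodule IolistOutput do
  # Returns a nested list (iodata); no binaries are concatenated here.
  def render(counts) do
    Enum.map(counts, fn {word, count} ->
      [word, ?\t, Integer.to_string(count), ?\n]
    end)
  end
end

# Made-up sample data standing in for the real word counts.
counts = [{"the", 3}, {"elixir", 2}]

# One write for the whole report instead of one IO.puts per line.
IO.binwrite(:stdio, IolistOutput.render(counts))
```

The device accepts the nested list as is, so the full output string never has to be assembled in memory by the script.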
Hi @jola, did you try getting rid of the :line option and processing chunk by chunk? What I’ve seen so far (more details here: Streaming lines from an enum of chunks) is that this
It does. Like I said in the article, I didn’t spend too much time messing around with the chunk size, but that number gave a decent performance improvement compared to reading line by line with that input on my machine.
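To make the chunk-by-chunk idea concrete: the main wrinkle is that a chunk boundary can fall in the middle of a word, so the tail of each chunk has to be carried into the next one. Below is a minimal sketch of that, not the code from the article. The module name `ChunkedCount` and the 102_400 byte chunk size in the usage comment are assumptions, and words are split on plain ASCII whitespace only.

```elixir
# Hypothetical module: counts words in an enumerable of binary chunks.
defmodule ChunkedCount do
  @separators [" ", "\n", "\t", "\r"]

  def count(chunks) do
    {counts, carry} =
      Enum.reduce(chunks, {%{}, ""}, fn chunk, {counts, carry} ->
        parts = :binary.split(carry <> chunk, @separators, [:global])
        # The last part may be a word cut off by the chunk boundary,
        # so it is held back and prepended to the next chunk.
        {words, [rest]} = Enum.split(parts, -1)
        {add_words(counts, words), rest}
      end)

    # Whatever is left after the final chunk is a complete word (or empty).
    add_words(counts, [carry])
  end

  defp add_words(counts, words) do
    Enum.reduce(words, counts, fn
      "", acc -> acc
      word, acc -> Map.update(acc, word, 1, &(&1 + 1))
    end)
  end
end

# Possible usage with piped input (chunk size is an assumed value):
#   IO.binstream(:stdio, 102_400) |> ChunkedCount.count()
```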
Also, switching to reading the file directly would be faster, but the original article used stdin in all examples, so it would not be a fair comparison.
Take a look at @evadne’s solution for one that optimizes reading over 4 workers if you’re curious how fast it can be reading directly from file.
That’s awesome! I think @jola got hers down to 7-ish seconds in her talk (if I’m remembering correctly)? But it would be neat to throw yours onto a 16-core machine and see how it does
I ran @JEG2’s version locally and it came in at 9.5s. Here’s the comparison:
➜ time elixir wp_parallel.exs < ./words.txt > /dev/null
elixir wp_parallel.exs < ./words.txt > /dev/null 27.14s user 1.79s system 302% cpu 9.550 total
➜ time elixir wp_single.exs < ./words.txt > /dev/null
elixir wp_single.exs < ./words.txt > /dev/null 10.28s user 2.91s system 103% cpu 12.739 total
You can see that the parallel version uses much more CPU, but it brings the wall-clock time down by ~25% (12.7s to 9.6s). It is a cool approach to read in parallel. I tried to read everything at once and then spawn out the processes, which was slower than I expected.
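For reference, the "read first, then fan out" shape can be sketched roughly like this. It is not the code from either script; the module name `ParallelCount` is made up, and it assumes every chunk already ends on a word boundary. Each chunk is counted in its own task and the partial maps are merged in the calling process.

```elixir
# Hypothetical module: counts words across chunks with one task per chunk.
defmodule ParallelCount do
  def count(chunks, workers \\ System.schedulers_online()) do
    chunks
    |> Task.async_stream(&count_words/1, max_concurrency: workers, ordered: false)
    |> Enum.reduce(%{}, fn {:ok, partial}, acc ->
      # Merging partial maps is sequential work done by the caller.
      Map.merge(acc, partial, fn _word, a, b -> a + b end)
    end)
  end

  defp count_words(text) do
    text |> String.split() |> Enum.frequencies()
  end
end

# Made-up sample chunks, each ending on a word boundary.
IO.inspect(ParallelCount.count(["a b a\n", "b c\n"]))
```

In a shape like this, each chunk is copied to its worker and each partial map is copied back, and the merge itself is not parallel, which may be part of why the speedup stays well below the extra CPU spent.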
@darinwilson running with > /dev/null gets it down lower
➜ elixirconf time elixir lib/direction3/async_stream.ex < ../words.txt > /dev/null
elixir lib/direction3/async_stream.ex < ../words.txt > /dev/null 17.79s user 3.08s system 380% cpu 5.484 total
and then you can cheat a bit
➜ elixirconf time elixir --erl "+hms 500000000" lib/direction3/async_stream.ex < ../words.txt > /dev/null
elixir --erl "+hms 500000000" lib/direction3/async_stream.ex < ../words.txt > 16.26s user 2.82s system 420% cpu 4.543 total
For those keeping score at home, 4.5s is almost as fast as C (admittedly it uses 4x the CPU).
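Some context on the cheat: +hms is the emulator flag that sets the default initial heap size of every process, measured in words, so processes start with a large heap instead of growing it through repeated garbage collections. A similar effect can be had for a single process with the :min_heap_size spawn option. The sketch below assumes that approach; the module name `HeapDemo` and the 100_000 word figure are illustrative only.

```elixir
# Hypothetical module: runs a function in a process with a larger initial heap.
defmodule HeapDemo do
  # Spawns a linked process whose heap starts at (at least) min_heap_words
  # words, runs fun in it, and returns the result to the caller.
  def run_with_heap(fun, min_heap_words) do
    parent = self()
    ref = make_ref()

    Process.spawn(
      fn -> send(parent, {ref, fun.()}) end,
      [:link, min_heap_size: min_heap_words]
    )

    receive do
      {^ref, result} -> result
    end
  end
end

# The VM-wide default that +hms would change, in words.
{:min_heap_size, default_words} = :erlang.system_info(:min_heap_size)
IO.puts("default min heap: #{default_words} words")

# 100_000 words is an arbitrary value chosen for the demo.
IO.inspect(HeapDemo.run_with_heap(fn -> Enum.sum(1..1_000) end, 100_000))
```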
➜ elixirconf time elixir lib/extra/jeg2.exs < ../words.txt > /dev/null
elixir lib/extra/jeg2.exs < ../words.txt > /dev/null 27.02s user 1.77s system 276% cpu 10.417 total
Using the +hms cheat brings it down to like 7 seconds.
I really like this solution. You don’t have to stitch together prefix+suffix, and memory usage is about 1GB on my machine. Speed is totally reasonable.