Elixir language performance tuning for 1 quadrillion records per month

So… a couple of things. First, File.stream!/1 already splits the input into lines by default, but you then run Stream.map(&String.split(&1, "\n")) on each line again. That doesn't help — each element is already a single line, so the extra split just wraps it in a list and changes the shape of the stream.
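A tiny sketch of what that redundant step actually does (the input list here stands in for what File.stream!/1 would emit, since it keeps the trailing newline by default):

```elixir
# What File.stream!/1 would emit: one binary per line, newline included.
lines = ["a,1\n", "b,2\n"]

# Splitting each line on "\n" again doesn't give you anything new —
# it just turns each line into a two-element list.
redundant = Enum.map(lines, &String.split(&1, "\n"))
# => [["a,1", ""], ["b,2", ""]]
```

So the Stream.map step can simply be dropped (or replaced with a String.trim_trailing/2 if you want the newline gone).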

If you need to do a large number of updates on a Map, consider switching to ETS. It performs much better with large amounts of data: a :set table gives you constant-time access, and its memory lives outside the process heap, so it isn't garbage collected along with your process.
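A minimal sketch of what that looks like for integer accumulation (table and key names are made up; for Decimal values you'd need a lookup/insert pair instead, since :ets.update_counter/4 only works on integers):

```elixir
# Accumulate per-key totals in an ETS :set table instead of a Map.
table = :ets.new(:totals, [:set, :public])

update = fn key, amount ->
  # Atomically increments the counter; the {key, 0} tuple is inserted
  # first if the key doesn't exist yet.
  :ets.update_counter(table, key, amount, {key, 0})
end

update.("station_a", 3)
update.("station_a", 4)

[{_key, total}] = :ets.lookup(table, "station_a")
total
# => 7
```

Unlike a Map, the table is mutated in place, so you don't rebuild an ever-larger immutable structure on every update.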

There are some other things I’d try that might help too, like replacing this &((elem(&1, 0) <> "," <> Decimal.to_string(elem(&1, 1))) <> "\n") with fn {key, value} -> key <> "," <> Decimal.to_string(value) <> "\n" end — pattern matching on the tuple is clearer and avoids the elem/2 calls.

You’re also building a lot of strings, which on e.g. the JVM is automatically optimized. The BEAM doesn’t optimize that automatically, but it has the concept of “iolists”, which let you do that optimization manually.
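A short sketch of the iolist version of the formatting step above (the row data is made up): instead of concatenating binaries, you build a nested list of the pieces, and IO.write/2 or File.write/2 accept that directly, so the flat string never has to be materialized.

```elixir
rows = [{"a", "1.5"}, {"b", "2.0"}]

# Build a nested iolist instead of concatenating with <> —
# no intermediate binaries are copied.
iolist = Enum.map(rows, fn {key, value} -> [key, ",", value, "\n"] end)

# Only flatten to a binary if you actually need one:
IO.iodata_to_binary(iolist)
# => "a,1.5\nb,2.0\n"
```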

I’d suggest taking a look at a similar thread posted before that has a lot of tips: “Erlang/Elixir string performance - can this be improved?”

I also wrote an article about this, which isn’t completely up to date but collects most of the improvements from that thread: https://blog.jola.dev/elixir-string-processing-optimization. It has examples of e.g. using ETS instead of a Map and using iolists.