Help improving Elixir word count

I tried String.split/3 and it was a little slower than :binary.split/3. Since String.split/3 is a wrapper around multiple functions, I think the time is lost in the two pattern matches and guard checks done before delegating to :binary.split/3, but that's only a theory.
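One rough way to check that theory is to time both calls on the same input. This is only a sketch: the sample line and the iteration count are made up, the `time` helper is a hypothetical name, and :timer.tc/1 is a crude measure compared to a real benchmarking tool.

```elixir
# Rough micro-benchmark sketch; the input and iteration count are arbitrary assumptions.
line = String.duplicate("lorem ipsum dolor sit amet ", 1_000)
pattern = :binary.compile_pattern(" ")

# Hypothetical helper: runs `fun` 1_000 times and returns elapsed microseconds.
time = fn fun ->
  {micros, _result} = :timer.tc(fn -> Enum.each(1..1_000, fn _ -> fun.() end) end)
  micros
end

string_us = time.(fn -> String.split(line, " ", trim: true) end)
binary_us = time.(fn -> :binary.split(line, pattern, [:global, :trim_all]) end)

IO.puts("String.split/3:  #{string_us} us")
IO.puts(":binary.split/3: #{binary_us} us")
```

Both calls should return the same list of words here, so any difference is overhead rather than extra work. Numbers will vary a lot between runs and machines.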

Haskell is on my to-do list as well.

This is a very slow way to do IO:

IO.stream(:stdio, :line)

Not sure if that’s your problem, but in general, reading data in large chunks and using :binary.* functions on the chunks is significantly faster than line streaming. As far as “multicore” goes, the Elixir runtime is multicore out of the box, so you’re already breaking the rules by using Elixir.

I would at least try something like

IO.stream(:stdio, 320000)
|>

It does require some handling at the end of each buffer, so that a word cut in half by a buffer boundary is not counted as two words. The ETL challenge is a similar problem that might give you some ideas. I’m not sure it can be done in your case, but if you can stick to only the natively implemented functions in :binary, that will help with speed as well.

http://blog.dimroc.com/2015/05/07/etl-language-showdown-pt2/
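The buffer-boundary handling mentioned above could look roughly like this. It is a minimal sketch, not a tuned solution: the module name `ChunkedWordCount`, the separator list, and the choice to count into a map are all assumptions, and it expects the chunks to be raw binaries.

```elixir
defmodule ChunkedWordCount do
  # Hypothetical module; sketches carrying a partial word across chunk boundaries.
  @separators [" ", "\n", "\t", "\r"]

  # Counts words in an enumerable of binaries, e.g. fixed-size chunks read from stdin.
  def count(chunks) do
    pattern = :binary.compile_pattern(@separators)

    {counts, rest} =
      Enum.reduce(chunks, {%{}, ""}, fn chunk, {counts, carry} ->
        parts = :binary.split(carry <> chunk, pattern, [:global])
        # The last part may be a word cut off by the chunk boundary, so hold it back.
        {complete, [tail]} = Enum.split(parts, -1)
        {tally(complete, counts), tail}
      end)

    # Whatever is left after the final chunk is a complete word (or empty).
    tally([rest], counts)
  end

  # Adds each non-empty word to the running counts.
  defp tally(words, counts) do
    Enum.reduce(words, counts, fn
      "", acc -> acc
      word, acc -> Map.update(acc, word, 1, &(&1 + 1))
    end)
  end
end
```

For example, `ChunkedWordCount.count(["hello wor", "ld hello\n"])` counts "world" once even though it is split across two chunks. To feed it from standard input, something like `IO.binstream(:stdio, 320_000) |> ChunkedWordCount.count()` should work; note that this swaps in IO.binstream/2 (byte-sized chunks) for the IO.stream/2 call shown earlier.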

And you might find some ideas in here as well:
