Help improving Elixir word count

I tried String.split/3 and it was slightly slower than :binary.split/3. Since String.split/3 is a wrapper around multiple functions, I think the time is lost in the two pattern matches and guard calls done before delegating to :binary.split/3, but that's only a theory.
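For anyone who wants to check this on their own input, here is a rough sketch of how the two calls could be timed side by side. The module name SplitTiming and the compare/2 helper are made up for illustration, and a single :timer.tc run is noisy, so treat the numbers as a hint rather than a benchmark.

```elixir
defmodule SplitTiming do
  # Hypothetical helper: runs both split variants once on the same input
  # and reports the elapsed time of each in microseconds.
  def compare(text, pattern \\ " ") do
    # String.split/2 with a binary pattern splits on every occurrence.
    {string_us, string_parts} =
      :timer.tc(fn -> String.split(text, pattern) end)

    # :binary.split/3 needs the :global option to split on every occurrence.
    {binary_us, binary_parts} =
      :timer.tc(fn -> :binary.split(text, pattern, [:global]) end)

    %{
      string_us: string_us,
      binary_us: binary_us,
      # Sanity check that both calls produced the same list of parts.
      same_result: string_parts == binary_parts
    }
  end
end
```

Calling SplitTiming.compare(File.read!("some_file.txt")) a few times on a large file (the file name is a placeholder) should show whether the gap is consistent on your machine.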

And Haskell is on my list to do.


This is a very slow way to do IO

IO.stream(:stdio, :line)

Not sure if that's your problem, but in general reading data in large chunks and using :binary.* functions on the chunks is significantly faster than line streaming. As far as "multicore" goes, the Elixir runtime is multicore out of the box. So you're already breaking the rules by using Elixir.

I would at least try something like

IO.stream(:stdio, 320000)
|>

It does require some handling at the end of each buffer so that a word is not split across two chunks. The ETL challenge is a similar problem that might give you some ideas. I'm not sure it can be done in your case, but if you can stick to only the NIF-based functions in :binary, that will help with speed as well.

http://blog.dimroc.com/2015/05/07/etl-language-showdown-pt2/
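To make the buffer-boundary handling more concrete, here is a minimal sketch of one way it could look. It is not taken from the ETL post: the module name ChunkedWordCount is hypothetical, and it assumes words are separated only by spaces, tabs, and newlines. The idea is to treat the last piece of every chunk as possibly incomplete and carry it into the next chunk.

```elixir
defmodule ChunkedWordCount do
  # Assumption: these bytes are the only word separators.
  @separators [" ", "\n", "\t", "\r"]

  # Counts words in an enumerable of binary chunks.
  def count(chunks) do
    # Compile the pattern once so it is not rebuilt for every chunk.
    pattern = :binary.compile_pattern(@separators)

    {counts, leftover} =
      Enum.reduce(chunks, {%{}, ""}, fn chunk, {counts, carry} ->
        # Prepend the unfinished word carried over from the previous chunk.
        parts = :binary.split(carry <> chunk, pattern, [:global])
        # The last part may have been cut off by the chunk boundary,
        # so hold it back instead of counting it now.
        {complete, [tail]} = Enum.split(parts, -1)
        {tally(complete, counts), tail}
      end)

    # Whatever is left after the final chunk is a complete word (or empty).
    tally([leftover], counts)
  end

  # Adds each non-empty word to the running counts.
  defp tally(words, counts) do
    Enum.reduce(words, counts, fn
      "", acc -> acc
      word, acc -> Map.update(acc, word, 1, &(&1 + 1))
    end)
  end
end
```

With stdin it would be used roughly as IO.binstream(:stdio, 320_000) |> ChunkedWordCount.count(). I am assuming IO.binstream/2 here because it hands back raw bytes, which fits the :binary functions; whether that is the fastest option for your input is something you would have to measure.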

And you might find some ideas in here as well:
