Best way to implement line counter?

What is the fastest way to implement a line counter?
Given a .txt file with a lot of words, the code should return the longest word/line in the file. I tried this in another language, and the speed really depends on the implementation. Any ideas on how to solve this (fast) in Elixir?
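
To make the task concrete, here is a minimal sequential baseline, offered only as a sketch. The module name LongestSketch and both function names are hypothetical, and it assumes lines end in "\n" and that a "word" is anything separated by whitespace.

```elixir
defmodule LongestSketch do
  # Hypothetical helper module; names are illustrative only.

  # Returns the longest line (without its trailing "\n"), or nil for an empty file.
  def longest_line(path) do
    path
    |> File.stream!()
    |> Stream.map(&String.trim_trailing(&1, "\n"))
    |> Enum.max_by(&String.length/1, fn -> nil end)
  end

  # Returns the longest whitespace-separated word, or nil if there are no words.
  def longest_word(path) do
    path
    |> File.stream!()
    |> Stream.flat_map(&String.split/1)
    |> Enum.max_by(&String.length/1, fn -> nil end)
  end
end
```

Because File.stream! reads the file lazily line by line, this does not load the whole file into memory, which makes it a reasonable starting point to measure against before trying anything concurrent.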

There is an example of implementing a fast word-counting algorithm (and how to arrive there from a simpler non-concurrent implementation) in the documentation of Flow.


Yeah, I already tried that, but ran into issues:

defmodule Counter do
  def count do
    File.stream!("file.txt")
    |> Flow.from_enumerable()
    |> Flow.flat_map(&String.split(&1, " "))
    |> Flow.partition()
    |> Flow.reduce(fn -> %{} end, fn word, acc ->
      Map.update(acc, word, 1, &(&1 + 1))
    end)
    |> Enum.to_list()
  end
end

When running it, I get:

** (Protocol.UndefinedError) protocol Enumerable not implemented for %Flow{operations: [{:reduce, #Function<1.46133320/0 in Lol.count/0>, #Function<2.46133320/2 in Lol.count/0>}], options: [stages: 12], producers: {:flows, [%Flow{operations: [{:mapper, :flat_map, [#Function<0.46133320/1 in Lol.count/0>]}], options: [stages: 12], producers: {:enumerables, [%File.Stream{line_or_bytes: :line, modes: [:raw, :read_ahead, :binary], path: "file.txt", raw: true}]}, window: %Flow.Window.Global{periodically: [], trigger: nil}}]}, window: %Flow.Window.Global{periodically: [], trigger: nil}} of type Flow (a struct). This protocol is implemented for the following type(s): Date.Range, File.Stream, Function, GenEvent.Stream, HashDict, HashSet, IO.Stream, List, Map, MapSet, Range, Stream 
    (elixir 1.13.3) lib/enum.ex:1: Enumerable.impl_for!/1
    (elixir 1.13.3) lib/enum.ex:143: Enumerable.reduce/3
    (elixir 1.13.3) lib/enum.ex:4144: Enum.reverse/1
    (elixir 1.13.3) lib/enum.ex:3489: Enum.to_list/1
iex(1)>

It depends on what you mean by “fast”. If you need it to be ultra fast, then Elixir will be a poor choice. For “fast enough”, Flow will be the best option.

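
Whether something is “fast enough” is easiest to judge by measuring. The sketch below is not Flow: it is a standard-library alternative that splits the file into chunks of lines and scans them concurrently with Task.async_stream. The module name ChunkedLongest and the default chunk size of 5_000 lines are assumptions made for illustration, not tuned values.

```elixir
defmodule ChunkedLongest do
  # Hypothetical module; the chunk size default is an arbitrary assumption.

  # Returns the longest whitespace-separated word, or nil if there are none.
  def longest_word(path, chunk_size \\ 5_000) do
    path
    |> File.stream!()
    |> Stream.chunk_every(chunk_size)
    # Each chunk of lines is scanned in its own task; result order does not matter.
    |> Task.async_stream(&longest_in_lines/1, ordered: false)
    |> Enum.reduce(nil, fn {:ok, word}, best -> longer(word, best) end)
  end

  # Longest word within one chunk of lines (nil if the chunk has no words).
  defp longest_in_lines(lines) do
    lines
    |> Enum.flat_map(&String.split/1)
    |> Enum.reduce(nil, &longer/2)
  end

  # Keeps whichever of the two candidates is strictly longer.
  defp longer(nil, best), do: best
  defp longer(word, nil), do: word

  defp longer(word, best) do
    if String.length(word) > String.length(best), do: word, else: best
  end
end

# Timing a run with :timer.tc/1, which returns {microseconds, result}:
# {micros, word} = :timer.tc(fn -> ChunkedLongest.longest_word("file.txt") end)
```

With ordered: false, which of several equally long words is returned may vary between runs. Timing this against a plain sequential pass on the same file shows whether the concurrency pays for its overhead on your data.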