What is the fastest way to implement a line counter?
Given a .txt file with a lot of words, the code should return the longest word/line in the file. I tried this in another language, and the speed really depends on the implementation. Any ideas how to solve that (fast) in Elixir?
There is an example of implementing a fast word-counting algorithm (and how to arrive there from a simpler non-concurrent implementation) in the documentation of Flow.
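For reference, the simpler non-concurrent starting point can be sketched with only the standard library. This is a minimal sketch, not the code from the Flow documentation: the module name `WordCountSketch` is made up for illustration, and it assumes words are separated by whitespace.

```elixir
defmodule WordCountSketch do
  # Hypothetical helper: counts word occurrences in any enumerable of lines,
  # for example the stream returned by File.stream!/1.
  def count(lines) do
    lines
    # String.split/1 splits on whitespace and drops empty strings.
    |> Stream.flat_map(&String.split/1)
    |> Enum.reduce(%{}, fn word, acc ->
      Map.update(acc, word, 1, &(&1 + 1))
    end)
  end
end
```

It could be called as `WordCountSketch.count(File.stream!("file.txt"))`. Everything runs in a single process here, which is the part Flow would parallelize.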
Yeah, I already tried that but ran into issues:
defmodule Counter do
  def count do
    File.stream!("file.txt")
    |> Flow.from_enumerable()
    |> Flow.flat_map(&String.split(&1, " "))
    |> Flow.partition()
    |> Flow.reduce(fn -> %{} end, fn word, acc ->
      Map.update(acc, word, 1, &(&1 + 1))
    end)
    |> Enum.to_list()
  end
end
When running it, I get:
** (Protocol.UndefinedError) protocol Enumerable not implemented for %Flow{operations: [{:reduce, #Function<1.46133320/0 in Lol.count/0>, #Function<2.46133320/2 in Lol.count/0>}], options: [stages: 12], producers: {:flows, [%Flow{operations: [{:mapper, :flat_map, [#Function<0.46133320/1 in Lol.count/0>]}], options: [stages: 12], producers: {:enumerables, [%File.Stream{line_or_bytes: :line, modes: [:raw, :read_ahead, :binary], path: "file.txt", raw: true}]}, window: %Flow.Window.Global{periodically: [], trigger: nil}}]}, window: %Flow.Window.Global{periodically: [], trigger: nil}} of type Flow (a struct). This protocol is implemented for the following type(s): Date.Range, File.Stream, Function, GenEvent.Stream, HashDict, HashSet, IO.Stream, List, Map, MapSet, Range, Stream
(elixir 1.13.3) lib/enum.ex:1: Enumerable.impl_for!/1
(elixir 1.13.3) lib/enum.ex:143: Enumerable.reduce/3
(elixir 1.13.3) lib/enum.ex:4144: Enum.reverse/1
(elixir 1.13.3) lib/enum.ex:3489: Enum.to_list/1
iex(1)>
It depends on what you mean by “fast”. If you want to be ultra fast, then Elixir will be a poor choice. For “fast enough”, Flow will be the best.
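For the longest word/line from the original question, a plain single-process pass is worth measuring first as a baseline before reaching for Flow. A minimal sketch, assuming whitespace-separated words and that length is counted in graphemes; the module and function names are made up for illustration.

```elixir
defmodule LongestSketch do
  # Hypothetical helper: returns the longest line (without its trailing
  # newline) from an enumerable of lines, or nil if there are no lines.
  def longest_line(lines) do
    lines
    |> Stream.map(&String.trim_trailing(&1, "\n"))
    |> Enum.max_by(&String.length/1, fn -> nil end)
  end

  # Hypothetical helper: returns the longest whitespace-separated word,
  # or nil if there are no words.
  def longest_word(lines) do
    lines
    |> Stream.flat_map(&String.split/1)
    |> Enum.max_by(&String.length/1, fn -> nil end)
  end
end
```

Both functions accept the stream from `File.stream!("file.txt")`, so the file is read line by line instead of being loaded into memory at once.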