Streams - why do I need a helper function and cannot process stream into enum directly?

Hi,

I am learning Elixir through self-study. At the moment I am working through Elixir in Action chapter 3 stream exercises. I have a couple of questions I hope someone can answer:

I am trying to iterate through each line of a file and produce a list of numbers each representing a number of chars in that line. Easy enough.

I wrote the following function:

def lines_lengths!(path) do
  path
  |> File.stream!()
  |> Stream.map(&String.replace(&1, "\n", ""))
  |> Enum.map(&String.length/1)
end

This fails my doctest with the following message:

  1. doctest Exercises.lines_lengths!/1 (1) (ExercisesTest)
    test/exercises_test.exs:3
    Doctest failed
    doctest:
    iex> Exercises.lines_lengths!("/Users/nixlim/Sync/PROJECTS/elixir_learning/elixir_in_action/exercises/sample_text.txt")
    [125, 108, 104, 110]
    code: Exercises.lines_lengths!(
    "/Users/nixlim/Sync/PROJECTS/elixir_learning/elixir_in_action/exercises/sample_text.txt"
    ) === [125, 108, 104, 110]
    left: '|lgn'
    right: '}lhn'
    stacktrace:
    lib/exercises.ex:77: Exercises (module)

This is the function from the author’s solution:

defp filtered_lines!(path) do
  path
  |> File.stream!()
  |> Stream.map(&String.replace(&1, "\n", ""))
end

def lines_lengths!(path) do
  path
  |> filtered_lines!()
  |> Enum.map(&String.length/1)
end

This works as intended.

So, my questions are:

  1. Why does the original function produce this weird output:
    left: '|lgn'
    right: '}lhn'

  2. What am I misunderstanding here, i.e. why is the filtered_lines! function required? My thinking is that all it does (apart from the logic of removing line breaks) is output a stream, which is then passed to Enum.map. So why does that not work if I put everything into a single function?

What you’re seeing is a list of integers being printed as a charlist; see the List module documentation (Elixir v1.5.1).

OK, that is interesting. I am not sure why it produces a charlist, so I will read the docs, thank you.

So, I decided to do a little print-everywhere debugging and did this:

def lines_lengths!(path) do
  stream =
    path
    |> File.stream!()
    |> Stream.map(&String.replace(&1, "\n", ""))
    |> Stream.map(&String.length/1)

  Enum.map(stream, fn line -> IO.inspect(line) end)
end

This produces the following:
…124
108
103
110

Why are there dots before the 124? Clearly, the function actually performs as expected but something else happens that changes the output.

No charlist is explicitly produced. A charlist is just a list of integers, so when every integer in a list is a printable character code, the list is printed as a charlist. You can find the details about the distinction in the docs.
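
As a small sketch of that point, using the values from the failing doctest: the list and the charlist are the same term, and only the printing differs. This assumes Elixir 1.6 or later for List.ascii_printable?/1. How a charlist is rendered by default depends on the Elixir version, so only the forced numeric output is shown.

```elixir
lengths = [124, 108, 103, 110]

# Every element is a printable ASCII code point, which is why the
# list gets rendered as text instead of as numbers.
IO.inspect(List.ascii_printable?(lengths))
# prints true

# Character literals are plain integers, so this is the same list.
IO.inspect([?|, ?l, ?g, ?n] == lengths)
# prints true

# The :charlists option forces the numeric representation.
IO.inspect(lengths, charlists: :as_lists)
# prints [124, 108, 103, 110]
```

Passing `charlists: :as_lists` to IO.inspect/2 is handy while debugging lists of small integers such as line lengths.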

I think you are running mix test. The (green?) dots are tests that passed.

If you do not want to run the code inside a test, you can use mix run your-file.exs.

Awesome! Thank you for the dots clarification and how to read the tests results.

The only thing that remains for me to understand is the function question. Why does piping the stream directly into an Enum function not work? Why is a helper function (in this case filtered_lines!) required?

Also, would anyone recommend a quality resource for learning how to debug in Elixir?

To debug the BEAM…

https://www.erlang-in-anger.com/

  • '|lgn' is [124, 108, 103, 110]
  • '}lhn' is [125, 108, 104, 110]

Your code does not work because, somehow, lines 1 and 3 of your file have fewer characters than you expect in your doctest.

If I were you, I would not rely on an existing file in the test. Maybe something like this would be better:

defmodule Exercises do
  @doc ~S"""
      iex> File.write!("/tmp/excerices-test", "aaaa\nbbb\ncc\nd")
      iex> Exercises.lines_lengths!("/tmp/excerices-test")
      [4, 3, 2, 1]
  """
  def lines_lengths!(path) do
    path
    |> filtered_lines!()
    |> Enum.map(&String.length/1)
  end

  defp filtered_lines!(path) do
    path
    |> File.stream!()
    |> Stream.map(&String.replace(&1, "\n", ""))
  end
end

But that requires care with the syntax: the docstring has to use ~S""" so that the \n escapes reach the doctest unchanged. So it would be better as a regular test instead of a doctest.
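
To illustrate why the ~S matters, here is a minimal sketch of the difference between a regular string and the ~S sigil. In a regular docstring the \n would already be turned into a real line break inside the doc text, before the doctest ever sees it.

```elixir
# In a regular string, "\n" is a single newline character.
plain = "a\nb"

# With ~S, escapes are left untouched: a backslash followed by "n".
raw = ~S"a\nb"

IO.inspect(String.length(plain))
# prints 3

IO.inspect(String.length(raw))
# prints 4

IO.inspect(raw == "a\\nb")
# prints true
```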

I don’t know exactly why it does not work for you, but using a single function works exactly the same for me:

defmodule Exercises do
  @doc ~S"""
      iex> File.write!("/tmp/excerices-test", "aaaa\nbbb\ncc\nd")
      iex> Exercises.lines_lengths!("/tmp/excerices-test")
      [4, 3, 2, 1]
  """
  def lines_lengths!(path) do
    path
    |> File.stream!()
    |> Stream.map(&String.replace(&1, "\n", ""))
    |> Enum.map(&String.length/1)
  end
end

Now, what you can do if you really like doctests is to separate the code that works with a file from the code that counts line lengths:

defmodule Exercises do
  @doc ~S"""
      iex> File.write!("/tmp/excerices-test", "aaaa\nbbb\ncc\nd")
      iex> Exercises.file_lines_lengths!("/tmp/excerices-test")
      [4, 3, 2, 1]
  """
  def file_lines_lengths!(path) do
    path
    |> File.stream!()
    |> lines_lengths()
  end

  @doc """
      iex> Exercises.lines_lengths(["aaaa", "bbb", "cc", "d"])
      [4, 3, 2, 1]
  """
  def lines_lengths(stream_of_lines) do
    stream_of_lines
    |> Stream.map(&String.replace(&1, "\n", ""))
    |> Enum.map(&String.length/1)
  end
end

But I would still move tests that have side effects (interacting with the file system) into a regular test.
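
A regular test for the file-based function could look roughly like the sketch below. The test module name and the temporary file name are made up for illustration, and Exercises is repeated here only so the script runs on its own with elixir. In a Mix project, ExUnit.start() normally lives in test/test_helper.exs and Exercises in lib/.

```elixir
ExUnit.start()

defmodule Exercises do
  def lines_lengths!(path) do
    path
    |> File.stream!()
    |> Stream.map(&String.replace(&1, "\n", ""))
    |> Enum.map(&String.length/1)
  end
end

defmodule ExercisesFileTest do
  use ExUnit.Case

  test "lines_lengths!/1 returns the length of each line" do
    # Hypothetical file name; the unique integer avoids clashes between runs.
    name = "exercises-#{System.unique_integer([:positive])}.txt"
    path = Path.join(System.tmp_dir!(), name)
    File.write!(path, "aaaa\nbbb\ncc\nd")

    # Remove the file once the test finishes, whether it passes or fails.
    on_exit(fn -> File.rm(path) end)

    assert Exercises.lines_lengths!(path) == [4, 3, 2, 1]
  end
end
```

The test creates its own input, so it no longer depends on a file under a particular user's home directory.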

Now for your exact error: since you get different lengths than you expect, maybe the file contains some whitespace that you don’t know about, for instance \r\n line endings. You may want to use String.trim/1 somewhere in your code, probably instead of String.replace/3, because it strips all leading and trailing whitespace, which covers every kind of line ending as well as trailing spaces:

iex(1)> String.trim("\r\nhello\r\n")
"hello"
iex(2)> String.trim("\nhello\n")    
"hello"
iex(3)> String.trim("\rhello\r")
"hello"
iex(4)> String.trim("\nhello\n     ")
"hello"
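
As a rough sketch of how the choice affects the counted length, assuming a line that ends in \r\n (whether your file really does is something only you can check): String.trim_trailing/1 is shown as an alternative that keeps leading indentation in the count.

```elixir
line = "hello\r\n"

# Removing only "\n" leaves the carriage return, which is still counted.
line |> String.replace("\n", "") |> String.length() |> IO.inspect()
# prints 6

# String.trim/1 strips whitespace from both ends of the string.
line |> String.trim() |> String.length() |> IO.inspect()
# prints 5

# String.trim_trailing/1 only strips the end, so leading spaces still count.
"  hello\r\n" |> String.trim_trailing() |> String.length() |> IO.inspect()
# prints 7
```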

This is awesome, thank you very much for your explanation!
