How to read text into a dictionary

I have some data like:



---
title1
line1
line2
---
title2
line1
line2

and I want a structure like:

%{title1: [title1/line1, title1/line2], title2: [title2/line1, title2/line2]}

Here is my code, but I cannot get the dict out of it:

defmodule Index do
    @spliter "---"
    def build_custom_index(index_path) do
        {:ok, contents} = File.read(index_path)
        contents \
        |> String.split(@spliter, trim: true)
        |> Stream.map( &(String.split(&1, "\n", trim: true)) )
        |> Stream.filter( &(Enum.count(&1) > 0) )
        |> Enum.each( fn group -> 
            [head | tail] = group
            tail \
            |> Enum.each( fn line -> 
                new_line = head <> "/" <> line
            end)
        end)
    end
end

This is actually my first Elixir program. I have searched through several resources but am still stuck.

I have used Python before.

Would anyone help… many thanks.

Hey @hscspring welcome! The main thing to remember with Elixir is that you should always be thinking about returning values. You can never mutate data, so if you want something different than you have now, you need to construct some kind of function that will return the changed data. Here’s an example from your code:

tail \
|> Enum.each( fn line -> 
  new_line = head <> "/" <> line
end)

Here, instead of doing each and trying to create some new variable outside the function, use

new_lines = Enum.map(tail, fn line -> head <> "/" <> line end)

Which takes the tail list and returns a new list where the function has been applied to each item.
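To make the difference concrete, here is a small sketch you can paste into iex. The values of head and tail are made up for illustration:

```elixir
head = "title1"
tail = ["line1", "line2"]

# Enum.each/2 runs the function only for its side effects and always returns :ok,
# so the strings built inside the function are thrown away.
each_result = Enum.each(tail, fn line -> head <> "/" <> line end)

# Enum.map/2 returns a new list holding the result of each call.
new_lines = Enum.map(tail, fn line -> head <> "/" <> line end)

IO.inspect(each_result) # prints :ok
IO.inspect(new_lines)   # prints ["title1/line1", "title1/line2"]
```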

Overall you need to do two things: iterate through the groups, and maintain a dictionary. I’m going to show you two different ways to do this.

The first way will be to build a recursive function that walks through the groups “manually”:

defmodule Index do
  @spliter "---"
  
  def index_from_file(index_path) do
    {:ok, contents} = File.read(index_path)
    build_index(contents)
  end

  def build_index(contents) do
    contents
    |> String.split(@spliter, trim: true)
    |> Stream.map(&String.split(&1, "\n", trim: true))
    |> Stream.filter(&(Enum.count(&1) > 0))
    |> index_groups(%{})
  end
  
  defp index_groups([], index) do
    index
  end

  defp index_groups([group | groups], index) do
    [title | items] = group
    items = Enum.map(items, fn item -> title <> "/" <> item end)
    index = Map.put(index, title, items)
    index_groups(groups, index)
  end
end

As a simple thing, note that I split the idea of reading from a file from building the index. This is a common pattern in Elixir, where you try to maximize the number of functions that are “pure” and don’t depend on external inputs or outputs.

The main thing though is the recursive index_groups function. It takes the list of groups as the first arg, and the dictionary as the second arg. If there are no groups, it just returns the index. If there is a group, it uses Map.put to return a new index containing the updated rows, and then recursively passes the remaining groups and the updated index to itself.

This pattern is so common that there’s the handy https://hexdocs.pm/elixir/Enum.html#reduce/3 function:

  def build_index(contents) do
    contents
    |> String.split(@spliter, trim: true)
    |> Stream.map(&String.split(&1, "\n", trim: true))
    |> Stream.filter(&(Enum.count(&1) > 0))
    |> Enum.reduce(%{}, fn group, index ->
      [title | items] = group
      items = Enum.map(items, fn item -> title <> "/" <> item end)
      Map.put(index, title, items)
    end)
  end

There are a bunch of other little changes that could be made to the code, but hopefully this helps introduce the core concepts around iterating through data and returning new data. Enum.each/2 is honestly used pretty rarely. It’s only useful when you want to perform some kind of side effect and you don’t care about keeping the result.
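Putting the reduce version together with the sample data from the question, a minimal end-to-end sketch could look like this. The module name IndexSketch and the inline contents string are assumptions for the example, and the stream steps are swapped for their eager Enum equivalents, which is fine for a small input:

```elixir
defmodule IndexSketch do
  # Hypothetical module name, used only for this example.
  @splitter "---"

  def build_index(contents) do
    contents
    |> String.split(@splitter, trim: true)
    |> Enum.map(&String.split(&1, "\n", trim: true))
    # Drop groups that contained only blank lines.
    |> Enum.reject(&(&1 == []))
    # Start from an empty map and add one title per group.
    |> Enum.reduce(%{}, fn [title | items], index ->
      Map.put(index, title, Enum.map(items, &(title <> "/" <> &1)))
    end)
  end
end

# Sample input shaped like the data in the question (an assumption about the file contents).
contents = "\n\n---\ntitle1\nline1\nline2\n---\ntitle2\nline1\nline2"
index = IndexSketch.build_index(contents)
IO.inspect(index)
```

Note that the keys of the resulting map are strings ("title1", "title2"), not atoms, since they come straight from the text.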


I haven’t tested it, but this should work. Just remember that content needs to be a list of lines.

def build(content), do: do_build(content, [])

defp do_build([], agg), do: Map.new(agg)
defp do_build(["---", title | rest], agg) do
  {entries, rest} = content(rest)

  do_build(rest, [{title, entries} | agg])
end

defp content(data, agg \\ [])
defp content([], agg), do: {Enum.reverse(agg), []}
defp content(["---" | _] = rest, agg), do: {Enum.reverse(agg), rest}
defp content([entry | rest], agg), do: content(rest, [entry | agg])
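
For reference, here is a sketch of how these functions might be called. The module name LineIndex and the sample string are assumptions, and build/1 calls do_build/2 with an empty accumulator. The input is split into lines first, and it must start with a "---" line or no clause will match:

```elixir
defmodule LineIndex do
  # Hypothetical module name wrapping the functions so they can be run.
  def build(content), do: do_build(content, [])

  defp do_build([], agg), do: Map.new(agg)

  defp do_build(["---", title | rest], agg) do
    # Collect every line up to the next "---" (or the end of the input).
    {entries, rest} = content(rest)
    do_build(rest, [{title, entries} | agg])
  end

  defp content(data, agg \\ [])
  defp content([], agg), do: {Enum.reverse(agg), []}
  defp content(["---" | _] = rest, agg), do: {Enum.reverse(agg), rest}
  defp content([entry | rest], agg), do: content(rest, [entry | agg])
end

lines =
  "---\ntitle1\nline1\nline2\n---\ntitle2\nline1\nline2"
  |> String.split("\n", trim: true)

index = LineIndex.build(lines)
IO.inspect(index)
```

As written, this version keeps the entries as they are and does not prefix them with the title.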

It’s really kind of you, I’ve gotten that. That’s quite interesting, thank you again.


Thanks a lot. I’m trying.


I’m sorry, i got an error like this:

** (FunctionClauseError) no function clause matching in Index.index_groups/2

    The following arguments were given to Index.index_groups/2:

        # 1
        #Stream<[enum: ["\n\n", "\ntitle1\nline1\nline2\n", "\ntitle2\nline1\nline2"], funs: [#Function<48.51129937/1 in Stream.map/2>, #Function<40.51129937/1 in Stream.filter/2>]]>

        # 2
        %{}

and I had to change the code to:

defmodule Index do
    @spliter "---"

    def index_from_file(index_path) do
        {:ok, contents} = File.read(index_path)
        build_custom_index(contents)
    end

    def build_custom_index(contents) do
        stream = contents
        |> String.split(@spliter, trim: true)
        |> Stream.map(&String.split(&1, "\n", trim: true))
        |> Stream.filter(&(Enum.count(&1) > 0))
        # |> index_groups(%{})
        # add this line
        index_groups(Enum.to_list(stream), %{})
    end

    defp index_groups([], index) do
        index
    end

    defp index_groups([group | groups], index) do
        [title | items] = group
        items = Enum.map(items, fn item -> title <> "/" <> item end)
        index = Map.put(index, title, items)
        index_groups(groups, index)
    end
end

Then the code works, but I still do not understand the error.
I found this: Elixir: (FunctionClauseError) no function clause matching - Stack Overflow, but I don’t think it is my problem.

I guess the problem comes from the Stream, but I can’t work out why.

Oh, by the way, the reduce approach worked well.


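Ah whoops, I didn’t notice you were using streams. The error happens because a Stream is a struct, not a list, so neither the [] clause nor the [group | groups] clause of index_groups/2 matches it. Your solution is correct; you can also keep the |> going by doing: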

|> Stream.filter(&(Enum.count(&1) > 0))
|> Enum.to_list()
|> index_groups(%{})
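
If it helps to see why the list clauses never matched, here is a tiny sketch with a made-up stream:

```elixir
stream = Stream.map(["a", "b"], &String.upcase/1)

# A stream is a lazy description of work, represented as a struct rather than a list,
# so function clauses that match [] or [head | tail] will not match it.
IO.inspect(is_list(stream)) # prints false

# Enum.to_list/1 runs the stream and produces a real list that can be pattern matched.
list = Enum.to_list(stream)
[first | rest] = list
IO.inspect({first, rest})   # prints {"A", ["B"]}
```

Enum.reduce/3 accepts any enumerable, streams included, which is why the reduce version worked without the extra step.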

As a tiny tweak:

|> Stream.filter(&(Enum.count(&1) > 0))

should be:

|> Stream.filter(& &1 != [])

This way you don’t need to walk the whole list to count its length just to check whether it is empty.

Oh, that’s so obvious, I didn’t even see it…

Many thanks to you.