List of attribute lists to a neat map

webuhu · February 21, 2022, 3:11pm

I wanted to ask for the most elegant solution for the following task.
I have a list of attribute lists:

attributes = [
  ["color", "head", "red"],
  ["color", "head", "green"],
  ["color", "body", "blue"],
  ["color", "body", "green"],
  ["size", "head", "eye", "green"],
  ["size", "head", "eye", "black"],
  ["size", "head", "hair", "red"],
  ["size", "head", "hair", "green"]
]

How to make them to a neat map?

%{
  "color" => %{
    "head" => ["red", "green"],
    "body" => ["blue", "green"]
  },
  "size" => %{
    "head" => %{
      "eye" => ["green", "black"],
      "hair" => ["red", "green"]
    }
  }
}

Sebb · February 21, 2022, 3:44pm

have a look at Enum.reduce (or wait until someone else solves it).

lud · February 21, 2022, 4:28pm

What did you try so far?

al2o3cr · February 21, 2022, 4:30pm

A good way to approach problems like this is to find simpler versions that you can solve, then figure out how to build on that solution to solve the whole thing.

In this case, the simplest subproblem is at the bottom: given a list of two-element lists, produce the corresponding map. In code:

# Given:
simple_attributes = [
  ["head", "red"],
  ["head", "green"],
  ["body", "blue"],
  ["body", "green"]
]
# Produce:
%{
  "head" => ["red", "green"]
  "body" => ["blue", "green"]
}

How would you describe this operation in words? I’d say something like “group the list by the first element in each sublist”, and indeed Enum.group_by is handy:

Enum.group_by(
  simple_attributes,
  fn [first, _] -> first end
  fn [_, second] -> second end
)

This works for simple_attributes, but it fails for the more complicated list because [_, second] doesn’t match a three-element row like ["color", "head", "red"].

Another way to think of “the second element in a two-element list” is “the TAIL of the list”, and this gets us farther with attributes:

Enum.group_by(
  attributes,
  &hd/1,
  &tl/1
)
# returns
%{
  "color" => [
    ["head", "red"],
    ["head", "green"],
    ["body", "blue"],
    ["body", "green"]
  ],
  "size" => [
    ["head", "eye", "green"],
    ["head", "eye", "black"],
    ["head", "hair", "red"],
    ["head", "hair", "green"]
  ]
}

There’s a promising sign here: the value corresponding to "color" looks JUST LIKE our original simple_attributes. Recursion is afoot!

The boundary conditions are tricky to get right (bugs will usually result in either [[too much wrapping]] or trying to call hd("green")) but this should get you started.

webuhu · February 21, 2022, 5:03pm

Thanks a lot Matt.
You brought the difficulties exactly to the table.

That’s exactly the path I was already going.
But today as I’ve got difficulties with doing the recursion the right way - I got sucked.

I’m still not done and think I’ll need a fresh head for it tomorrow.
I appreciate your bringing me back to the simple steps very much.

Sebb · February 21, 2022, 7:29pm

You should rather go down the path Matt proposed, but since I had some similar problem that needed to put values deep in a nested map I adjusted it for your task to show an alternative way.

defmodule Webuhu do
  def deep_append(map, [key, value], path) do
    update_in(map, path ++ [key], fn
      nil -> [value]
      current when is_list(current) -> current ++ [value]
    end)
  end

  def deep_append(map, [h | t], path) do
    update_in(map, path ++ [h], fn
      nil -> %{}
      current -> current
    end)
    |> deep_append(t, path ++ [h])
  end

  def transform(attr) do
    Enum.reduce(attr, %{}, fn path, acc -> deep_append(acc, path, []) end)
  end
end

this is slow.

webuhu · February 22, 2022, 2:47pm

Thank you everybody for your help.
I came up with the following solution:

ExUnit.start()

defmodule NormalizeTest do
  use ExUnit.Case

  def normalize(attributes) do
    group = group(attributes)

    Enum.reduce(group, %{}, fn {key, value}, acc ->
      if value !== [[]],
        do: put_in(acc, [key], normalize(value)),
        else: List.flatten(attributes)
    end)
  end

  defp group(list) do
    Enum.group_by(list, &hd(&1), &tl(&1))
  end

  test "normalize attributes" do
    attributes = [
      ["color", "head", "red"],
      ["color", "head", "green"],
      ["color", "body", "blue"],
      ["color", "body", "green"],
      ["size", "head", "eye", "green"],
      ["size", "head", "eye", "black"],
      ["size", "head", "hair", "red"],
      ["size", "head", "hair", "green"]
    ]

    normalized_attributes = normalize(attributes)

    assert normalized_attributes === %{
             "color" => %{
               "head" => ["red", "green"],
               "body" => ["blue", "green"]
             },
             "size" => %{
               "head" => %{
                 "eye" => ["green", "black"],
                 "hair" => ["red", "green"]
               }
             }
           }
  end
end

If anything could be more elegant - I’m up for hints.
All the best. Chris

dimitarvp · February 22, 2022, 4:14pm

Nice job.

Sebb · February 22, 2022, 5:23pm

very nice solution. Maybe a little mind bending for people not accustomed to recursion. Took me some time to understand.

If anything could be more elegant

why use put_in when Map.put suffices?
the group/1 function is not needed.
I like the stop condition to come first, which also gets rid of != in favour of ==

def normalize(attributes) do
  attributes
  |> Enum.group_by(&hd(&1), &tl(&1))
  |> Enum.reduce(%{}, fn {key, value}, acc ->
    if value == [[]],
      do: List.flatten(attributes),
      else: Map.put(acc, key, normalize(value))
  end)
end