What's the best way to access and retrieve data from deeply nested Maps and Lists?

So I’ve been working on some projects that involve external services/APIs that provide data in XML and JSON. With some of these services I frequently run into the same problem: deeply nested Maps and Lists. When I know for a fact that a reply will keep the same shape over time, with no “randomness” to it, I can simply pattern match on it or use a few Map.get/2 and Enum.fetch!/2 calls and get the job done.

However, when the reply may change (maybe the order of the elements or how deeply nested they are) I still haven’t found a proper way to access that data in a good, safe and idiomatic fashion.

I’ve read this blog post by José Valim but it didn’t help this situation in particular.

Here’s an example of the type of problem I’m facing:

         "name" => "External NAT", "natIP" => "146.148.23.208",
         "type" => "ONE_TO_ONE_NAT"}]

Let’s say I want to access the "natIP" key. How can we access that data without chaining many Map.get/2 and List.first/1 or Enum.fetch!/2 calls? Especially when you’re not sure whether these data structures will always come in the same order or with the same size from an external service (meaning I’m not sure there is an actual way to pattern match them)?


There is no one “right” answer to this question. There will always be a tradeoff between a generic tree search and a search that uses specific knowledge of the data structure.
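To make the “generic tree search” end of that tradeoff concrete, here is a minimal sketch. The module and function names (DeepFind, find_all/2) are made up for illustration, and it assumes the data consists only of plain maps and lists, as a JSON decoder would produce. It collects every value stored under a given key, wherever it sits in the structure.

```elixir
defmodule DeepFind do
  # Hypothetical helper, not part of any library.
  # Assumes plain maps and lists (e.g. decoded JSON); structs are not handled.
  def find_all(map, key) when is_map(map) do
    Enum.flat_map(map, fn
      # The key matches: keep the value, and keep searching inside it too.
      {^key, value} -> [value | find_all(value, key)]
      # Some other key: only search inside the value.
      {_other, value} -> find_all(value, key)
    end)
  end

  def find_all(list, key) when is_list(list) do
    Enum.flat_map(list, &find_all(&1, key))
  end

  # Strings, numbers, tuples, etc. contain nothing to search.
  def find_all(_leaf, _key), do: []
end

input = [
  %{"config" => %{"accessConfigs" => [
    %{"kind" => "compute#accessConfig", "name" => "External NAT",
      "natIP" => "146.148.23.208", "type" => "ONE_TO_ONE_NAT"}
  ]}}
]

IO.inspect(DeepFind.find_all(input, "natIP"))
```

The price of not knowing the shape is that the whole structure gets visited and you get back a list of matches rather than a single value.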

One approach that I have been playing with is to turn this problem on its head: instead of extracting the data out of the structure and into a function, you take the function to the data.

I’ve written a general purpose library for dealing with deep data structures like this. It’s called phst_transform and it’s on hex.pm.

It builds a map of functions that apply to specific data structure types and uses protocols underneath to do a depth-first traversal of the entire data structure as a tree. One idea I’ve been playing with for extracting single data items from a deep tree like this would be to simply have a transform that sends the item as a message to another process. Something like this:

# pid is the process that should receive the value
potion = %{Map => fn m ->
  val = Map.get(m, "natIP")
  if val, do: send(pid, val)
  m
end}
PhStTransform.transform(data, potion)

It’s far from the most efficient way to get the value, but it does have the advantage of working with ANY data structure. I’m not sure PhStTransform is the last word in this kind of thinking, but I think there are a lot of possibilities in stepping back from the model of extract, manipulate and rebuild. If we start thinking about transforming the entire data structure or bringing the function to the data, many things that seem dauntingly complex become quite straightforward.

This kind of solution won’t work for every problem, but there is a lot you can do without actually embedding the knowledge of your entire data structure into your code. If you just know “somewhere in this blob is the Struct I care about”, you can just write the function for that struct.


This may not be a complete solution for you, but it should at least take you one step up in abstraction. Kernel.get_in/2 accesses deeply nested maps very cleanly. The docs explain how to use a function as a key, which is what you need when you reach a list and have to find a matching map within it.
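As a rough illustration of what that looks like on the sample data, here is a small sketch. It assumes the shape used elsewhere in this thread (a map whose "accessConfigs" key holds a list of maps) and relies on the accessor functions that ship with Elixir, Access.at/1 and Access.all/0, as the function keys.

```elixir
data = %{
  "accessConfigs" => [
    %{"kind" => "compute#accessConfig", "name" => "External NAT",
      "natIP" => "146.148.23.208", "type" => "ONE_TO_ONE_NAT"}
  ]
}

# Access.at(0) steps into the first element of the list.
first_ip = get_in(data, ["accessConfigs", Access.at(0), "natIP"])

# Access.all() visits every element and returns a list of results.
all_ips = get_in(data, ["accessConfigs", Access.all(), "natIP"])

IO.inspect(first_ip)
IO.inspect(all_ips)
```

Both accessors expect an actual list at that position, so this still encodes some knowledge of the shape; it just does so in one path instead of a chain of calls.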


This is an interesting problem. It won’t be possible to have a completely generic solution, but if we assume that you always get a list of maps, then the following code would help:

defmodule NestedMaps do

  def nested_map() do
    %{"accessConfigs" =>
      [%{"kind" => "compute#accessConfig",
         "name" => "External NAT",
         "natIP" => "146.148.23.208",
         "type" => "ONE_TO_ONE_NAT"}
      ]
     }
  end

  def nested_map2() do
    %{"accessConfigs" =>
      [
        [%{"kind" => "compute#accessConfig",
         "name" => "External NAT",
         "natIP" => ["146.148.23.208","127.0.0.1"],
         "type" => "ONE_TO_ONE_NAT"}
        ],
        [{:config1,"c"}]
      ]
     }
  end

  def get_nested_map(nm) do
    %{"accessConfigs" => nestedmaplist} = nm
    nestedmaplist
  end

  def get_nested_map_from_list(nm, nestedlevel) when nestedlevel < 1 do
    nm
  end

  def get_nested_map_from_list(nm, nestedlevel) do
    get_nested_map_from_list(List.first(nm),nestedlevel-1)
  end

  def get_nested_map_value(nm, val) do
    Map.get nm,val
  end

end

A line like

NestedMaps.nested_map()
    |> NestedMaps.get_nested_map()
    |> NestedMaps.get_nested_map_from_list(1)
    |> NestedMaps.get_nested_map_value("natIP")

would return "146.148.23.208" from your original map list.

NestedMaps.nested_map2()
    |> NestedMaps.get_nested_map()
    |> NestedMaps.get_nested_map_from_list(2)
    |> NestedMaps.get_nested_map_value("natIP")

would return ["146.148.23.208", "127.0.0.1"] from the example in the code.
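One caveat with the pipeline above: the match in get_nested_map raises if "accessConfigs" is missing. If the reply may vary, a `with` expression is one way to get a tagged result instead of a crash. This is only a sketch; SafeNat and fetch_nat_ip/1 are hypothetical names, and it assumes the single-level shape of nested_map.

```elixir
defmodule SafeNat do
  # Hypothetical helper: returns {:ok, ip} when the expected shape is present,
  # and :error for anything else (missing key, empty list, non-map input).
  def fetch_nat_ip(data) do
    with %{"accessConfigs" => [first | _rest]} <- data,
         %{"natIP" => ip} <- first do
      {:ok, ip}
    else
      _no_match -> :error
    end
  end
end

data = %{
  "accessConfigs" => [
    %{"kind" => "compute#accessConfig", "name" => "External NAT",
      "natIP" => "146.148.23.208", "type" => "ONE_TO_ONE_NAT"}
  ]
}

IO.inspect(SafeNat.fetch_nat_ip(data))
IO.inspect(SafeNat.fetch_nat_ip(%{"accessConfigs" => []}))
```

The caller can then pattern match on {:ok, ip} or :error and decide what a missing address means for them.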


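Just a quick refactoring.
The original get_nested_map_from_list is already tail recursive, because List.first(nm) is evaluated as an argument before the recursive call, which is the last expression. Binding it to a variable first just makes that easier to see:

def get_nested_map_from_list(nm, nestedlevel) do
    l = List.first(nm)
    get_nested_map_from_list(l, nestedlevel - 1)
end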

Here’s a gist with an example of how to use Kernel.get_in/2 with function keys to navigate the nested sample data. It’s probably not a complete solution, but might be a step in the right direction.
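For readers who can’t open the gist, here is a rough sketch of the general idea, not a copy of it. When get_in/2 meets a function in the key list, it calls it with :get, the data at that point, and a `next` function that continues down the path. The by_name helper and the second list entry are made up for illustration.

```elixir
data = %{
  "accessConfigs" => [
    # Hypothetical extra entry, added only to show the lookup skipping it.
    %{"name" => "Internal", "natIP" => "10.0.0.1"},
    %{"kind" => "compute#accessConfig", "name" => "External NAT",
      "natIP" => "146.148.23.208", "type" => "ONE_TO_ONE_NAT"}
  ]
}

# Hypothetical helper: builds a function key that picks the first map in a
# list whose "name" equals the given name, then continues with `next`.
by_name = fn name ->
  fn
    :get, list, next when is_list(list) ->
      list |> Enum.find(&(is_map(&1) and &1["name"] == name)) |> next.()

    # Not a list at this position: continue with nil instead of raising.
    :get, _other, next ->
      next.(nil)
  end
end

found = get_in(data, ["accessConfigs", by_name.("External NAT"), "natIP"])
missing = get_in(data, ["accessConfigs", by_name.("No such name"), "natIP"])

IO.inspect(found)
IO.inspect(missing)
```

This way the order of the entries in the list no longer matters, only the field you use to identify the one you want.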

defmodule Nested do
  # Walks the known shape: list -> "config" -> "accessConfigs" -> list -> "natIP".
  def get_inner_element(input) do
    input
    |> Enum.at(0)
    |> get_map(["config", "accessConfigs"])  # pass a longer key list for deeper nesting
    |> Enum.at(0)
    |> get_element_from_map("natIP")
  end

  # Follows the list of keys one level at a time.
  def get_map(map, [head | tail]) do
    map[head] |> get_map(tail)
  end

  def get_map(map, []), do: map

  def get_element_from_map(map, key) do
    map[key]
  end
end

input = [%{"config" => %{"accessConfigs" => [%{"kind" => "compute#accessConfig",
  "name" => "External NAT", "natIP" => "146.148.23.208",
  "type" => "ONE_TO_ONE_NAT"}]}}]

IO.inspect(Nested.get_inner_element(input))