Working with complex data structures (maps within lists within maps)

Hey all,

I’m working with a relatively complex data structure in an application I’m working on. A map that holds lists of maps, etc…

I’m having trouble finding the best way (or any way) of updating this data structure to fit my use case. I’ve simplified to what I think is the most basic case I’m having trouble with:

Pretend my data looks like this:

data = %{devices: [%{type: "laptop",
                     seen: 2},
                   %{type: "cellphone",
                     seen: 1}]}

I have a map that holds a list of devices. Each device has a type and the number of times its been seen.

Every time a user comes online, I’d like to increment the number of times I’ve seen the device they’re using, or add a new device map to the devices list if I’ve never seen their current device.

So for example, if a user logs on using their cellphone, I could do this to increment the number of times we’ve seen "cellphone":

with_type = fn type ->
  fn (:get_and_update, devices, next) ->
    devices
    |> Enum.map(fn
      device = %{type: ^type} -> next.(device)
      device                  -> {device, device}
    end)
    |> :lists.unzip
  end
end

update_in(data, [:devices, 
                 with_type.("cellphone"), 
                 :seen], fn
                   nil  -> 1
                   seen -> seen + 1
                 end)
> %{devices: [%{seen: 2, type: "laptop"}, %{seen: 2, type: "cellphone"}]}

Similarly, the same thing works for "laptop":

update_in(data, [:devices, 
                 with_type.("laptop"), 
                 :seen], fn
                   nil  -> 1
                   seen -> seen + 1
                 end)
> %{devices: [%{seen: 3, type: "laptop"}, %{seen: 1, type: "cellphone"}]}

I’m having trouble figuring out how to add a new device map for devices I haven’t seen before. Trying this with a new device, like "microwave" unsuprisingly doesn’t work as I want it to:

update_in(data, [:devices, 
                 with_type.("microwave"), 
                 :seen], fn
                   nil  -> 1
                   seen -> seen + 1
                 end)
> %{devices: [%{seen: 2, type: "laptop"}, %{seen: 1, type: "cellphone"}]}

This specific code isn’t working because my with_type function isn’t handling the case of not finding a device in the devices list. I can refactor it to handle that case:

with_type = fn type ->
  fn (:get_and_update, devices, next) ->
    devices
    |> Enum.map(fn
      device = %{type: ^type} -> {true, next.(device)}
      device                  -> {false, {device, device}}
    end)
    |> (fn
      devices -> found = devices
                         |> Enum.map(&(elem(&1, 0)))
                         |> Enum.any?
                 case found do
                   true  -> Enum.map(devices, &(elem(&1, 1)))
                   false -> Enum.map(devices, &(elem(&1, 1))) ++ [next.(%{type: type})]
                 end
    end).()
    |> :lists.unzip
  end
end

And everything works:

%{devices: [%{seen: 2, type: "laptop"}, %{seen: 1, type: "cellphone"}, %{seen: 1, type: "microwave"}]}

But this seems incredibly over-complicated and fragile. I feel like there’s got to be an easier way to handle this kind of thing. Any help would be hugely appreciated.

Thanks!

1 Like

It seems like it would be useful to have an explicit additional step before the updating that checks whether a device type exists in the list already and then inserts it if necessary.

Something like ensure_device_exists(data, device) which would return either an updated or unchanged data Map. Like this, everything would even pipe nicely into your existing code.

2 Likes

You can try with plain old recursion:

defmodule Devices do
  def new(), do:
    %{devices: []}

  def seen(state, device_type), do:
    update_in(state.devices, &inc_device_seen(&1, device_type))

  defp inc_device_seen([], device_type), do:
    [%{type: device_type, seen: 1}]
  defp inc_device_seen([%{type: device_type} = device | rest], device_type), do:
    [%{device | seen: device.seen + 1} | rest]
  defp inc_device_seen([other_device | rest], device_type), do:
    [other_device | inc_device_seen(rest, device_type)]
end

Taking it for a spin:

Devices.new()
|> Devices.seen("laptop")
|> Devices.seen("cellphone")
|> Devices.seen("laptop")
|> Devices.seen("microwave")

gives

%{devices: [%{seen: 2, type: "laptop"}, %{seen: 1, type: "cellphone"},
   %{seen: 1, type: "microwave"}]}

One note: the data structure you require (list of maps) is not efficient for this operation. Keeping a list of devices means O(N) complexity for the seen operation. If you don’t need to preserve the order of devices, consider using a single map where keys are types and values are counts. It will improve performance, reduce memory/GC pressure, and likely simplify the code.

7 Likes

Oh wow, that’s nice. I think I was definitely overthinking things by sticking with update_in.

I agree that things would be nicer if I could use maps all the way through. The issue is that the maps stored in lists are actually more complex objects (not just “devices”). Other peices of the app look up those objects using multiple fields (sometimes id, sometimes name, etc…).

To switch to an all-map solutions means that I’d have to maintain a mapping for every key I use to look into the structure. I feel like that added complexity isn’t worth the performance gain (yet).

Thanks!

FWIW,

After some serious soul searching and major refactoring, I switched to using a purely map-based data structure. It has ridiculously simplified most of my code (no more crazy custom accessor functions), and as you mentioned, improved performance.

The added complexity I was worrying about turned out not to be a real issue.

Thanks again for the advice!

4 Likes