How to use update_in on multiple map keys at once? (re: Access.all)

marick · June 3, 2024, 5:41pm

Suppose I have a Network struct that contains a map from names to Clusters, and a Cluster contains a MapSet of “downstream” names:
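The definitions might look like the minimal sketch below. Only the `name_to_cluster` and `downstream` fields come from the description above; the module layout, the defaults, and the sample cluster names `:a` and `:b` are assumptions.

```elixir
defmodule Cluster do
  # :downstream holds the names of the clusters fed by this one.
  defstruct downstream: MapSet.new()
end

defmodule Network do
  # :name_to_cluster maps a cluster name (an atom here, by assumption) to its Cluster.
  defstruct name_to_cluster: %{}
end

# A small sample network with two clusters and no downstream links yet.
network = %Network{
  name_to_cluster: %{
    a: %Cluster{},
    b: %Cluster{}
  }
}
```

Structs do not implement the Access behaviour, which is why the paths in this thread wrap struct fields in `Access.key/1` while the plain map in the middle can be indexed with a bare key.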

I can easily use update_in to work with one of the clusters:

    path = [Access.key(:name_to_cluster), :b, Access.key(:downstream)]
    update_in(network, path, &MapSet.put(&1, :cluster_name))

But what if I want to add a downstream to N clusters? I was suspecting that Access.all would “explode” the map into a list of {key, value} tuples, like what for {k,v} <- a_map ... does. But alas:

** (RuntimeError) Access.all/0 expected a list

So, to use Access, do I have to reduce over the N names?

    clusters_to_update = [:gate, :watcher]
    Enum.reduce(clusters_to_update, network, fn elt, acc ->
      path = [Access.key(:name_to_cluster), elt, Access.key(:downstream)]
      update_in(acc, path, &MapSet.put(&1, :cluster_name))
      # or...
      # update_in(acc.name_to_cluster[elt].downstream, &MapSet.put(&1, :cluster_name))
    end)

That’s better than having to assemble each level on the way “out” of the descent (though my 1981 self, programming in C on a computer with 64K of memory, screams at the N^2 space cost), but is it the best I can do, short of a Lens package?

If so, I’m curious why Access.all is restricted to List, rather than open to all Enums? (1981 Brian cries, “do you know what indirecting through a function pointer costs?”)

P.S. I’m also curious why my posts are always marked private? I don’t see anything in the compose window to change that, I never consciously chose it, I don’t see anything in my preferences, and I can’t find anything in the Discourse documentation to explain it.

03juan · June 3, 2024, 7:53pm

In my experience I’ve run into the same frustrations and reached the same conclusion: the Access module is a nice utility to use occasionally, but it is definitely not the most performant and it is pretty cumbersome for nested maps. for comprehensions don’t fare much better; one way or another you have to reduce over some accumulator.

The advantage of the Access module is that you could write your own “all” function that iterates over a map, but if you value memory and cycles why not try some recursion?

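    # These belong inside a module (defp cannot be used at the top level).
    def add_downstream_cluster_to_many(network, clusters, add_name) do
      Map.update!(network, :name_to_cluster, &do_add(&1, clusters, add_name))
    end

    # Walks the list of cluster names, updating the name-to-cluster map as it goes.
    defp do_add(name_to_cluster, [], _), do: name_to_cluster
    defp do_add(name_to_cluster, [cluster_name | rest], add_name) do
      name_to_cluster
      |> Map.update!(cluster_name, &do_add_downstream(&1, add_name))
      |> do_add(rest, add_name)
    end

    # Adds one name to a single cluster's downstream set.
    defp do_add_downstream(cluster, add_name) do
      Map.update!(cluster, :downstream, &MapSet.put(&1, add_name))
    end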

Or something like that. A bit of a deconstructed reduce but no excessive redirection or tearing apart and rebuilding maps from lists of k/v tuples.
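The "write your own all" idea mentioned above might be sketched as follows. `MapAccess` and `all_values/0` are hypothetical names, and plain maps stand in for the Network and Cluster structs; the function follows the three-argument accessor contract (`:get` or `:get_and_update`, the data, and a `next` function) that `get_in/2` and `update_in/3` accept as a path element.

```elixir
defmodule MapAccess do
  # Hypothetical accessor: visits every value of a map, the way
  # Access.all/0 visits every element of a list.
  def all_values do
    fn
      :get, map, next when is_map(map) ->
        Enum.map(map, fn {_key, value} -> next.(value) end)

      :get_and_update, map, next when is_map(map) ->
        {gets, updated} =
          Enum.reduce(map, {[], map}, fn {key, value}, {gets, acc} ->
            case next.(value) do
              {get, new_value} -> {[get | gets], Map.put(acc, key, new_value)}
              :pop -> {[value | gets], Map.delete(acc, key)}
            end
          end)

        {Enum.reverse(gets), updated}
    end
  end
end

# Plain maps used here as stand-ins (an assumption, to keep the sketch short).
network = %{
  name_to_cluster: %{
    a: %{downstream: MapSet.new()},
    b: %{downstream: MapSet.new()}
  }
}

path = [Access.key(:name_to_cluster), MapAccess.all_values(), Access.key(:downstream)]
updated = update_in(network, path, &MapSet.put(&1, :cluster_name))
```

This still builds an intermediate list of "get" results on every update, so it trades some allocation for the convenience of a declarative path.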

marick · June 7, 2024, 3:58pm

I’m fully in favor of optimizing measured bottlenecks with careful code, but that’s not how I prefer to write code until I have to.

P.S. Lens is actually a way to write functions missing from Access. You can write things like:

    get_in(container, [Lens.map_values, :b])

… although differences of opinion about who should handle missing values means that there are subtleties when combining lenses with literal keys like :b. (Lenses put it inside the lens code, but Access puts it in the glue code between the list items.)

03juan · June 11, 2024, 8:03am

I get ya. To paraphrase people much smarter than I: “Make it work, then make it pretty, then make it fast if needed.”

My point was more about the fact that both the Access module and the for comprehension call a lot of functions under the hood, to the extent that I noticed a huge degradation performing them as part of a large-ish ETL loop, whereas it’s fairly trivial to do the same work with some good old-fashioned recursion, performantly.

josevalim · June 11, 2024, 8:09am

A comprehension should be fairly optimized: it is a single Enum.reduce/3 call, so if you can report performance issues, we would gladly investigate them.

Access does traverse and invoke functions at runtime, but the macro versions (i.e. update_in(acc.name_to_cluster[elt].downstream, &MapSet.put(&1, :cluster_name))) should be fairly optimizable, because we can inline them at compile-time.
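As an illustration of the two points above, the reduce from the first post can be written as a comprehension with the `reduce:` option around the macro form of `update_in`. This is only a sketch: plain maps stand in for the Network and Cluster structs, and the `:other` cluster is an assumption added to show that untouched clusters stay as they are.

```elixir
# Plain maps used as stand-ins for the structs (an assumption).
network = %{
  name_to_cluster: %{
    gate: %{downstream: MapSet.new()},
    watcher: %{downstream: MapSet.new()},
    other: %{downstream: MapSet.new()}
  }
}

clusters_to_update = [:gate, :watcher]

updated =
  for name <- clusters_to_update, reduce: network do
    acc ->
      # Macro form: the path is known at compile time except for `name`.
      update_in(acc.name_to_cluster[name].downstream, &MapSet.put(&1, :cluster_name))
  end
```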

@marick for your case in particular, you would indeed need to define your own traversals. Perhaps an Access.keys could be added as well.
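A custom traversal over a chosen set of keys might look like the sketch below. `ClusterAccess.keys/1` is a hypothetical helper, not the API being proposed here, and raising on a missing key via `Map.fetch!/2` is one design choice among several. Plain maps again stand in for the structs.

```elixir
defmodule ClusterAccess do
  # Hypothetical accessor: visits only the values stored under `wanted` keys.
  # A missing key raises KeyError, because Map.fetch!/2 is used.
  def keys(wanted) when is_list(wanted) do
    fn
      :get, map, next ->
        Enum.map(wanted, fn key -> next.(Map.fetch!(map, key)) end)

      :get_and_update, map, next ->
        {gets, updated} =
          Enum.reduce(wanted, {[], map}, fn key, {gets, acc} ->
            value = Map.fetch!(acc, key)

            case next.(value) do
              {get, new_value} -> {[get | gets], Map.put(acc, key, new_value)}
              :pop -> {[value | gets], Map.delete(acc, key)}
            end
          end)

        {Enum.reverse(gets), updated}
    end
  end
end

# Plain maps used as stand-ins for the structs (an assumption).
network = %{
  name_to_cluster: %{
    gate: %{downstream: MapSet.new()},
    watcher: %{downstream: MapSet.new()},
    other: %{downstream: MapSet.new()}
  }
}

path = [
  Access.key(:name_to_cluster),
  ClusterAccess.keys([:gate, :watcher]),
  Access.key(:downstream)
]

updated = update_in(network, path, &MapSet.put(&1, :cluster_name))
```

With an accessor like this, the update for N clusters becomes a single `update_in/3` call instead of a reduce over N separate paths.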

03juan · June 11, 2024, 8:16am

The performance I noticed with Access was because I had to deal with a lot of nested structs at the same time, so without the help of the macro form there were a lot of calls to things like Access.key(). In the case of the comprehensions, I actually may be misremembering major performance issues, I’d have to dig around in my git history to see if I can find anything.