Fetch a hash key for a list item

Is there a better way to achieve the following in Elixir? In most languages I would return a value from within a loop as soon as a match is encountered, but this doesn’t seem possible in Elixir (or is it?). I’ve only been at it for two days, so sorry for the awful code, but here’s what I’ve got. It works, but it isn’t very semantic, especially the elem(tuple, index) fetching part:

fields = ["done", "network"]
fields_map = %{"status" => ["done", "pending"],"source" => ["network", " family"]}

# How can I map fields to corresponding keys; like the function down below does, but hopefully in a more elegant way

# Is there a more elegant way than this?
Enum.reduce(fields, %{}, fn (field_value, mapped_fields) ->
  Enum.find_value(fields_map, fn(field) ->
    field_values = elem(field, 1)
    if Enum.member?(field_values, field_value) do
      field_type = elem(field, 0)
      mapped_fields = Map.put(mapped_fields, field_type, field_value)
    end
  end) || mapped_fields
end)

# Results as wished: %{"source" => "network", "status" => "done"}

In Elixir I think it’s common to first look at transforming the shape of the data to make the lookup more direct. Assuming for the moment that your fields_map is predefined (I realise it may not be), you can do the transformation at compile time, which makes it a one-time cost.

Then for the lookup itself, rather than think imperatively, think again about transformation. In this case you are transforming the list of strings into a map. Enum.map/2 is your friend.

Here’s one example approach:

defmodule Thing do
  @field_map %{"status" => ["done", "pending"], "source" => ["network", "family"]}

  # Invert the map at compile time: each value points back at its field name,
  # so a lookup becomes a single Map.get/2.
  @transformed_field_map @field_map
                         |> Enum.reduce([], fn {k, v}, acc ->
                           [Enum.map(v, fn i -> {i, k} end) | acc]
                         end)
                         |> List.flatten()
                         |> Enum.into(%{})

  def lookup(items) when is_list(items) do
    items
    |> Enum.map(fn item -> {Map.get(@transformed_field_map, item), item} end)
    |> Enum.into(%{})
  end
end

Then:

iex> Thing.lookup ["done", "network"]
%{"source" => "network", "status" => "done"}
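For clarity, here is a runtime sketch of the value-to-key map that @transformed_field_map holds (using Enum.flat_map/2 instead of the reduce-and-flatten above; same result):

```elixir
# Build the same inversion at runtime: each value becomes a key
# pointing at the field name it belongs to.
field_map = %{"status" => ["done", "pending"], "source" => ["network", "family"]}

inverted =
  field_map
  |> Enum.flat_map(fn {k, values} -> Enum.map(values, fn v -> {v, k} end) end)
  |> Enum.into(%{})

IO.inspect(inverted)
# => %{"done" => "status", "family" => "source", "network" => "source", "pending" => "status"}
```

With this shape, lookup/1 is just one Map.get/2 per item.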

As was already stated, functional programming is about data transformation. Looking at your description I arrived at this:

“Look through field_maps key/values and only keep those keys with at least one value found in fields (with only the values that are in fields).”

“only keep” sounds like filtering

intersection =
  Enum.filter(values, &(Enum.member?(with_values, &1)))

where values are the values found under a key while with_values are the values we are looking for.
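A quick check of that filter with the data from the question:

```elixir
with_values = ["done", "network"]   # the values we are looking for
values = ["done", "pending"]        # the values under the "status" key

intersection = Enum.filter(values, &Enum.member?(with_values, &1))
IO.inspect(intersection)
# => ["done"]
```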

We need to do that for each key/values pair found in the map. intersection will be [] (an empty list) if there are no common values, i.e. we are not interested in that key. Otherwise the list contains at least one matching value and we want to keep that key with that list:

fn ({key, values}, result) ->
  intersection =
    Enum.filter(values, &(Enum.member?(with_values, &1)))
  case intersection do
    [] -> result
    _ -> Map.put(result, key, intersection)
  end
end

Notice that this is formulated in a style of a function to be used with Enum.reduce/3 or List.foldl/3. But we still need to get with_values from somewhere. Create a function that includes it in a closure:

put_key_with_some_values =
  fn with_values ->
    # return the following function
    fn ({key, values}, result) ->
      intersection =
        Enum.filter(values, &(Enum.member?(with_values, &1)))
      case intersection do
        [] -> result        
        _ -> Map.put(result, key, intersection)
      end
    end
  end

Now we can create a function that can be folded (once) over a list to pick out the key values we want:

map_with_some_values =
  fn (map, values) ->
    map
    |> Map.to_list()
    |> List.foldl(%{}, put_key_with_some_values.(values))
  end

So we are transforming fields_map into a key/value list, which is then picked over with foldl (reduce) to create a new map containing only the keys and values we want, with the help of the function supplied to foldl (reduce).

fields = ["done", "network"]
fields_map = %{"status" => ["done", "pending"],"source" => ["network", " family"]}

put_key_with_some_values =
  fn with_values ->
    # function for reduce/foldl (elem, acc -> acc)
    fn ({key, values}, result) ->
      intersection =
        Enum.filter(values, &(Enum.member?(with_values, &1))) # overlap between values and with_values
      case intersection do
        [] -> result                                         # nothing to add
        _ -> Map.put(result, key, intersection)              # add values found in with_values under key
      end
    end
  end

map_with_some_values =
  fn (map, values) ->
    map
    |> Map.to_list()                                        # convert to a list of {key, values} tuples
    |> List.foldl(%{}, put_key_with_some_values.(values))   # only keep keys with some of the values specified
  end

result =
  map_with_some_values.(fields_map, fields)

IO.puts("Result: #{inspect result}")
$ elixir demo.exs
Result: %{"source" => ["network"], "status" => ["done"]}
$

So while you may be used to higher-order functions from JavaScript, I suspect you need to become a bit more comfortable leaning much more heavily on them.

In your problem statement try to phrase the problem as a transformation.

Edit: for a single value replace the filter, empty list logic with find/3, nil logic.

fn ({key, values}, result) ->
  first =
    Enum.find(values, &(Enum.member?(with_values, &1))) # first value from with_values
  case first do
    nil -> result                                       # nothing to add
    _ -> Map.put(result, key, first)                    # add first value found in with_values under key
  end
end
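Wiring that single-value variant into the same fold gives exactly the shape the original question asked for. (put_key_with_first_value is my name for the closure-producing version of it; the data is the question's.)

```elixir
fields = ["done", "network"]
fields_map = %{"status" => ["done", "pending"], "source" => ["network", " family"]}

# Closure-producing version of the find-based fold function above.
put_key_with_first_value =
  fn with_values ->
    fn {key, values}, result ->
      case Enum.find(values, &Enum.member?(with_values, &1)) do
        nil -> result
        first -> Map.put(result, key, first)
      end
    end
  end

result =
  fields_map
  |> Map.to_list()
  |> List.foldl(%{}, put_key_with_first_value.(fields))

IO.inspect(result)
# => %{"source" => "network", "status" => "done"}
```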

This can be done as a one-line for comprehension; comprehensions are ideal for this sort of nested filtering:

iex(61)> fields = ["done", "network"]
iex(62)> fields_map = %{"status" => ["done", "pending"],"source" => ["network", " family"]}
iex(63)> for {k, values} <- fields_map, v <- values, v in fields, into: %{}, do: {k, v}
%{"source" => "network", "status" => "done"}

Sometimes I find the name “for comprehension” a bit ironic, because comprehensions can quickly become “in-comprehensible”: reading or unwinding them can be slow and tedious. As far as I can tell, the one above expresses something along the lines of:

fields = ["done", "network"]
fields_map = %{"status" => ["done", "pending"],"source" => ["network", " family"]}

key_field_values =
  fn k ->
    fn v, acc ->
      cond do
        v in fields ->                           # 3. v in fields
          [{k,v}|acc]                            # 5. do: {k, v}
        true ->
          acc
      end
    end
  end

fold_key_values =
  fn {k, values}, acc ->
    List.foldl(values, acc, key_field_values.(k)) # 2. v <- values
  end

result =
  fields_map
  |> Map.to_list()                               # 1. {k, values} <- fields_map
  |> List.foldl([], fold_key_values)
  |> Enum.into(%{})                              # 4. into: %{}

IO.puts("Result: #{inspect result}")

Now I’m not saying that the code is any clearer but going through the motions of piecing it together can be part of the overall “comprehension” process.


I agree that for comprehensions can become difficult to read at times. This one is non-trivial because most people are not even aware of filtering clauses, but it does not feel overwrought to me. Here’s how I mentally parse it:

for {k, values} <- fields_map, v <- values, v in fields, into: %{}, do: {k, v}

for each key (k) and list of values in fields_map, and each value (v) in that list of values, when v is in fields, put {k, v} into a map.

Your analysis is hard for me to follow. If I wanted to avoid a for comprehension, I’d still think of it as transforming fields_map, like:

Map.new(fields_map, fn {k, values} ->
  {k, Enum.find(values, fn v -> v in fields end)}
end)

Your analysis is hard for me to follow.

I don’t disagree, but I find that I can relatively quickly arrive at a description of the comprehension similar to yours and still not fully understand what is actually going on. That may be just enough for an intuitive grasp, but intuition can easily be led astray.

Map.new(fields_map, fn {k, values} ->
  {k, Enum.find(values, fn v -> v in fields end)}
end)


I think that is the best one yet. 

Though I'm not keen on what happens for
`fields = ["done"]`
i.e. 
`%{"source" => nil, "status" => "done"}`

The comprehension (and the original code) gives
`%{"status" => "done"}`
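If the nil entries matter, one option (my own addition, not from the posts above) is to keep the Map.new/2 style and reject the nils afterwards, matching the comprehension’s behaviour:

```elixir
fields = ["done"]
fields_map = %{"status" => ["done", "pending"], "source" => ["network", " family"]}

result =
  fields_map
  |> Map.new(fn {k, values} -> {k, Enum.find(values, fn v -> v in fields end)} end)
  |> Enum.reject(fn {_k, v} -> is_nil(v) end)   # drop keys with no match
  |> Map.new()

IO.inspect(result)
# => %{"status" => "done"}
```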

From a contrary perspective, I generally don’t care about the exact steps a comprehension has to perform. I love its declarative nature. It’s like SQL – I generally don’t care about the query plan (unless it’s been shown to be a performance bottleneck). Same with for comprehensions. I trust that the compiler writers will make it do what I declare I want it to do and they will generally do it efficiently (they even have access to certain performance techniques that I don’t).

RE: the nil behavior of the Map.new/2 approach – Yes, I noticed that difference when matches were not found. In either case result_map["key that didn't match"] returns nil, so I didn’t sweat it. If I were so inclined, I’d argue that the Map.new/2 approach is the pure transformation approach because it preserves every key in the map. All the others are a combination of transformation and reduction and therefore introduce unnecessary reduction complexity. But the requirements aren’t particularly clear on this corner case, so perhaps the reduction is necessary.

That is why I prefer writing Elixir-style for-comprehensions as multiple lines. :slight_smile:
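For example, the one-liner from earlier in the thread, spread across lines:

```elixir
fields = ["done", "network"]
fields_map = %{"status" => ["done", "pending"], "source" => ["network", " family"]}

# Same comprehension, one clause per line.
result =
  for {k, values} <- fields_map,
      v <- values,
      v in fields,
      into: %{} do
    {k, v}
  end

IO.inspect(result)
# => %{"source" => "network", "status" => "done"}
```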

Though doing all of that in a function argument, with commas at the end of each-but-the-final line, is irritating, so a while back I made my own version of for comprehensions. It even outperforms the stock version (and has additional abilities) thanks to some type information you can supply. :slight_smile:

Here’s how mine looks. Remember, it takes a bit of extra typing information; I prefer that, as it makes the comprehension (named comp, since for is taken) clearer both to me and to the compiler, which lets it generate code at worst equal in performance to Elixir’s stock for and at best a LOT faster! ^.^
An example from my very limited docs (I’ve not published it or anything; I just use it at times in my own stuff. I should probably clean it up sometime and release my ‘core’ library after removing cruft):

  iex> comp do
  ...>   x <- list [1, 2, 3]
  ...>   x
  ...> end
  [1, 2, 3]

  iex> comp do
  ...>   x <- list [1, 2, 3]
  ...>   x * 2
  ...> end
  [2, 4, 6]

  iex> l = [1, 2, 3]
  iex> comp do
  ...>   x <- list [1, 2, 3]
  ...>   y <- list l
  ...>   x * y
  ...> end
  [1, 2, 3, 2, 4, 6, 3, 6, 9]

And a stupid-simple benchmark I made when I was initially developing it:

defmodule Helpers do
  use ExCore.Comprehension

  # map * 2

  def elixir_0(l) do
    for\
      x <- l,
      do: x * 2
  end

  def ex_core_0(l) do
    comp do
      x <- list l
      x * 2
    end
  end

  # Into map value to value*2 after adding 1

  def elixir_1(l) do
    for\
      x <- l,
      y = x + 1,
      into: %{},
      do: {x, y * 2}
  end

  def ex_core_1(l) do
    comp do
      x <- list l
      y = x + 1
      {x, y * 2} -> %{}
    end
  end
end

inputs = %{
  "List - 10000 - map*2" => {:lists.seq(0, 10000), &Helpers.elixir_0/1, &Helpers.ex_core_0/1},
  "List - 10000 - into map +1 even *2" => {:lists.seq(0, 10000), &Helpers.elixir_1/1, &Helpers.ex_core_1/1},
}


actions = %{
  "Elixir.for"  => fn {input, elx, _core} -> elx.(input) end,
  "ExCore.comp" => fn {input, _elx, core} -> core.(input) end,
}


Benchee.run actions, inputs: inputs, time: 5, warmup: 5, print: %{fast_warning: false}

And the results:

Operating System: Linux
CPU Information: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Number of Available Cores: 2
Available memory: 8.011072 GB
Elixir 1.6.0-dev
Erlang 20.1
Benchmark suite executing with the following configuration:
warmup: 5.00 s
time: 5.00 s
parallel: 1
inputs: List - 10000 - into map +1 even *2, List - 10000 - map*2
Estimated total run time: 40.00 s



Benchmarking with input List - 10000 - into map +1 even *2:
Benchmarking Elixir.for...
Benchmarking ExCore.comp...

Benchmarking with input List - 10000 - map*2:
Benchmarking Elixir.for...
Benchmarking ExCore.comp...

##### With input List - 10000 - into map +1 even *2 #####
Name                  ips        average  deviation         median
ExCore.comp        342.58        2.92 ms     ±4.24%        2.89 ms
Elixir.for         307.20        3.26 ms     ±5.52%        3.21 ms

Comparison: 
ExCore.comp        342.58
Elixir.for         307.20 - 1.12x slower

##### With input List - 10000 - map*2 #####
Name                  ips        average  deviation         median
ExCore.comp        2.48 K      403.16 μs    ±17.93%      403.00 μs
Elixir.for         1.99 K      501.74 μs    ±12.10%      512.00 μs

Comparison: 
ExCore.comp        2.48 K
Elixir.for         1.99 K - 1.24x slower

And it generated this code if you are curious as to how it works and is so fast:

(
  (
    defp($comp_ex_core_1_1_32(l, acc)) do
      :maps.from_list($comp_ex_core_1_1_32_2(l, acc, l))
    end
    (
      defp($comp_ex_core_1_1_32_2(l, acc, [])) do
        _ = l
        acc
      end
      defp($comp_ex_core_1_1_32_2(l, acc, [x | list]) when true) do
        _ = l
        acc = [(
          y = x + 1
          {x, y * 2}
        ) | acc]
        $comp_ex_core_1_1_32_2(l, acc, list)
      end
      defp($comp_ex_core_1_1_32_2(l, acc, [_ | list])) do
        $comp_ex_core_1_1_32_2(l, acc, list)
      end
    )
  )
  (
    defp($comp_ex_core_0_1_15(l, acc)) do
      :lists.reverse($comp_ex_core_0_1_15_2(l, acc, l))
    end
    (
      defp($comp_ex_core_0_1_15_2(l, acc, [])) do
        _ = l
        acc
      end
      defp($comp_ex_core_0_1_15_2(l, acc, [x | list]) when true) do
        _ = l
        acc = [x * 2 | acc]
        $comp_ex_core_0_1_15_2(l, acc, list)
      end
      defp($comp_ex_core_0_1_15_2(l, acc, [_ | list])) do
        $comp_ex_core_0_1_15_2(l, acc, list)
      end
    )
  )
)

Which should be fairly optimal Erlang (after the usual optimization passes), if my memory does not fail me. I cannot run the code through the formatter because Elixir is not homoiconic and its AST can represent things the language cannot, so the above is just Elixir’s best effort at outputting valid Elixir (which failed in this case; it should have unquoted the names and so forth).

The basic syntax is as follows:

comp do # Standard body wrapping
  # The body can contain a variety of things, but there are 3 basic formats (`<>` is required, `[]` is optional)

  <match> [when blah] <- <cmd> <something>
  # Matches the output of the `something` expression, interpreted as a `cmd` (you can use `Access` as
  # a fallback for any iterable type), a few built in ones like map/list/filter and such, but any custom user
  # module works too (which is how the `Access` fallback works) that follows the Access style calls.
  # The `when` guard is optional, of course.  If the match fails, that element is skipped,
  # as you would expect.
  # If this is the last expression in the body then the value of this binding is what is returned for this loop.

  [binding =] <expr>
  # A simple binding expression, not a filter like in Elixir's `for` (use the `filter` command for that)
  # If this is the last expression in the body then it is returned, useful when you have just an expression
  # without a binding (though you can always have an expression for its side effects elsewhere).

  expr -> <type>
  # If this exists it must be the last value of the body (or it errors).  The expression can be any expression
  # that returns a list (I need to support typing this better...) and it is returned 'as' the type (like Elixir's `for`).
  # You can specify an example of the type like `%{}` or `[]` or so, or the typespec name of it, like `map()` or
  # `list()` or so (or a custom module).  For types it knows about internally then it will generate optimized code
  # for it.

end # Standard body ending

I much prefer this style for a large body-style iterable. Though I still wish we had Erlang’s comprehension syntax as well; it is SO much shorter and more readable (though very constrained in what you can do), so it would be perfect for most use cases, like the one a few posts above. :slight_smile: