Different behavior between List.flatten() and Enum/Stream.flat_map()

pertsevds · December 29, 2022, 10:18am

When I use List.flatten(), it flattens a deeply nested list, including every nested [].

iex> list = [[],["ant",["hello","hi",[[[]]]], "bat"], ["cat", "dog"]]
[[], ["ant", ["hello", "hi", [[[]]]], "bat"], ["cat", "dog"]]
iex> List.flatten(list)
["ant", "hello", "hi", "bat", "cat", "dog"]

As expected.

But when I use Stream.flat_map():

iex(6)> list = [[],["ant",["hello","hi",[[[]]]], "bat"], ["cat", "dog"]]
[[], ["ant", ["hello", "hi", [[[]]]], "bat"], ["cat", "dog"]]
iex(7)> list
[[], ["ant", ["hello", "hi", [[[]]]], "bat"], ["cat", "dog"]]
iex(8)> |> Stream.flat_map(& &1)
#Function<60.124013645/2 in Stream.transform/3>
iex(9)> |> Enum.to_list()
["ant", ["hello", "hi", [[[]]]], "bat", "cat", "dog"]

It flattens only the first level of the list. It’s a flatten() with depth == 1.

That was not what I expected.
I needed a deep_flatten() for Stream, so I’ve made it like this:

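defmodule ExTelnet.StreamDeepFlatten do
  # Lazily flattens nested lists at every depth.
  def deep_flatten(enumerables) do
    deep_flat_map(enumerables, & &1)
  end

  # Flattens two lists one after the other (only lists are descended into).
  def deep_flatten(first, second) do
    deep_flat_map([first, second], & &1)
  end

  # Applies mapper to every non-list element; list elements are flattened recursively.
  def deep_flat_map(enum, mapper) when is_function(mapper, 1) do
    # No state is needed, so nil is passed through as the accumulator.
    Stream.transform(enum, nil, fn val, nil ->
      case val do
        val when is_list(val) -> {deep_flat_map(val, mapper), nil}
        val -> {[mapper.(val)], nil}
      end
    end)
  end
end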

Works as expected for me:

iex(2)> list = [[],["ant",["hello","hi",[[[]]]], "bat"], ["cat", "dog"]]
iex(3)> list
iex(4)> |> Stream.map(&IO.inspect(&1))
iex(5)> |> ExTelnet.StreamDeepFlatten.deep_flatten()
iex(6)> |> Stream.map(&IO.inspect(&1))
iex(7)> |> Stream.map(&("seen " <> &1))
iex(8)> |> Stream.map(&IO.inspect(&1))
iex(9)> |> Enum.to_list()
[]
["ant", ["hello", "hi", [[[]]]], "bat"]
"ant"
"seen ant"
"hello"
"seen hello"
"hi"
"seen hi"
"bat"
"seen bat"
["cat", "dog"]
"cat"
"seen cat"
"dog"
"seen dog"
["seen ant", "seen hello", "seen hi", "seen bat", "seen cat", "seen dog"]

So what I’m asking myself now is: “Am I reinventing the wheel? Maybe there is a better, simpler method and I just don’t see it?”
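One possibly simpler variant, as a sketch only: the function passed to Stream.flat_map/2 may return any enumerable, including another stream, so the recursion can be written without Stream.transform/3. The module name DeepFlattenSketch is hypothetical, and like the module above it descends only into lists.

```elixir
defmodule DeepFlattenSketch do
  # Hypothetical helper, not part of Elixir's standard library.
  # Lazily flattens nested lists at every depth; non-list values
  # (including maps and other enumerables) are emitted unchanged.
  def deep_flatten(enum) do
    Stream.flat_map(enum, fn
      item when is_list(item) -> deep_flatten(item)
      item -> [item]
    end)
  end
end

list = [[], ["ant", ["hello", "hi", [[[]]]], "bat"], ["cat", "dog"]]
result = list |> DeepFlattenSketch.deep_flatten() |> Enum.to_list()
# result == ["ant", "hello", "hi", "bat", "cat", "dog"]
```

Because the result is still a stream, it can be piped into further Stream functions before anything is evaluated.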

LostKobrakai · December 29, 2022, 10:49am

There might be a simpler way to do this, but I’m wondering what the background to this is. Can you show a practical use case for this, where the inputs are not lists, but actual (nested) enumerables/streams?

pertsevds · December 30, 2022, 9:19am

Sometimes you have functions that parse some portion of text and return []. If they are nested in one another, we can end up with something like [[[]]] in the output.

Sebb · December 30, 2022, 9:39am

Stream.flat_map/2 behaves like Enum.flat_map/2, not like List.flatten/1.

pertsevds · December 30, 2022, 1:56pm

Well, I know. But why?

lucaong · December 30, 2022, 2:04pm

One reason: flat_map flattening a single level, and flatten flattening multiple levels, produces a more versatile behavior: if one needs to flatten only a single level (maybe the nested items are collections themselves and should be treated as individual items) one can use flat_map. If multiple levels should be flattened, one can call flatten inside the function called by flat_map.
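A minimal sketch of both choices, reusing the list from the first post. The List.wrap/1 call is my own addition, there only so that a top-level element that is not a list does not make List.flatten/1 raise.

```elixir
list = [[], ["ant", ["hello", "hi", [[[]]]], "bat"], ["cat", "dog"]]

# One level only: nested lists survive as individual elements.
one_level = Enum.flat_map(list, & &1)
# one_level == ["ant", ["hello", "hi", [[[]]]], "bat", "cat", "dog"]

# All levels: flatten each mapped result before flat_map joins them.
# List.wrap/1 guards against top-level elements that are not lists.
all_levels = Enum.flat_map(list, &(&1 |> List.wrap() |> List.flatten()))
# all_levels == ["ant", "hello", "hi", "bat", "cat", "dog"]
```

The same mapper can be passed to Stream.flat_map/2; the outer traversal is then lazy, while each top-level element is still flattened eagerly.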

pertsevds · December 30, 2022, 2:26pm

I don’t see it that way. flatten for lists flattens multiple levels. So what is the word “flat” in flat_map? It’s flatten. What does flatten do here? It flattens a single level. Why? It’s inconsistent with flatten for lists.

I think it should be named concat_map because of what it does. Not flattening, concatenating. And by the way, in the docs we have exactly that:

conceptually, this is similar to a combination of map/2 and concat/1.

Link: Enum — Elixir v1.14.2
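To make the map-plus-concat reading concrete, here is a small sketch with made-up input; the variable names are only for illustration.

```elixir
list = [[1, [2]], [3], []]
mapper = fn x -> x end

# flat_map in one step...
via_flat_map = Enum.flat_map(list, mapper)

# ...and the same thing as map/2 followed by concat/1.
via_map_concat = list |> Enum.map(mapper) |> Enum.concat()

# Both are [1, [2], 3]: the inner [2] is left alone.
```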

lucaong · December 30, 2022, 2:35pm

Another aspect is that List.flatten is more specialized, it only flattens lists, while it leaves other collections unchanged:

> List.flatten([%{"x" => :foo, "y" => :bar}])
[%{"x" => :foo, "y" => :bar}]

This makes it clear what to flatten and what not. Instead, flat_map flattens every enumerable:

> Enum.flat_map([%{"x" => :foo, "y" => :bar}], fn x -> x end)
[{"x", :foo}, {"y", :bar}]

The fact that flat_map flattens all enumerables would make it more confusing if it were to flatten all levels: if each element is a list of maps, should the maps be flattened too, or not? Flattening only one level leaves the choice to the developer.
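A quick sketch of that list-of-maps case, with made-up data: one pass keeps the maps intact, and the developer opts into a second pass only if key-value tuples are actually wanted.

```elixir
nested = [[%{a: 1}], [%{b: 2}]]

# One level: the inner lists are joined, the maps stay whole.
once = Enum.flat_map(nested, & &1)
# once == [%{a: 1}, %{b: 2}]

# A second, explicit pass also flattens the maps into tuples.
twice = Enum.flat_map(once, & &1)
# twice == [a: 1, b: 2]
```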

lucaong · December 30, 2022, 2:38pm

Finally, and possibly more importantly, flattening a single level in flat_map is a common choice in many programming languages, making it the expected behavior for many developers.

Examples are JavaScript:

> [[[123]]].flatMap(x => x)
[[123]]

Ruby:

> [[[123]]].flat_map { |x| x }
=> [[123]]

Sebb · December 30, 2022, 2:41pm

That’s why it’s called differently.

flat_map is very handy when you map over something and you get results that you just want to omit.

> Enum.flat_map([1,2,3,4,5], fn n -> if rem(n,2)==0, do: [n], else: [] end)
[2, 4]

josevalim · December 30, 2022, 2:46pm

Hi everyone, I believe @davaeron understands the differences between them. The question is about naming.

Correct. This is common nomenclature in all functional languages. Also add Scala and Erlang to your list.

You should read it as a “flat map operation”, i.e. as a map operation that joins its consecutive results, not as a “map plus flatten”.

pertsevds · December 30, 2022, 2:52pm

Thank you. Now it is clear.