Addition of a compact_map/2 function to the Enum module

I propose adding compact_map/2 to the Enum module.

What is it?

Sometimes you want to map over a collection, but sometimes you want to map over a collection keeping only what’s not nil.

Can you do it now?

Yes, you can definitely already do it. You have some options.

Most common, and shortest is:

Enum.map(list, &fun/1)
|> Enum.reject(&is_nil/1)

The reason I don’t love this code, is because mentally I don’t think of it as a map step, and a separate filter step, and when reading the code the reject step feels a bit like noise. Sure if you are rejecting anything less than 3, that’s a part of the logic that should be explicit. But mapping without the nils in my mind feels more like a particular flavor of mapping, rather than two separate steps. These two steps also mean an additional pass over the list. You can do it more efficiently(about 2.5-3.5x more efficiently based on some quick benchmarks), but it requires breaking out reduce/3, which while it’s certainly the most powerful of the Enum functions, it can sometimes feel verbose for simpler things:

Enum.reduce(list, [], fn element, acc ->
  case fun.(element) do
    nil -> acc
    result -> [result | acc]
  end  
end)
|> Enum.reverse()

It feels like a mouthful for just mapping without the nils.

There is a single line solution using just stdlib functions, but this is peak code golf/“just because you can, doesn’t mean you should” level coding. Really smelly stuff to me:

Enum.flat_map(list, &List.wrap(fun.(&1)))

Why not just put it in your own module?

I generally do have a number of functions that are additions to Enum, and it can sometimes be annoying to remember some functions are Enum functions, and others are in some weirdly named module that wanted to be called Enum but couldn’t because Enum™ was already in use. But to me compact_map/2 feels like it’s a core enough functionality that it deserves to be in Enum™ and not some straight-to-DVD knock off MyEnum

Is anyone else doing it? Anyone cool?

Apple is doing it. A lot of people think they’re cool(and other people hate them because so many people think they’re cool).

You marked this as “Easy”. Is that just because you’re hoping someone else will do the work, so it’ll be easy for you?

No. I’m happy to do the work. Here’s an implementation. I think i’ve followed all the current Enum naming conventions, and i’m happy to add some tests, or alter it if there are any suggestions. Or maybe people hate the whole idea, that’s what I hope this post will determine.

  @doc """
  Returns a list where each element is the result of invoking
  `fun` on each element of `enumerable`,
  where the result of `fun` is not `nil`.

  ## Examples

    iex> lookup = %{a: 1, c: 3}
    iex> Enum.compact_map([:a, :b, :c, :d], &lookup[&1])
    [1, 3]

  """
  @spec compact_map(t, (element -> any)) :: list
  def compact_map(enumerable, fun) do
    reduce(enumerable, [], fn element, acc ->
      case fun.(element) do
        nil -> acc
        result -> [result | acc]
      end
    end)
    |> :lists.reverse()
  end

I think it’s cool. Apple thinks it’s cool. Do you think it’s cool?

Moderators, sorry about the double post, my first one had the text deleted when I posted it, and I didn’t want to leave a mangled post up while I rewrote it, so I deleted it, but it seems to still be visible.

2 Likes

Imo flat_map is the operation to use here (even over reduce). What you’re trying to do it a subset of mapping to zero or more results per element on the original enumerable. Especially if you don’t use the capture shorthand this really isn’t that "code golf"y anymore.

Enum.flat_map(list, fn el -> List.wrap(fun.(el)) end)

I’d love to see some some usecases where this would be particularly useful over the current a bit more generic apis. The example on the example implementation doesn’t have me particularly excited.

2 Likes

The reason it’s code golf-y to me is because the only reason to wrap each item, and use flat_map is to get rid of the nil, which I would say is not exactly clear. It’s being wrapped and unwrapped to exploit a quirk of that combination of functions. I think if someone who was new to Elixir, or even had been using it for a while saw that code, they would take quite a long time to under the intention. You need to know that nil wraps to [], and not [nil], as well as realizing that [1], [], [2] flat_maps to [1, 2]. It might be obvious when you think about it, but it is still sort of different than what someone probably first thinks of when they think of what you might use flat_map for.

To me its sort of like map |> Jason.encode!() |> Jason.decode!(keys: :atoms). This code has nothing really to do with encoding or decoding, it’s merely a somewhat easy way to convert string keys to atom keys, but that intention is not really clear from the code alone.

Another example:

"ß" |> String.downcase() |> String.upcase()

If someone doesn’t know the capitalization rules of the German Eszett, they’re likely to think it’s a noop. Sort of like wrapping and immediately unwrapping again.

The reason the example is not that exciting is because I tried to keep it in line with the other examples which are short and to the point. I’m not sure any of the documentation examples alone get me excited about a function.

iex> Enum.map([1, 2, 3], fn x -> x * 2 end)
[2, 4, 6]

Does that get you excited? For me it doesn’t. It just demonstrates in a different way what the words are trying to explain.

Using List.wrap is just because you want the code even shorter. IMO keeping the original: map + reject is much clearer when read. Shorter is not always more readable.

Shame we don’t have Rust’s filter_map though, it would solve your problem right away.

1 Like

Shame we don’t have Rust’s filter_map though, it would solve your problem right away.

My proposal is the fix that very thing. It appears that filter_map in Rust is exactly the same as the compact_map proposed here, and the compactMap in Swift. Though I would say that naming is a bit obscure for Elixir where filter typically implies filtering on a condition, but it looks like in Rust filter() typically implies removing nils(or rather None in Rust)

@deprecated “Use Enum.filter/2 + Enum.map/2 or for comprehensions instead”

Enum.filter_map exists, but is deprecated.

filter in Rust works on predicates (booleans), while filter_map works on option types, which is different to Elixir’s previous filter_map.

My experience with compact in Ruby does not make me optimistic for making it part of the language. A lot of times it was added to remove nils from collection when the better question was to ask why nil is being part of the list in the first place. |> Enum.reject(&is_nil/1) is clear enough without introducing a new verb (compact) for everyone to learn.

I don’t rule out adding functions such as filter_ok and find_ok though, that explicitly look for ok tuples, but then we need to decide if we discard everything else or if we discard specific error results.

Thanks for starting this discussion!

7 Likes

hi there Peter!

for what it’s worth, example you are citing never gets old to me and always gets me excited :smiley:

apologies if i sound too harsh, but what is not getting me excited is a language with bloated syntax, and i am always looking at proposals like yours while keeping this in mind.

imagine how Elixir would look after accepting n such proposals, motivated and justified in similar way, over time t. when i go through this exercise or look at languages which “felt a victim” of syntax bloat, i think it’s harmful.
personally, i prefer to have more generic functions at core syntax level.

I wasn’t aware of that function, but while it has the same name as the Rust function, it appears to be quite different. It’s a more generic filter. So you can do Enum.filter_map(1..10, & &1 < 5, & &1 + 1), whereas the Rust function merely removes nil(actually None, the Rust equivalent)

It doesn’t sound too harsh. I think it’s a valid opinion. But what is the dividing line of bloat vs value? The Enum module has by my count(Enum.__info__(:functions) |> length) 112 functions. I believe every one can be implemented with reduce. Is your argument that anything beyond reduce is bloat?

this is exactly what i think we should not be doing: judging value of your proposal in a relative vacuum of 2-3 functions, like reduce which you mention.

i would prefer to consider Elixir syntax and “built-in” functions as a whole.

  • how verbose it (Elixir) is already?
  • what is the cognitive load associated with language learning, usage and maintaining code written in it?
  • how many ways we have already to do something?
  • is language syntax flat or deep?

this screen taken from José’s post is illustrating my sentiment pretty well, i think:

IMHO it’s easier to add stuff than remove from a language, and more verbose, complex, expressive syntax is, higher entry threshold for proposal as yours should be.

1 Like

You are correct about filter working on predicates. I very quickly glanced this map().filter().map() in the filter_map docs, and assumed there was a filter() function without a predicate. But it does appear from the example that filter_map does merely remove None, and doesn’t take a predicate.

let a = ["1", "two", "NaN", "four", "5"];

let mut iter = a.iter().filter_map(|s| s.parse().ok());

assert_eq!(iter.next(), Some(1));
assert_eq!(iter.next(), Some(5));
assert_eq!(iter.next(), None);

I’m still not sure I understand your point. Is your point that nothing new should be added to the language, period? Since reduce covers the Enum operations, should we have stopped there? Everything beyond that adds some mental overhead for a beginner. If your point is to just not add anything new, fair enough, point taken. But if it’s more nuanced, then surely at some point you’ll have to judge something in a relative vacuum, right?

no, not at all. as i wrote above:

IMHO it’s easier to add stuff than remove from a language, and more verbose, complex, expressive syntax is, higher entry threshold for proposal as yours should be.

my point is that, if you will grow language syntax, make it more complex for relatively small wins, you will bloat it, because you will never be able to decommission and deprecate enough other, alternative syntaxes to balance the growth.

i think our reference frame to judge such additions should be broader, to see where language will go after - let’s say - 30 such additions over 2-5 years.

discussing “proposal vs reduce” is just “too confined”, in my opinion.

i don’t want to send contradictory signals, but i still think such proposals are worth discussing, and of course i am not saying language is finished and the best, and there is nothing to improve. it just doesn’t mean it will be easy to push such proposals through or that even majority of them are worth it.

i don’t believe way you are framing it is correct. overhead we are talking about is a cost for everyone, no matter how advanced or smart you are. language complexity will always impose load on you, no matter who you are.

also, who is average user of the language? genius developer? blue collar coder?

again, i am not saying nothing new should be added. please check what i wrote above.

personally, i am quite happy with tools Elixir gives me already, and to me cost of introducing changes you are proposing is much higher than perceived benefits.
just an opinion of a random guy from the internet :slight_smile:

I’m kinda surprised no one has mentioned for comprehensions. When I think of map/flat_map with a filter, that’s where my mind goes. Of course, to some people’s taste, this may be even more golf-y than flat_map :person_shrugging: Here’s how I’d do it in one line, a single pass through the Enumerable, and no extra List.wrap:

iex(29)> for x <- [1, 2, 3, 4], y = x + 1, Integer.is_odd(y), do: y
[3, 5]

edit: I thought of a better example where we use a helper function that may return nil or not which fits more closely with the original situation

iex(30)> odd_to_nil = fn x -> if Integer.is_odd(x), do: nil, else: x end
#Function<42.105768164/1 in :erl_eval.expr/6>
iex(31)> for x <- [1, 2, 3, 4], y = odd_to_nil.(x), do: y
[2, 4]

The main piece to understand are filters in for comprehensions. If you have a clause that is not a generator, it’s a filter. If it evaluates to falsy, then the do block is not executed and the next iteration begins.

9 Likes

I’m generally happy with status quo but would use it if it were available. Even though it makes sense and I can’t think of anything better, there’s something about the term “compact” that I never liked for this use-case, so in a weird way I’ve enjoyed being forced to “reject nils” in so many words.

Also, there is something off about a function that has “map” in its name that doesn’t return a list of the same size that went into it. Though there are probably examples I’m not thinking of… like map_reduce, though the original length map is still in there :thinking:

EDIT: I re-read the thread and had missed @dimitarvp’s message re: filter_map which I believe Elixir actually used to have? In any event, there is precedence when it comes to my differing-length concerns so I suppose it’s a moot point.

3 Likes

And I was wondering why I can’t quickly find filter_map… It was last directly accessible in the docs all the way back in Elixir 1.4.5, woah.

I understand the liking of the for comprehension and used it a lot but the whole truthy / falsy thing just rings my alarms as I am an other side of the spectrum and I want explicitness all the way to the point of ju-u-ust before it becomes unwieldy and annoying. To me the truthy / falsy semantics will be never welcome but oh well. Just adding one more anecdote because arguing for/against it is fruitless; it’s just a preference in the end, and it is clear that the core team is not on my side in this instance. I’ll just never ever approve of the if value ... thing.

1 Like

But that’s why I guess I’m still confused by your point. You initially stated you get excited by map, but don’t like bloat. map is just reduce keeping all the elements.

personally, i am quite happy with tools Elixir gives me already

Ok, so if we look at something like sum/1. It can be implemented as simply as reduce(list, & &1 + &2). It’s also a function that only works on a list of a specific subset of types. There’s a number of functions that come in pairs like uniq/1 and uniq_by/2 where uniq/1 could easily be called as uniq_by(enum, & &1). There’s max and min, but also min_max(which of course could be achieved through min and max)

So your point is you don’t like bloat, but you like everything that’s currently in Elixir. So by what metric is sum and uniq, which have very simple implementations with reduce and uniq_by, not bloat? It seems like you define bloat as anything not currently in the language.

I’m not suggesting any of those functions are by my definition bloat, but it just seems like a confusing point to say “I don’t like bloat, everything currently in Elixir isn’t bloat, but we can’t compare anything because that’s too narrow”.

I agree with your statement that adding things for small wins will bloat the language, but I’m confused why you think min_max or sum provides value that couldn’t be achieved otherwise, but that anything else is bloat. Why is min_max(functionality already well served by min and max) more valuable than say mean/median/mode, something that’s a bit more complicated to implement with the current tools in Enum?

so, i am excited about stuff like map, reduce in general - they are pretty universal.
i am not saying map on its own is enough to solve your case, but you can add stuff like filter to your recipe and you are probably good.

reg. “I don’t like bloat, everything currently in Elixir isn’t bloat, but we can’t compare anything because that’s too narrow” - let me put that this way: what i mean is that, adding your proposal to the language won’t make it slimmer or more elegant syntactically :wink: . if there is any bloat in Elixir, i don’t think it’s map or reduce.

btw. i appreciate you iterating over the matter, trying to get to the bottom of it. thank you!

I’m now seeing the bloat argument (I’m also very against bloat and am often loud about).

I think a good metric is what José mentioned: the question is “why do you need compact so often?” Everyone once in a while we’re going to need to “compact” (which again I dislike the name even if it is understood generally) but does it really need to be codified? On the other hand, sum and uniq are common needs and with super clear and expressive names… well, uniq could be spelled properly :wink: