Introducing `for-let` and `for-reduce`

josevalim · December 22, 2021, 10:38am

This is a proposal for introducing let into for-comprehensions. This proposal introduces let in the form of a “Getting Started” guide that could be hosted on the Elixir website. The goal is to show how for can be useful to solve several problems in a format that developers may be familiar with, while still building an intuition on functional ideas.

If you want a fun challenge, try to rewrite all of the for uses below using the Enum module. You can consider doing so in two variants: using a single Enum function and using a pipeline of Enum functions.

This proposal has been previously debated in the mailing list.

The `for` construct

While Elixir does not have loops as found in traditional languages, it does have a powerful for construct, typical to many programming languages, where we can generate, filter, transform, and accumulate collections. In Elixir, we call it for-comprehension.

In this chapter, we will learn how to fully leverage the power behind for-comprehensions to perform many tasks similar to imperative languages, but in a functional manner.

Generators

Let’s start with a simple problem. You have a list of numbers and you want to multiply each element in the list by two. We can do this:

iex> for i <- [1, 2, 3] do
...>   i * 2
...> end
[2, 4, 6]

The part i <- [1, 2, 3] is a generator. It gets each value in the list [1, 2, 3] and binds them to the variable i one at a time. Once i is bound, it executes the contents of the do-end block. The new list is formed by the results of the do-end block.

A comprehension can have multiple generators too. One use of multiple generators is to find all possible combinations between two lists. Imagine for example you are interested in a new car. You have identified three colors that you like: green, blue, and yellow. You are also divided between three brands: Ford, Volkswagen, and Toyota. What are all combinations available?

Let’s first define variables:

iex> colors = [:green, :blue, :yellow]
iex> cars = [:ford, :volkswagen, :toyota]

Now let’s find the combinations:

iex> for color <- colors, car <- cars do
...>   "#{color} #{car}"
...> end
["green ford", "green volkswagen", "green toyota", "blue ford",
 "blue volkswagen", "blue toyota", "yellow ford", "yellow volkswagen",
 "yellow toyota"]

By having two generators, we were able to combine all options into strings.

Multiple generators are also useful to extract all possible values that are nested within other collections. Imagine that you have a list of users and their favorite programming languages:

iex> users = [
...>   %{
...>     name: "John",
...>     languages: ["JavaScript", "Elixir"]
...>   },
...>   %{
...>     name: "Mary",
...>     languages: ["Erlang", "Haskell", "Elixir"]
...>   }
...> ]

If we want to get all languages from all users, we could use two generators. One to traverse all users and another to traverse all languages:

iex> for user <- users, language <- user.languages do
...>   language
...> end
["JavaScript", "Elixir", "Erlang", "Haskell", "Elixir"]

The comprehension worked as if it retrieved the languages lists of all users and flattened it into a list, with no nesting.
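For comparison, the same flattening can be sketched with the Enum module that ships with Elixir today. This is only an illustration of the "map, then flatten" reading described above; the variable name languages is chosen here for the example.

```elixir
users = [
  %{name: "John", languages: ["JavaScript", "Elixir"]},
  %{name: "Mary", languages: ["Erlang", "Haskell", "Elixir"]}
]

# Map each user to its list of languages and flatten the result one level.
languages = Enum.flat_map(users, fn user -> user.languages end)
# languages is ["JavaScript", "Elixir", "Erlang", "Haskell", "Elixir"]
```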

The important concept about for-comprehensions so far is that we never use them to mutate values. Instead, we use them to explicitly map inputs to outputs: the lists that we want to traverse are given as inputs and for returns a new list as output, based on the values returned by the do-end block.

The `:uniq` option

In the example above, you may be wondering: what if we want all languages from all users but with no duplicates? You are in luck: comprehensions also accept options, one of them being :uniq:

iex> for user <- users, language <- user.languages, uniq: true do
...>   language
...> end
["JavaScript", "Elixir", "Erlang", "Haskell"]

Comprehension options are always given as the last argument of for, just before the do keyword.
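As a rough point of comparison, the effect of uniq: true can be reproduced with an Enum pipeline. The sketch below assumes the same users list as above and uses Enum.uniq/1, which keeps the first occurrence of each value.

```elixir
users = [
  %{name: "John", languages: ["JavaScript", "Elixir"]},
  %{name: "Mary", languages: ["Erlang", "Haskell", "Elixir"]}
]

# Flatten all languages, then drop later duplicates.
unique_languages =
  users
  |> Enum.flat_map(fn user -> user.languages end)
  |> Enum.uniq()
# unique_languages is ["JavaScript", "Elixir", "Erlang", "Haskell"]
```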

Filters

So far we used comprehensions to map inputs to outputs, to generate combinations, or to flatten lists nested inside other lists. We can also use comprehensions to filter the input, keeping only the entries that match a certain condition. For example, imagine we have a list of positive and negative numbers, and we want to keep only the positive ones and then multiply them by two:

iex> for i <- [-5, -3, -2, 1, 2, 4, 8], i > 0 do
...>   i * 2
...> end
[2, 4, 8, 16]

Filters are given as part of the comprehension arguments. If the filter returns a truthy value (anything except false and nil), the comprehension continues. Otherwise it skips to the next value.
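The filter-then-map behaviour described here corresponds roughly to the following Enum pipeline. It is a sketch for comparison only; the name doubled_positives is made up for this example.

```elixir
# Keep the positive numbers, then multiply each of them by two.
doubled_positives =
  [-5, -3, -2, 1, 2, 4, 8]
  |> Enum.filter(fn i -> i > 0 end)
  |> Enum.map(fn i -> i * 2 end)
# doubled_positives is [2, 4, 8, 16]
```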

You can give as many filters as you want, including mixed with other generators. Let’s go back to our users example and add some arbitrary rules. Imagine that we only want to consider programming languages from users that have the letter “a” in their name:

iex> for user <- users, String.contains?(user.name, "a"), language <- user.languages do
...>   language
...> end
["Erlang", "Haskell", "Elixir"]

As you can see, due to the filter, we skipped John’s languages.

What if we want only the programming languages that start with the letter “E”?

iex> for user <- users, language <- user.languages, String.starts_with?(language, "E") do
...>   language
...> end
["Elixir", "Erlang", "Elixir"]

Now we got languages from both, including the duplicates, but returned only the ones starting with “E”. You can still use the :uniq option, give it a try!
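One possible way to combine the last filter with the :uniq option is sketched below. It uses only features that already exist in Elixir comprehensions; the variable name e_languages is chosen for this example.

```elixir
users = [
  %{name: "John", languages: ["JavaScript", "Elixir"]},
  %{name: "Mary", languages: ["Erlang", "Haskell", "Elixir"]}
]

# Two generators, one filter, and the :uniq option given last.
e_languages =
  for user <- users,
      language <- user.languages,
      String.starts_with?(language, "E"),
      uniq: true do
    language
  end
# e_languages is ["Elixir", "Erlang"]
```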

Computing additional values with `let value = initial`

So far, our comprehensions have always returned a single output. However, sometimes we want to traverse a collection and get multiple properties out of it too.

Let’s go back to our initial example. Imagine that you want to traverse a list of numbers, multiply each element in it by two while returning the sum of the original list at the same time.

In most non-functional programming languages, you might achieve this task like this:

let sum = 0
const list = []

// Both variables are mutated from inside the loop body.
for (const element of [1, 2, 3]) {
  list.push(element * 2)
  sum += element
}

list /* [2, 4, 6] */
sum /* 6 */

This is quite different from how we have been doing things so far. In the example above, the for loop is changing the values of list and sum directly, which is then reflected in those variables once the loop is over.

However, we have already learned that comprehensions in Elixir explicitly receive all inputs and return all outputs. Therefore, the way to tackle this in Elixir is by explicitly declaring all additional variables we want to be looped and returned by the comprehension, using the let qualifier:

iex> for let sum = 0, i <- [1, 2, 3] do
...>   sum = sum + i
...>   {i * 2, sum}
...> end
{[2, 4, 6], 6}

Let’s break it down.

Instead of starting with a generator, our comprehension starts with a let variable = initial expression. let introduces a new variable sum, exclusive to the comprehension, and it starts with an initial value of 0. The same way that i changes on every element of the list, sum will have a new value on each iteration too.

Now that we have an additional variable as input to the comprehension, it must also be returned as output. Therefore, the comprehension do-end block must return two elements: the new element of the list, as previously, and the new value for sum. Those elements are returned in a tuple. Once completed, the comprehension also returns a two-element tuple, with the new list and the final sum as elements. In other words, the shape returned by for matches the return of the do-end block.

If you add IO.inspect/1 at the top of the do-end block, you can see the values of i and sum as the comprehension traverses the collection:

iex> for let sum = 0, i <- [1, 2, 3] do
...>   IO.inspect({i, sum})
...>   sum = sum + i
...>   {i * 2, sum}
...> end

And you will see this before the result:

{1, 0}
{2, 1}
{3, 3}

As you can see, both i and sum change throughout the comprehension.

Given the comprehension now returns a tuple, you can pattern match on it too. In fact, that’s most likely the pattern you will see in actual code, like this:

{doubled, sum} =
  for let sum = 0, i <- [1, 2, 3] do
    sum = sum + i
    {i * 2, sum}
  end

And then you can further transform the doubled list and the sum variable as necessary.

The let qualifier allows us to accumulate additional values within for. Albeit a bit more verbose than other languages, it is explicit: we can immediately look at it and see the inputs and outputs.
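Since for let is only a proposal, it may help to see how the same result can be obtained in Elixir as it exists today. A minimal sketch with Enum.map_reduce/3, whose callback returns a {mapped_element, new_accumulator} tuple much like the do-end block above:

```elixir
# The accumulator starts at 0; each step returns {new_element, new_sum}.
{doubled, sum} =
  Enum.map_reduce([1, 2, 3], 0, fn i, sum ->
    {i * 2, sum + i}
  end)
# doubled is [2, 4, 6] and sum is 6
```

The shape is the same as in the for let version: a two-element tuple with the new list and the final accumulator.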

Accumulating multiple values

Sometimes you may need to accumulate multiple properties from a collection. Imagine we want to multiply each element in the list by two, while also getting its sum and count. To do so, we could give a tuple of variables to let:

iex> for let {sum, count} = {0, 0}, i <- [1, 2, 3] do
...>   sum = sum + i
...>   count = count + 1
...>   {i * 2, {sum, count}}
...> end
{[2, 4, 6], {6, 3}}

Once again, the shape we declare in let (a two-element tuple) matches the shape we return from the do-block and of the result returned by for.

You could move the initialization of the let variables to before the comprehension:

iex> sum = 0
iex> count = 0
iex> for let {sum, count}, i <- [1, 2, 3] do
...>   sum = sum + i
...>   count = count + 1
...>   {i * 2, {sum, count}}
...> end
{[2, 4, 6], {6, 3}}

let accepts a variable or a tuple of variables. If the variables are not given an initial value, they are expected to already exist, as in the example above.
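For reference, accumulating a tuple of values can be sketched in current Elixir with Enum.map_reduce/3 by using a tuple as the accumulator. This is a comparison point, not part of the proposed syntax.

```elixir
# The accumulator is a {sum, count} tuple, matched in the function head.
{doubled, {sum, count}} =
  Enum.map_reduce([1, 2, 3], {0, 0}, fn i, {sum, count} ->
    {i * 2, {sum + i, count + 1}}
  end)
# doubled is [2, 4, 6], sum is 6, count is 3
```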

Reducing a collection

We have learned how to use let to traverse a collection and accumulate different properties from it at the same time. However, what happens when we are only interested in the properties and not in returning a new collection? In other words, how can we get only the sum and count out of a list, skipping the multiplication of each element by 2?

One option is to use let and simply discard the list result:

{_doubled, {sum, count}} =
  for let {sum, count} = {0, 0}, i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {i, {sum, count}}
  end

However, it seems wasteful to compute a new list, only to discard it! In such cases, you can convert the let into a reduce:

{sum, count} =
  for reduce {sum, count} = {0, 0}, i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {sum, count}
  end

By using reduce, we now only need to return the reduce shape from the do-end block, which once again is reflected in the result of the comprehension.

In other words, for-reduce is a special case of for-let, where we are not interested in returning a new collection. It is called reduce precisely because we are reducing a collection into a set of accumulated values. Given that, you could consider let to be a “map and reduce”, as it maps inputs to outputs and reduces the collection into a set of accumulated values at the same time.

Proposal comment: if this proposal is to be accepted, the :reduce option in for will be deprecated.
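For readers unfamiliar with it, the existing :reduce option mentioned here looks roughly like this. The sketch computes the same sum and count as above, first with the option and then with Enum.reduce/3; the names sum2 and count2 are only there to keep the two results apart.

```elixir
# Existing syntax: the accumulator is matched with a `pattern -> body` clause.
{sum, count} =
  for i <- [1, 2, 3], reduce: {0, 0} do
    {sum, count} -> {sum + i, count + 1}
  end

# The same reduction written with Enum.reduce/3.
{sum2, count2} =
  Enum.reduce([1, 2, 3], {0, 0}, fn i, {sum, count} ->
    {sum + i, count + 1}
  end)
# Both results are {6, 3}
```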

Summary

In this chapter we have learned the power behind Elixir’s for-comprehensions and how it uses a functional approach, where we list our inputs and outputs, to mimic the power of imperative loops.

While we have used for-comprehensions to perform multiple tasks, such as computing the sum and count, in practice most developers would use the Enum module to perform such trivial tasks. The Enum module contains a series of recipes for the most common (and some also uncommon) operations. For example:

iex> Enum.map([1, 2, 3], fn i -> i * 2 end) 
[2, 4, 6]
iex> Enum.sum([1, 2, 3])
6
iex> Enum.count([1, 2, 3])
3

Still, for-comprehensions can be useful for handling more complex scenarios.

Note we didn’t explore the full power of comprehensions either. We will discuss the additional features behind comprehensions whenever relevant in future chapters.

Proposal notes

This section is not part of the guide but it provides further context and topics from the proposal. My hope is the guide above shows how for can be both a power user tool and useful in introducing a series of new idioms, unified by a single construct, without imposing all of the functional terminology (such as flatten, map, filter, map_reduce, etc) upfront. Those words are mentioned, but their introduction is casual, rather than the starting point.

Thank you to Saša Jurić and Ben Wilson for reviewing several revisions of this proposal and giving feedback. Note it does not imply their endorsement though.

Error messages

By declaring the shape we want to return in let/reduce, we can provide really good error messages. For example, imagine the user makes this error:

iex> for let {sum, count} = {0, 0}, i <- [1, 2, 3] do
...>   sum = sum + i
...>   count = count + 1
...>   {i * 2, sum}
...> end

The error message could say:

** (ComprehensionError) expected do-end block to return {output, {sum, count}}, got: {2, 1}

Why `let`/`reduce` at the beginning?

One of the things we discovered as we explored this proposal is that, by declaring let and reduce at the beginning, it makes those constructs much more powerful. For example, we could implement a take version of a collection easily:

for let count = 0, count < 5, x <- element do
  {x, count + 1}
end

Or we could even have actual recursion:

for let acc = [:root], acc != [], x <- acc do
  # Compute some nodes and return new nodes to traverse
end

While we won’t support these features in the initial implementation (a generator must immediately follow let and reduce), it shows how they are generalized versions of the previous proposal.
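To make the take example more concrete, here is one way the same early-exit behaviour might be written today with Enum.reduce_while/3. The collection 1..10 and the limit of 5 are assumptions picked for illustration, and the result is returned as a {list, count} tuple to mirror the shape of the for let version.

```elixir
# Hypothetical input; any enumerable would do.
collection = 1..10

{reversed, count} =
  Enum.reduce_while(collection, {[], 0}, fn x, {acc, count} ->
    if count < 5 do
      # Keep the element and continue with an incremented counter.
      {:cont, {[x | acc], count + 1}}
    else
      # Stop traversing as soon as five elements have been taken.
      {:halt, {acc, count}}
    end
  end)

# Elements were prepended, so restore the original order.
taken = Enum.reverse(reversed)
# taken is [1, 2, 3, 4, 5] and count is 5
```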

Furthermore, the introduction of let and reduce qualifiers opens up the option for new qualifiers in the future, such as for async that is built on top of Task.async_stream/3.
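As background for the for async idea, this is roughly what using Task.async_stream/3 directly looks like today. How a for async qualifier would actually map onto it is not specified in the proposal, so the sketch only shows the existing function.

```elixir
# Each element is processed in its own task; results arrive as {:ok, value}
# tuples and, by default, in the same order as the input.
doubled =
  [1, 2, 3]
  |> Task.async_stream(fn i -> i * 2 end)
  |> Enum.map(fn {:ok, result} -> result end)
# doubled is [2, 4, 6]
```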

Naming

One aspect to consider is how we should name the qualifiers. let could be called map_reduce but that is both verbose and somewhat ambiguous, as for with no qualifiers already stands for “mapping”. One alternative considered is to use given instead of let:

iex> for given({sum, count} = {0, 0}), i <- [1, 2, 3] do
...>   sum = sum + i
...>   count = count + 1
...>   {i * 2, {sum, count}}
...> end

Variations such as with, using, map_reduce, and acc have been considered, without an obvious improvement over let or given. Another option is to use for reduce as a replacement for let and for reduce_only for the reduce variant.

Parens, underscore, or none?

So far, we have used this syntax:

{sum, count} =
  for reduce {sum, count} = {0, 0}, i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {sum, count}
  end

However, should we force parentheses?

{sum, count} =
  for reduce({sum, count} = {0, 0}), i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {sum, count}
  end

Or perhaps, those should be separate functions altogether?

{sum, count} =
  for_reduce {sum, count} = {0, 0}, i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {sum, count}
  end

Features not covered in this guide

:into, enumerable generators, pattern matching in generators, and binary generators.

massimo · December 22, 2021, 11:58am

Hi Jose,

I’ve followed the streams on Twitch but could not participate live, so I am writing my opinion here.

What do you think of using a where clause like Haskell does and have something like

iex> for let i <- [1, 2, 3], where sum = 0 do 
...>   sum = sum + i
...>   {i * 2, sum}
...> end
{[2, 4, 6], 6}

this could also work in the take example

for let count < 5, x <- element, where count = 0 do
  {x, count + 1}
end

josevalim · December 22, 2021, 12:41pm

I commented this on stream but the reason I don’t like where is because I expect something given to where to be static and never change. Which is what you would find in Haskell and not what we want here.

stefanchrobot · December 22, 2021, 12:44pm

Plus I don’t see the reason to require both “let” and “where” to do one thing. Either “let” or “where”. I think I prefer “let”.

dimitarvp · December 22, 2021, 1:01pm

To me where will be confusing because it implies querying for something and refining the search (as in Ecto). So I’d be against it.

thiagomajesk · December 22, 2021, 2:51pm

I don’t have that much experience with elixir to make a convincing argument, but perhaps the opinion of someone that is still very close to the learning curve can help bring a different context…

So far, I can’t shake the “weird feeling” out of the proposal after reading it because I haven’t experienced many cases in the language where you have such specific semantics or syntax sugar for something like for let or for reduce. I always had this impression of elixir as being an extremely concise language where everything is composable with very few exceptions - like when I discovered that do/end can be represented as a keyword list foo(do: block), I stopped seeing do/end as just syntax sugar.

I don’t recall reading about the concept of ‘qualifiers’ in elixir, so I’m assuming this is a new and exclusive concept to this proposal that only applies to for. If this is only to make elixir more approachable from other languages’ perspectives, I’d ask: “at what cost?”, it’s not clear to me yet what it fixes and if it will yield any more of a positive outcome than adding more friction to something very simple to learn like comprehensions.

PS.: I don’t mean any disrespect with this comment and I’m willing to assume I’m just too dumb to see the real benefits of the proposal (most likely). I just wanted to comment a “first impressions” after reading it. I see that most people agreed with this on the mailing list, but I’m not yet convinced that having many “modes” to for makes it any easier from a beginner’s perspective.

benwilson512 · December 22, 2021, 2:55pm

Interestingly, this is actually sort of the same idea:

iex(3)> quote do: for let counter = 1, x <- list, do: {x * 2, counter + 1}
{:for, [],
 [
   {:let, [],
    [
      {:=, [], [{:counter, [if_undefined: :apply], Elixir}, 1]},
      {:<-, [],
       [
         {:x, [if_undefined: :apply], Elixir},
         {:list, [if_undefined: :apply], Elixir}
       ]},
      [
        do: {{:*, [context: Elixir, import: Kernel],
          [{:x, [if_undefined: :apply], Elixir}, 2]},
         {:+, [context: Elixir, import: Kernel],
          [{:counter, [if_undefined: :apply], Elixir}, 1]}}
      ]
    ]}
 ]}

The proposed syntax is actually already valid syntax and you can write similar macros yourself.

massimo · December 22, 2021, 2:58pm

There’s no reason for both. I left it there because Jose said more than once that he preferred to have something that signals it’s not a simple for.

I understand, but it actually doesn’t, in plain English.

I think this is pretty clear (I mean what the where clause does)

fib = (map fib' [0 ..] !!)
    where
      fib' 0 = 0
      fib' 1 = 1
      fib' n = fib (n - 1) + fib (n - 2)

in Haskell where is bound to a surrounding syntactic construct and it’s used to bind local variables but it’s not an expression, while let is.

In my opinion the problem is not where or let, but the order.

If put at the end it makes more sense and it’s more declarative.

If it has to be at the beginning, when sounds more natural to me

iex> for when sum = 0, i <- [1, 2, 3] do 
...>   sum = sum + i
...>   {i * 2, sum}
...> end
{[2, 4, 6], 6}

or maybe prepend it to for making it a let-for

iex> let sum = 0 for i <- [1, 2, 3, 4], rem(i, 2) == 0 do 
...>   sum = sum + i
...>   {i * 2, sum}
...> end
{[4, 8], 6}

benwilson512 · December 22, 2021, 3:00pm

I think my primary concern here is that this sounds conditional. when sum = 0 do this stuff, otherwise don’t. This is already the use of when in Elixir code guard clauses.

Meta point: Given that 95% of the commentary here is now (totally valid) bike shedding about let vs other words, I think the overall proposal is a success. The original proposal had much more core critiques, so @josevalim I think this iteration definitely has legs.

thiagomajesk · December 22, 2021, 3:01pm

The difference being that we use do/end everywhere to define blocks of code, instead of a specific place. If I understood it right, let and reduce do not have other usage outside of the for scope right?

joaoevangelista · December 22, 2021, 3:02pm

Totally worth addition and really liked that we still need to return the rebound “let” variables instead of them being magically updated.

But the qualifier aspect I think that it is not present on other parts of language, as presented, could it not be a keyword, such as for let: {sum, count} = {0,0} or a function call as presented by given({sum, count} = {0, 0}) which returns the AST ? Just for it to be more concise.

As for the naming, I would vote for given since it conveys a sense of providing something; let is too generic for this case and it could be more useful in the future.

massimo · December 22, 2021, 3:05pm

that was my idea: the code is valid only when sum = <something numeric> in that snippet, otherwise it’s not.

But I understand the possible confusion.

that’s a valid point

josevalim · December 22, 2021, 3:06pm

I want to be very clear it is not only about making Elixir more approachable. The point is that it is both more approachable and often more elegant than the options currently available in Elixir.

Here is a simple question, can you write the example below using Enum? If so, how? And, once you do, which solution do you prefer?

josevalim:

iex> for color <- colors, car <- cars do
...>   "#{color} #{car}"
...> end
["green ford", "green volkswagen", "green toyota", "blue ford",
 "blue volkswagen", "blue toyota", "yellow ford", "yellow volkswagen",
 "yellow toyota"]

Therefore, a good way for you to answer the question of “at what cost?” is for you to go through each example in the guide and try to write them without using for, using a single function in Enum or recursion. If you can write the Enum variants and if you think they all look better, then this is probably not worth it. But what will most likely happen is that you will find the for variants to be cleaner.

Sebb · December 22, 2021, 3:18pm

I’m not sure if I like it. Obviously we’d normally get a list of tuples when we have a tuple in the for block.

iex> for i <- [1, 2, 3], do: {i, 0}           
[{1, 0}, {2, 0}, {3, 0}]

And with let that magically changes.

Maybe something more explicit, like

iex> for let sum = 0, i <- [1, 2, 3] do
...>   sum = sum + i
...>   i * 2
...> yield
...>  {i, sum}
...> end

Anyway, it’s a great feature.

fuelen · December 22, 2021, 3:33pm

Can’t we simply detect operation by the first argument passed to for? Like, if = is present in expression at the beginning, then treat it as accumulator for reduce operation.

Other than that, I like for_reduce because output has different type.

stefanchrobot · December 22, 2021, 3:36pm

Coming from an imperative background, I had restraints towards for and always preferred Enum. But this part from the original proposal (I suggest looking at the problem stated there) changed my mind:

Comprehensions

Comprehensions in Elixir have always been a syntax sugar to more complex data-structure traversals. Do you want to have the Cartesian product between all points in x and y? You could write this:

Enum.flat_map(x, fn i ->
  Enum.map(y, fn j -> {i, j} end)
end)

Or with a comprehension:

for i <- x, j <- y, do: {i, j}

Or maybe you want to brute force your way into finding Pythagorean Triples?

Enum.flat_map(1..20, fn a ->
  Enum.flat_map(1..20, fn b ->
    1..20
    |> Enum.filter(fn c -> a*a + b*b == c*c end)
    |> Enum.map(fn c -> {a, b, c} end)
  end)
end)

Or with a comprehension:

for a <- 1..20,
    b <- 1..20,
    c <- 1..20,
    a*a + b*b == c*c,
    do: {a, b, c}

There is no question the comprehensions are more concise and clearer, once you understand their basic syntax elements (which are, at this point, common to many languages).

As mentioned in the introduction, we can express map, filter, reduce, and collect inside comprehensions. But how can we represent map_reduce in a clear and concise way?

I began to see where I can apply for to make the code cleaner. See here.

You don’t actually need for, but learning it - including the newest additions - will make your code better. I’m really looking forward to this and even more to a way to early-exit the comprehension so that I can get rid of all the not-so-pretty Enum.reduce_while calls.

stefanchrobot · December 22, 2021, 3:37pm

You can use = already in for and it means something else. I think we need the explicitness here.

fuelen · December 22, 2021, 3:38pm

> for a = 1, b <- [1, 2, 3], do: b
** (CompileError) iex:4: for comprehensions must start with a generator

stefanchrobot · December 22, 2021, 3:39pm

Regarding when - as noted by @benwilson512, it already has a specific meaning in Elixir - it tests a condition.

Doesn’t given imply a constant? The value of the variable changes on each iteration. I think let works better here.