Introducing `for let` and `for reduce`

I am resubmitting the proposal from earlier today with more context and more focus on the important parts. Some concerns and praises stayed in the old thread but feel free to drop them again and I will gladly follow-up.

Hi everyone,

This is a proposal for introducing for let and for reduce comprehensions. This proposal is divided in three parts:

  1. Problem statement

  2. A tour into for

  3. Conclusion

Problem statement

I have been on the record a couple times saying that, while some problems are more cleanly solved with recursion, there is a category of problems that are much more elegant with imperative loops. One of those problems have been described in the “nested-data-structures-traversal” repository, with solutions available in many different languages. Please read the problem statement in said repository, as I will assume from now on that you are familiar with it.

Personally speaking, the most concise and clear solution is the Python one, which I reproduce here:

section_counter = 1
lesson_counter = 1

for section in sections:
    if section["reset_lesson_position"]:
        lesson_counter = 1

    section["position"] = section_counter
    section_counter += 1

    for lesson in section["lessons"]:
        lesson["position"] = lesson_counter
        lesson_counter += 1

There are many things that make this solution clear:

  • Reassignment
  • Mutability
  • Sensitive whitespace

Let’s compare it with the Elixir solution I wrote and personally prefer. I am pasting an image below which highlights certain aspects:

Screenshot 2021-12-13 at 10 02 48

  • Lack of reassignment: in Elixir, we can’t reassign variables, we can only rebind them. The difference is, when you do var = some_value inside a if, for, etc, the value won’t “leak” to the outer scope. This implies two things in the snippet above:

    1. We need to use Enum.map_reduce/3 and pass the state in and out (highlighted in red)
    2. When resetting the lesson counter, we need both sides of the conditional (hihhlighted in yellow)
  • Lack of mutability: even though we set the lesson counter inside the inner map_reduce, we still need to update the lesson inside the session (highlighted in green)

  • Lack of sensitive whitespace: we have two additional lines with end in them (highlighted in blue)

As you can see, do-end blocks add very litte noise to the final solution compared to sensitive whitespace. In fact, the only reason while I brought it up is so we can confidentaly discard it from the discussion from now on. And also because there is zero chance of the language suddenly becoming whitespace sensitive. :slight_smile:

There is also zero chance of us introducing reassignment and making mutability first class in Elixir too. The reason for this is because we all agree that, the majority of the times, lack of reassignment and lack of mutability are features that make our code more readabily and understandable in the long term. The snippet above is one of the few examples where we are in the wrong end of the trade-offs.

Therefore, how can we move forward?

Comprehensions

Comprehensions in Elixir have always been a syntax sugar to more complex data-structure traversals. Do you want to have the cartesian product between all points in x and y? You could write this:

Enum.flat_map(x, fn i ->
  Enum.map(y, fn j -> {i, j} end)
end)

Or with a comprehension:

for i <- x, j <- y, do: {i, j}

Or maybe you want to brute force your way into finding Pythagorean Triples?

Enum.flat_map(1..20, fn a ->
  Enum.flat_map(1..20, fn b ->
    1..20
    |> Enum.filter(fn c -> a*a + b*b == c*c end)
    |> Enum.map(fn c -> {a, b, c} end)
  end)
end)

Or with a comprehension:

for a <- 1..20,
    b <- 1..20,
    c <- 1..20,
    a*a + b*b == c*c,
    do: {a, b, c}

There is no question the comprehensions are more concise and clearer, once you understand their basic syntax elements (which are, at this point, common to many languages).

However, a common point that arose during the discussion of how to solve this problem using comprehensions is that comprehensions are currently under utilized in Elixir. Therefore, I would like to introduce for let and for reduce in the form of a “Getting Started” guide that could be hosted on the Elixir website.

The goal is to show how for can be both a power-user tool and still useful to solve several problems in a format that developers may be familiar with, while still building an intuition on functional ideas.


A tour into for

While Elixir does not have loops as found in traditional languages, it does have a powerful for construct, typical to many programming languages, where we can generate, filter, transform, and accumulate collections. In Elixir, we call it for-comprehension.

In this chapter, we will learn how to fully leverage the power behind for-comprehensions to perform many tasks similar to imperative languages, but in a functional manner.

If you want a fun challenge, try to rewrite all of the for uses below using the Enum module. You can consider doing so in two variants: using a single Enum function and using a pipeline of Enum functions.

Generators

Let’s start with a simple problem. You have a list of numbers and you want to multiply each element in the list by two. We can do this:

iex> for i <- [1, 2, 3] do
...>   i * 2
...> end
[2, 4, 6]

The part i <- [1, 2, 3] is a generator. It gets each value in the list [1, 2, 3] and binds them to the variable i one at a time. Once i is bound, it executes the contents of the do-end block. The new list is formed by the results of the do-end block.

A comprehension can have multiple generators too. One use of multiple generators is to find all possible combinations between two lists. Imagine for example you are interested in a new car. You have identifier three colors that you like: green, blue, and yellow. You are also divided between three brands: Ford, Volkswagen, and Toyota. What are all combinations available?

Let’s first define variables:

iex> colors = [:green, :blue, :yellow]
iex> cars = [:ford, :volkswagen, :toyota]

Now let’s find the combinations:

iex> for color <- colors, car <- cars do
...>   "#{color} #{car}"
...> end
["green ford", "green volkswagen", "green toyota", "blue ford",
 "blue volkswagen", "blue toyota", "yellow ford", "yellow volkswagen",
 "yellow toyota"]

By having two generators, we were able to combine all options into strings.

Multiple generators are also useful to extract all possible values that are nested within other colors. Imagine that you have a list of users and their favorite programming languages:

iex> users = [
...>   %{
...>     name: "John",
...>     languages: ["JavaScript", "Elixir"]
...>   },
...>   %{
...>     name: "Mary",
...>     languages: ["Erlang", "Haskell", "Elixir"]
...>   }
...> ]

If we want to get all languages from all users, we could use two generators. One to traverse all users and another to traverse all languages:

iex> for user <- users, language <- user.languages do
...>   language
...> end
["JavaScript", "Elixir", "Erlang", "Haskell", "Elixir"]

The comprehension worked as if it retrieved the languages lists of all users and flattened it into a list, with no nesting.

The important concept about for-comprehensions so far is that we never use them to mutate values. Instead, we explicitly use them to explicitly map inputs to outputs: the lists that we want to traverse are given as inputs and for returns a new list as output, based on the values returned by the do-end block.

The :uniq option

In the example above, you may be wondering: what if we want all languages from all users but with no duplicates? You are in lucky, comprehensions also accept options, one of them being :uniq:

iex> for user <- users, language <- user.languages, uniq: true do
...>   language
...> end
["JavaScript", "Elixir", "Erlang", "Haskell"]

Comprehension options are always given as the last argument of for, just before the do keyword.

Filters

So far we used comprehensions to map inputs to outputs, to generate combinations, or to flatten lists nested inside other lists. We can also use comprehensions to filter the input, keeping only the entries that match a certain condition. For example, imagine we have a list of positive and negative numbers, and we want to keep only the positive ones and then multiply them by two:

iex> for i <- [-5, -3, -2, 1, 2, 4, 8], i > 0 do
...>   i * 2
...> end
[2, 4, 8, 16]

Filters are given as part of the comprehension arguments. If the filter returns a truthy value (anything except false and nil), the comprehension continues. Otherwise it skips to the next value.

You can give as many filters as you want, including mixed with other generators. Let’s go back to our users example and add some arbitrary rules. Imagine that we only want to consider programming languages from users that have the letter “a” in their name:

iex> for user <- users, String.contains?(user.name, "a"), language <- user.languages do
...>   language
...> end
["Erlang", "Haskell", "Elixir"]

As you can see, due to the filter, we skipped John’s languages.

What if we want only the programming languages that start with the letter “E”?

iex> for user <- users, language <- user.languages, String.starts_with?(language, "E") do
...>   language
...> end
["Elixir", "Erlang", "Elixir"]

Now we got languages from both, including the duplicates, but returned only the ones starting with “E”. You can still use the :uniq option, give it a try!

Computing additional values with for let value = initial

So far, our comprehensions have always returned a single output. However, sometimes we want to traverse a collection and get multiple properties out of it too.

Let’s go back to our initial example. Imagine that you want to traverse a list of numbers, multiple each element in it by two while returning the sum of the original list at the same time.

In most non-functional programming languages, you might achieve this task like this:

sum = 0
list = []

for(element of [1, 2, 3]) {
  list.append(element * 2)
  sum += element
}

list /* [2, 4, 6] */
sum /* 6 */

This is quite different from how we have been doing things so far. In the example above, the for loop is changing the values of list and sum directly, which is then reflected in those variables once the loop is over.

However, we have already learned that comprehensions in Elixir explicitly receive all inputs and return all outputs. Therefore, the way to tackle this in Elixir is by explicitly declaring all additional variables we want to be looped and returned by the comprehension, using the for let:

iex> for let sum = 0, i <- [1, 2, 3] do
...>   sum = sum + i
...>   {i * 2, sum}
...> end
{[2, 4, 6], 6}

Let’s break it down.

Instead of starting with a generator, our comprehension starts with a let variable = initial expression. let introduces a new variable sum, exclusive to the comprehension, and it starts with an initial value of 0. The same way that i changes on every element of the list, sum will have a new value on each iteration too.

Now that we have an additional variable as input to the comprehension, it must also be returned as output. Therefore, the comprehension do-end block must return two elements: the new element of the list, as previously, and the new value for sum. Those elements are returned in a tuple. Once completed, the comprehension also returns a two-element tuple, with the new list and the final sum as elements. In other words, the shape returned by for matches the return of the do-end block.

If you add IO.inspect/1 at the top of the do-end block, you can see the values of i and sum as the comprehension traverses the collection:

iex> for let sum = 0, i <- [1, 2, 3] do
...>   IO.inspect({i, sum})
...>   sum = sum + i
...>   {i * 2, sum}
...> end

And you will see this before the result:

{1, 0}
{2, 1}
{3, 3}

As you can see, both i and sum change throughout the comprehension.

Given the comprehension now returns a tuple, you can pattern match on it too. In fact, that’s most likely the pattern you will see in actual code, like this:

{doubled, sum} =
  for let sum = 0, i <- [1, 2, 3] do
    sum = sum + i
    {i * 2, sum}
  end

And then you can further transform the doubled list and the sum variable as necessary.

for let augments for to allow us to accumulate additional values. Albeit a bit more verbose than other languages, it is explicit: we can immediately look at it and see the inputs and outputs.

Accumulating multiple values

Sometimes you may need to accumulate multiple properties from a collection. Imagine we want to multiply each element in the list by two, while also getting its sum and count. To do so, we could give a tuple of variables to let:

iex> for let {sum, count} = {0, 0}, i <- [1, 2, 3] do
...>   sum = sum + i
...>   count = count + 1
...>   {i * 2, {sum, count}}
...> end
{[2, 4, 6], {6, 3}}

Once again, the shape we declare in let (a two-element tuple) matches the shape we return from the do-block and of the result returned by for.

You could move the initialization of the let variables to before the comprehension:

iex> sum = 0
iex> count = 0
iex> for let {sum, count}, i <- [1, 2, 3] do
...>   sum = sum + i
...>   count = count + 1
...>   {i * 2, {sum, count}}
...> end
{[2, 4, 6], {6, 3}}

let can be a variable or a tuple of variables. If the variables are not initialized, it is expected for such variable to already exist, as in the example above.

Reducing a collection

We have learned how to use let to traverse a collection and accumulate different properties from it at the same time. However, what happens when we are only interested in the properties and not in returning a new collection? In other words, how can we get only the sum and count out of a list, skipping the multiplication of each element by 2?

One option is to use let and simply discard the list result:

{_doubled, {sum, count}} =
  for let {sum, count} = {0, 0}, i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {i, {sum, count}}
  end

However, it seems wasteful to compute a new list, only to discard it! In such cases, you can convert the :let into a :reduce:

{sum, count} =
  for reduce {sum, count} = {0, 0}, i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {sum, count}
  end

By using reduce, we now only need to return the reduce shape from the do-end block, which once again is reflected in the result of the comprehension.

In other words, for reduce is a special case of for let, where we are not interested in returning a new collection. It is called reduce precisely because we are reducing a collection into a set of accumulated values. Given that, you could consider let to be a “map and reduce”, as it maps inputs to outputs and reduces the collection into a set of accumulated values at the same time.

Summing up the guide

In this chapter we have learned the power behind Elixir’s for-comprehensions and how it uses a functional approach, where we list our inputs and outputs, to mimic the power of imperative loops.

While we have used for-comprehensions to perform multiple tasks, such as computing the sum and count, in practice most developers would use the Enum module to perform such trivial tasks. The Enum module contains a series of recipes for the most common (and some also uncommon) operations. For example:

iex> Enum.map([1, 2, 3], fn i -> i * 2 end) 
[2, 4, 6]
iex> Enum.sum([1, 2, 3])
6
iex> Enum.count([1, 2, 3])
3

Still, for-comprehensions can be useful for handling more complex scenarios.

Note we didn’t explore the full power of comprehensions either. We will discuss the additional features behind comprehensions whenever relevant in future chapters.


Conclusion

My hope is the guide above shows how for can be both a power user tool but also useful in introducing a series of new idioms, unified by a single construct, without imposing all of the functional terminology (such as flatten, map, filter, map_reduce, etc) upfront. Those words are mentioned, but their introduction is casual, rather than the starting point.

With that said, we can now go back to the original problem and see how an Elixir solution could look like:

{sections, _acc} =
  for let {section_counter, lesson_counter} = {1, 1}, section <- sections do
    lesson_counter = if section["reset_lesson_position"], do: 1, else: lesson_counter
    
    {lessons, lesson_counter} =
      for let lesson_counter, lesson <- section["lessons"] do
        {Map.put(lesson, "position", lesson_counter), lesson_counter + 1}
      end
    
    section =
      section
      |> Map.put("lessons", lessons)
      |> Map.put("position", section_counter)

    {section, {section_counter + 1, lesson_counter}}
  end

Personally speaking, it considerably reduces the amount of noise, while still keeping clear what is the state/accumulator of each comprehension. If for let and for reduce are added, the existing :reduce option will be deprecated.

There are still some implementation details that could be further explored, described in the sections below:

Syntax considerations

No new syntax is necessary to support the constructs above. Similar syntax has already been in use in projects like StreamData.

Similarly, one of the reasons we have for let instead of let for is because the second would require making let a special form, which would definitely break existing code. Furthermore, let is very specific to for and it doesn’t apply to other constructs, so it makes more sense as a part of for than otherwise.

Error messages

By declaring the shape we want to return in let/reduce, we can provide really good error messages. For example, imagine the user makes this error:

iex> for let {sum, count} = {0, 0}, i <- [1, 2, 3] do
...>   sum = sum + i
...>   count = count + 1
...>   {i * 2, sum}
...> end

The error message could say:

** (ComprehensionError) expected do-end block to return {output, {sum, count}}, got: {2, 1}

Why let/reduce at the beginning?

One of the things we discovered as we explored this proposal is that, by declaring let and reduce at the beginning, it makes those constructs much more powerful. For example, we could implement a take version of a collection easily:

for let count = 0, count < 5, x <- element do
  {x, count + 1}
end

Or we could even have actual recursion:

for let acc = [:root], acc != [], x <- acc do
  # Compute some notes and return new nodes to traverse
end

While we won’t support these features in the initial implementation (a generator must immediately follow let and reduce), it shows much more can be achieved with the foundation outlined here.

Furthermore, the introduction of for let and for reduce opens up the possibility of new combinations in the future, such as for async that is built on top of Task.async_stream/3.

73 Likes

Generally for whole idea a big :+1: from me … However I’m a bit worried that constructs like for async let acc … would join Java's public static void main memes. :joy:

Can we have always something like -> with optional reduce word instead using of one of let/reduce words?

{_doubled, {sum, count}} =
  for {0, 0} -> {sum, count}, i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {i, {sum, count}}
  end

{sum, count} =
  for reduce {0, 0} -> {sum, count}, i <- [1, 2, 3] do
    sum = sum + i
    count = count + 1
    {sum, count}
  end

Here -> reminds case statement where {0, 0} looks like case's input pattern match and therefore {sum, count} looks like output.

Also if we would start to use let and rebind operator in for then many confused newbies would try to write an EcmaScript 6 loops. :smiling_imp:

for let i = 0, i < 5, i = i + 1 do
  i
end
3 Likes

Fear not because they are all independent. After all, serially looping a state across async computations would defeat the point of async. :slight_smile:

I would prefer to not use (or reuse) existing operators as I believe that will contribute negatively to readability of the solution.

2 Likes

One thing that popped up in the previous thread is the notion of “qualifiers”:

I don’t recall reading about the concept of ‘qualifiers’ in elixir, so I’m assuming this is a new and exclusive concept to this proposal that only applies to for .

Having read through the previous iterations of the proposal, I know where for let is coming from, but I think it would be surprising to see it without that context.

I would take this as an argument of bringing back required parens:

for let(sum = 0), i <- [1, 2, 3] do
  # ...
end

The fact that a similar concept is used in StreamData is not really convincing. My judgement is that it’s a pretty niche library and I haven’t seen that stuff elsewhere, so check all <stuff> makes me say “what is this thing?!” (I would assume that it’s probably some macro and probably some function is invoked). It is surprising and personally I don’t like it - I guess mostly because it’s unfamiliar. While this is not ambiguous, it kind of feels like it is, so I’d apply the same rule as elsewhere and require the parens.

We could say that for let is a totally different thing than for (why not for_let or forlet then?), but I don’t think this is true going forward where - I’m assuming - there will be a possibility to mix let and async in one for.

So I’d say I have a strong preference for required parens after reconsidering this.

EDIT: Saw the reply:

Fear not because they are all independent. After all, serially looping a state across async computations would defeat the point of async . :slight_smile:

Is this always going to be the case?

4 Likes

I think we should have a discussion between for let(...), for_let, and for let, but perhaps we should do this after the general proposal gets accepted, because there is already a lot to unpack here.

I am not saying it is common. I am saying it doesn’t require new syntax and I gave a reference. :slight_smile: (I agree with you that it isn’t common.)

Yes, there is no meaningful way to combine async with let. But then, we might not add async at all. The point being made is not that we will add all of those possible combinations, the point is that it gives us the possibility to have this conversation in the future if we want to (and that’s better than a design that would not allow us to explore future options).

2 Likes

Makes perfect sense. I’m all for (oh the pun) moving forward with the proposal.

Agreed - no new syntax is a big plus.

I think it’s fair to say that another way to put it is that it’s unfamiliar. Now, does it affect newcomers? Hopefully not so much, as most of the stuff is unfamiliar. And maybe people like me - who have seen some Elixir already - will just get used to it?

Make sense. I’d love to see something like for let ... while ... though.

2 Likes

How is it gonna be implemented ? Will let be a macro in kernel ? or the syntax parsing for for will be augmented with new capabilities ?

In both case I think that code will not be executable on older elixir versions. And if I am right, I don’t see the point of not bringing new syntax to elixir then. Optional parens are confusing. I’d really like optional parens allowed for macros only (though I guess that has to be handled at the parsing level, way before knowing that a call is on a macro or function.)

Qualifiers as it has been said (though I would like some pointers to the earlier discussions on that topic) seems good to me as first class. And it exists already actually, like in the bit syntax <<num :: integer-big-3>> or the range syntax.

1 Like

let can be something that exists only in for. No extra public macros or functions need to be defined.

No, you won’t be able to backport this to a previous Elixir version. But we already had this with the introduction of with.

1 Like

This is true with all additions to for in the past like into: or reduce:.

Tagentially, I think you’re confusing syntax with capabilities. The syntax is already valid:

iex(4)> quote do: for let counter = 1, x <- list, do: {x * 2, counter + 1}
{:for, [],
 [
   {:let, [],
    [
      {:=, [], [{:counter, [if_undefined: :apply], Elixir}, 1]},
      {:<-, [],
       [
         {:x, [if_undefined: :apply], Elixir},
         {:list, [if_undefined: :apply], Elixir}
       ]},
      [
        do: {{:*, [context: Elixir, import: Kernel],
          [{:x, [if_undefined: :apply], Elixir}, 2]},
         {:+, [context: Elixir, import: Kernel],
          [{:counter, [if_undefined: :apply], Elixir}, 1]}}
      ]
    ]}
 ]}

That is to say, for let is lexically and grammatically valid in Elixir today. It’s just that the for macro today doesn’t know what to do with it. Adding additional capabilities to the runtime is always allowed for new elixir versions. Strictly speaking, new syntax is too.

2 Likes

Yest but you both missed my point. If that new code cannot be ran by previous elixir versions then what is the strong argument about bringing new syntax, if it is simpler ?

But anyway, syntax doesn’t matter. @josevalim I think this is a great addition to the language.

What if we use for reduce with also the :reduce option ? Would that be a compilation error ?

Edit: Sorry I meant “what is the strong argument against bringing new syntax”.

Compilation error and :reduce will be deprecated at some point. I think adding no new syntax is a positive.

1 Like

I’m a big fan of comprehensions, and I love the idea of making them even more useful.

My first impression is that for let and for reduce don’t read like most of the language. Even though the :reduce option is a little strange, matching on the acc reminds me of a case or fn.

Edit: to me it’s not clear that the variables bound in for let and for reduce are repeatedly bound to. I’m not used to seeing that elsewhere in the language (except for the left side of a generator; maybe I’m only okay with that because I’m used to it, but I think the <- operator helps).

5 Likes

But it is like a case !

for i <- 1..10, reduce: {:even, 0} do
  {:even, n} -> {:odd, n + i}
  {:odd, n} -> {:even, (n + 1) * i}
end

Edit: to me it’s not clear that the variables bound in for let and for reduce are repeatedly bound to. I’m not used to seeing that elsewhere in the language (except for the left side of a generator; maybe I’m only okay with that because I’m used to it, but I think the <- operator helps).

I think the same. Something like for {sum, count} <- let {0, 0}, i <- my_list, do: ... could be cool too.

Edit: I like it actually because in that case let would be just a thing that implements the ForLoop protocol :wink: and not specific :smiley:

That was my point: I feel like the :reduce option feels more like the rest of the language than the proposed for let and for reduce.

8 Likes

The proposal brings no new syntax. With current Elixir you can already write your own version of for:

myfor let sum = 0, i <- [1, 2, 3] do
  :bar
end

because it’s equivalent to this:

myfor(let(sum = 0, i <- [1, 2, 3], do: :bar))

See @benwilson512’s answer.

The more I think about it, the more of a curve ball I think for let is, specially for newcomers.

If I read this elixir line:

for let {count, sum} = {0, 0}, i <- [1, 2, 3] do

In plain English I read:

for let count & sum equal to 0, item in elements do

Which is not very descriptive on it’s own, and a little confusing considering the meaning in other languages. Certainly I would never expect it to be basically Enum.map_reduce.

I don’t know the intricacies of Elixir lang, but I would try to make it understandable on a english language level (that’s why the Python version is so understandable in my opinion).

Uninformed & probably silly proposal:

for map_reduce i <- [1, 2, 3], acc: {count, sum} = {0, 0} do
7 Likes

I do, though, like the flat_map-style interface. I wonder if an option to do more of a flat_map_reduce-style interface would cover most of these cases. You could conditionally emit zero, one, or many output values for every generator value or even halt iteration. I don’t think it’s that big a deal to throw out some part of an acc (or the collection paired with the acc, in this case).

I’m picturing something like this:

{results, acc} =
  for value <- my_enumerable(), acc: @initial_acc do
    acc ->
      case do_something_with(value, acc) do
        {:one, result} ->
          {[result], update_acc(result, acc)}

        {:many, results} ->
          {results, Enum.reduce(results, acc, &update_acc/2)}

        :ignore ->
          {[], acc}

        :error ->
          {:halt, acc}
      end
  end

I agree that this makes the comprehension “read” oddly in English. What about this kind of qualifier syntax:

for i <- [1,2,3,4],
let {count, sum} = {0, 0} do 
    sum = sum + i
    count = count + 1
    {i, {sum, count}}
end

Now in English it’s more obvious what this parses out to: “For every item in the list let variables count and sum be zero and do the following functions using the item, count, and sum.”
Actually, as an implementation detail this might be the simplest as behind the scenes the let keyword just wraps the initial assignment in an enumerable:
let {count, sum} = {0,0} being equivalent to {count, sum} <- [{0,0}]

8 Likes

To me the let word carries the imperative / mutable connotation still. Has it been considered that using e.g. acc and not let might be steering the reader into the right way of thinking about this new for syntax?

I like this, a lot, and here let actually makes more sense (to me at least).

7 Likes