Introducing `for let` and `for reduce`

What I don’t really like about let is that it’s very specific to assignment. (init is even worse in this regard IMO.) let var = 0 means cause var to become 0, but that’s only the surface of what this qualifier is doing in the proposal. Sure, it causes var to become 0; but then it rebinds var each loop based on the output of the previous one.

for (let x = 0... feels natural in JS, but that is because all the let keyword is doing semantically in that case is actual assignment. In our case, let is actually changing what happens as for generates values, by rebinding certain variables in the block’s scope before it is re-executed.

I guess calling this qualifier let, because it lets variables start off as something… before rebinding them each loop, feels a lot like if Elixir had decided to call = “the assignment operator” because it assigns to variables… if they were unbound, at the end of a very powerful pattern-match feature. let is doing much more here; it deserves a better name! :slightly_smiling_face:

I like bind because it means more literally “keep these things close together”, whereas let is a more dismissive, one-time thing. We are not just letting var = 0, we are binding var to 0 and binding the lifetime of var to the execution of the for loop. I guess? It is hard to talk about how words feel in programming…

That’s fair, shouldn’t bring it back into discussion!

4 Likes

I lost track of the message responding to my plug for more of a flat_map_reduce than a flat_map, but since I’ve seen early halt come up again: my point was that a flat_map_reduce would allow us to put the acc after the generators and filters (which I feel reads more like the rest of Elixir) while still having a mechanism to filter or emit more values within the do/end. For more complicated logic, I feel that would be clearer than trying to cram it all into generators and filters.

Taking the reduce-as-a-complex-find example:

  for reduce(value = nil),
      value == nil,
      section <- notebook.sections,
      %Cell.Elixir{} = cell <- section.cells,
      output <- cell.outputs do
    case output do
      {:js_static, %{assets: %{hash: ^hash} = assets_info}, _data} -> assets_info
      {:js_dynamic, %{assets: %{hash: ^hash} = assets_info}, _pid} -> assets_info
      _ -> nil
    end
  end

That would look more like this:

  for section <- notebook.sections,
      %Cell.Elixir{} = cell <- section.cells,
      output <- cell.outputs,
      acc: nil do
    acc ->
      case output do
        {:js_static, %{assets: %{hash: ^hash} = assets_info}, _data} -> {:halt, assets_info}
        {:js_dynamic, %{assets: %{hash: ^hash} = assets_info}, _pid} -> {:halt, assets_info}
        _ -> {[], acc}
      end
  end
  #=> {[], assets_info | nil}

To me this makes it much easier to see what’s going on.
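For what it’s worth, Enum.flat_map_reduce/3 already supports this early-halt, filter-or-emit shape today, so the semantics are well precedented. A toy sketch (plain numbers instead of the notebook structures above):

```elixir
# Emit doubled evens, but stop as soon as we see a number greater than 6.
# {:halt, acc} stops the traversal, {[], acc} filters, {[x], acc} emits.
{emitted, last_seen} =
  Enum.flat_map_reduce([1, 2, 3, 7, 8], nil, fn
    n, _acc when n > 6 -> {:halt, n}
    n, _acc when rem(n, 2) == 0 -> {[n * 2], n}
    n, _acc -> {[], n}
  end)

# emitted == [4], last_seen == 7
```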

I like these proposed syntaxes but agree that it is tricky to find a single, short word that conveys the meaning. I tend to prefer let and acc over the other suggestions, but I myself like for with(x = 0), i <- stuff do or something crazy like for |x = 0, y = 0|, i <- stuff do.

I’m not put off by with being used elsewhere but unsure if it’s possible to have both. Other synonyms could be carrying, using, keeping, holding, rebinding…

Since the beginning of this thread I’ve had a strong aversion to the use of let and I realise now it’s because in other functional languages let is used to introduce a binding which can’t be rebound. When I read let I see a value which will never change so I get some cognitive dissonance seeing it used in this proposal.

Regarding the example

for let(sources),
    sub <- subs,
    sub_formatter = Path.join(sub, ".formatter.exs"),
    File.exists?(sub_formatter) do
  formatter_opts = eval_file_with_keyword_list(sub_formatter)

  {formatter_opts_and_subs, sources} =
    eval_deps_and_subdirectories(:in_memory, [sub], formatter_opts, sources)

  {{sub, formatter_opts_and_subs}, sources}
end

As others have suggested, using map_reduce seems far clearer to me. When I see it, I can understand that we’re using a particular kind of for comprehension: it specifies how variables are bound, what they’re initialised to, what the folding function is, and what the return shape is (I have a hard time associating all of this with let).

for map_reduce(sources),
    sub <- subs,
    sub_formatter = Path.join(sub, ".formatter.exs"),
    File.exists?(sub_formatter) do
  formatter_opts = eval_file_with_keyword_list(sub_formatter)

  {formatter_opts_and_subs, sources} =
    eval_deps_and_subdirectories(:in_memory, [sub], formatter_opts, sources)

  {{sub, formatter_opts_and_subs}, sources}
end

It also opens the door for other functions

for group_by, x <- 1..10, y <- 1..10 do
  case {x > 5, y > 5} do
    {true, true} -> :top_right
    {true, false} -> :bottom_right
    {false, false} -> :bottom_left
    {false, true} -> :top_left
  end
end

One advantage of having the function at the start, versus the current keyword style, is that you don’t need an associated value for the syntax to make sense; i.e. there is nothing obvious to put in place of the ? in:

for x <- 1..10, y <- 1..10, group_by: ? do
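For comparison, the hypothetical group_by form could be written with today’s Enum.group_by/2; this is a sketch of one possible desugaring, not part of the proposal:

```elixir
# Build the pairs with a plain comprehension, then group them by quadrant.
pairs = for x <- 1..10, y <- 1..10, do: {x, y}

quadrants =
  Enum.group_by(pairs, fn {x, y} ->
    case {x > 5, y > 5} do
      {true, true} -> :top_right
      {true, false} -> :bottom_right
      {false, false} -> :bottom_left
      {false, true} -> :top_left
    end
  end)

# Each of the four quadrants ends up with 25 of the 100 pairs.
```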

I do still have some concerns about the rebinding being a bit mysterious but the examples are quite compelling so I can get behind just learning it as a new special form.

I think the proposal overall is fantastic and very well thought through!

let seems clear enough once you know what it does – I’m not a big fan of either init (too ambiguous) or let with parentheses (looks too much like an external function).

I wonder if we’re seeing the adoption of let as a common signal for scoped variable bindings in Elixir; after all, it was recently introduced in heex templates (<.form let={f} for={@changeset} phx-change="validate" phx-submit="save">).

I am largely happy with let but I think bind might be better for this reason. It communicates that you are binding and (re)binding the variable to the loop. +1 from me.

6 Likes

I’m sorry if I’m repeating someone else’s argument. This post already had 66 messages when I first saw it and I haven’t been able to read through the whole thing yet. The general sentiment I got from a quick look is that most people are in support of the proposal and what’s being discussed at this point are syntactical and implementation details.

I wanted to share an observation about the original post that starts with a problem statement but then goes on to present a new syntax without making a strong case for how it solves the original problem.

José’s solution to the traversal problem in Elixir:

{sections, _acc} =
  Enum.map_reduce(sections, {1, 1}, fn section, {section_counter, lesson_counter} ->
    lesson_counter = if section["reset_lesson_position"], do: 1, else: lesson_counter

    {lessons, lesson_counter} =
      Enum.map_reduce(section["lessons"], lesson_counter, fn lesson, lesson_counter ->
        {Map.put(lesson, "position", lesson_counter), lesson_counter + 1}
      end)

    section =
      section
      |> Map.put("lessons", lessons)
      |> Map.put("position", section_counter)

    {section, {section_counter + 1, lesson_counter}}
  end)

The new solution that uses for let:

{sections, _acc} =
  for let {section_counter, lesson_counter} = {1, 1}, section <- sections do
    lesson_counter = if section["reset_lesson_position"], do: 1, else: lesson_counter
    
    {lessons, lesson_counter} =
      for let lesson_counter, lesson <- section["lessons"] do
        {Map.put(lesson, "position", lesson_counter), lesson_counter + 1}
      end
    
    section =
      section
      |> Map.put("lessons", lessons)
      |> Map.put("position", section_counter)

    {section, {section_counter + 1, lesson_counter}}
  end

They look mostly the same: both have the same shape and the same nested “map-reduce” traversals. So all of the original observations José made about the first solution (lack of reassignment, lack of mutability) hold for the second one as well.

Ultimately, the question posed in the original post, “Therefore, how can we move forward?”, remains unanswered. To me it looks like there was a motivating example to showcase how the new proposal could be applied in practice, but by the time we got to the end of the proposal, the motivating example had been forgotten and turned out to be irrelevant to the discussion.

I don’t really get the rationale behind the additional syntax being more beginner-friendly. On the contrary, it seems to bring more dilution to the notion of “idiomatic Elixir” because different people will prefer using different constructs to do the same thing. For example,

iex> sum = 0
iex> count = 0
iex> for let {sum, count}, i <- [1, 2, 3] do
...>   sum = sum + i
...>   count = count + 1
...>   {i * 2, {sum, count}}
...> end

can already be written as

Enum.map_reduce([1, 2, 3], {0, 0}, fn i, {sum, count} ->
  sum = sum + i
  count = count + 1
  {i*2, {sum, count}}
end)

and the differences between the two are superficial.
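Concretely, both versions produce the same value; evaluating the Enum one:

```elixir
# Same inputs as the for let example: double each element while
# accumulating the running sum and count.
{doubled, {sum, count}} =
  Enum.map_reduce([1, 2, 3], {0, 0}, fn i, {sum, count} ->
    {i * 2, {sum + i, count + 1}}
  end)

# doubled == [2, 4, 6], sum == 6, count == 3
```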

So far, the interesting bits about the proposal are the possibilities José mentions towards the end of the original post, the ones that aren’t going to be supported initially. But the tradeoff of starting down the path towards those possibilities is making it more difficult to decide which way of writing any given piece of code is the preferred one, given that the number of alternatives keeps increasing.

8 Likes

Not meaning to derail the discussion but I just have a feeling that for's self-containedness is what’s motivating the addition of new syntax. What if we could “break out” of its “do-end” box and start using it as part of idiomatic Elixir pipelines? This could be achieved by adding a way to get a stream out of for, say,

{combinations, {sum, count}} =
  for(i <- [1,2,3], j <- [:a,:b,:c], as: :stream, do: {i, j})
  |> Enum.map_reduce({0, 0}, fn {i, j}, {sum, count} ->
    sum = sum + i
    count = count + 1
    {{i, j}, {sum, count}}
  end)
2 Likes

Hi,

How about introducing something similar to the LOOP macro in Lisp? With properties like :collecting, :maximize, etc., it would give a complete set of options generic enough for most cases. Whether it acts as a reduce or a map would depend on the annotations made by the user. I’m curious how difficult it would be to write a LOOP macro in Elixir. In fact, I have always seen list comprehensions as a smaller brother of Lisp’s LOOP macro. But maybe that was only me…

P.

Yes, they look quite similar, but that’s partially the point. If we introduce something too different, then it is ultimately going to be rejected because it is not similar to anything in Elixir.

At the same time, if it is too similar, then people may say “well, it doesn’t add much”. And overall that’s a hard line to balance.

The motivating example brings two concerns:

  1. The current solution has too much noise

  2. The current solution requires knowing precisely the magic incantation to solve the problem (the word map_reduce)

If you go through the proposed guide and try to solve it with Enum, you are going to see that each step requires knowing a particular function. First you map, then you flat map, then flat map+map, then you filter, then you map reduce, etc. We have even discussed examples that have to use flat+map+reduce.

The whole point of for is that with the addition of two constructs (let and reduce), we can express almost everything in the Enum module. From any?, to find_value, to flat_map_reduce, to map_reduce, and to reduce, but without imposing all of this naming upfront. You have a single construct with one or two variations on top, and that’s it. Oh, and it also slightly reduces the amount of noise too!
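For reference, the reduce half of this already exists as for’s :reduce option (available since Elixir 1.8); the let/map-reduce half is the new part. A minimal example of today’s form:

```elixir
# The do-block of a :reduce comprehension takes the accumulator
# as a clause head and returns the new accumulator.
sum =
  for i <- [1, 2, 3, 4], rem(i, 2) == 1, reduce: 0 do
    acc -> acc + i
  end

# sum == 4  (1 + 3; the filter drops the even numbers)
```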

This argument does not hold by itself because of this:

  1. Well, we already have recursion, so why have the functions in Enum?

  2. Well, we already have Enum, so why have for comprehensions in the first place?

Let’s say we go back in time and use this argument to not add for to the language. After all, for is a different construct for doing the same thing as Enum. Here is what would happen. We would have to rewrite this:

    for {encoding1, value1} <- alphabet,
        {encoding2, value2} <- alphabet do
      encoding = bsl(encoding1, 8) + encoding2
      value = bsl(value1, shift) + value2
      [clause] = quote(do: (unquote(value) -> unquote(encoding)))
      clause
    end

to this:

    Enum.flat_map(alphabet, fn {encoding1, value1} ->
      Enum.map(alphabet, fn {encoding2, value2} ->
        encoding = bsl(encoding1, 8) + encoding2
        value = bsl(value1, shift) + value2
        [clause] = quote(do: (unquote(value) -> unquote(encoding)))
        clause
      end)
    end)

From this:

      for %{pid: pid} <- files,
          {_, _, ref, ^pid, on, _, _} <- waiting,
          not defining?(on, waiting),
          do: {ref, :not_found}

to this:

      Enum.flat_map(files, fn %{pid: pid} ->
        waiting
        |> Enum.filter(fn {_, _, ref, waiting_pid, on, _, _} -> waiting_pid == pid end)
        |> Enum.filter(fn {_, _, _, _, on, _, _} -> not defining?(on, waiting) end)
        |> Enum.map(fn {_, _, ref, _, _, _, _} -> {ref, :not_found} end)
      end)

From this:

      for {pair, _, meta, _} <- all_defined,
          {local, line, macro_dispatch?} <- out_neighbours(bag, {:local, pair}),
          error = undefined_local_error(set, local, macro_dispatch?),
          do: {build_meta(line, meta), local, error}

to this:

      Enum.flat_map(all_defined, fn {pair, _, meta, _} ->
        Enum.map(out_neighbours(bag, {:local, pair}), fn {local, line, macro_dispatch?} ->
          error = undefined_local_error(set, local, macro_dispatch?)
          {build_meta(line, meta), local, error}
        end)
        |> Enum.filter(fn {meta, local, error} -> error != nil end)
      end)

From this:

          for path <- Mix.Dep.load_paths(dep),
              beam <- Path.wildcard(Path.join(path, "*.beam")),
              Mix.Utils.last_modified(beam) > modified,
              reduce: {modules, exports, new_exports} do
            {modules, exports, new_exports} ->
              ...
              {modules, exports, new_exports}
          end

to this:

          Enum.reduce(Mix.Dep.load_paths(dep), {modules, exports, new_exports}, fn path, acc ->
            Path.wildcard(Path.join(path, "*.beam"))
            |> Enum.filter(&Mix.Utils.last_modified(&1) > modified)
            |> Enum.reduce(acc, fn beam, {modules, exports, new_exports} ->
              ...
              {modules, exports, new_exports}
            end)
          end)

I could go on and on. If we didn’t have for, all of those snippets would be worse, noisier, and also slower.

And I didn’t have to look hard either. Those examples all came from lib/elixir/lib, with the exception of the last one, where I grepped for reduce:. I am sure you will find several others there, all in a single repository.

So my point is: yes, they ultimately achieve the same thing, but for comprehensions have always enabled us to write complex traversals far more elegantly than the Enum module. If someone asked me what is idiomatic Elixir, I would take for over Enum whenever that’s possible, the same way I take Enum over manual recursion. This proposal allows us to use for in places where we can’t right now, and that’s bound to bring the same benefits as we have seen in the snippets above. I have already posted examples of how the proposal can improve existing code.

12 Likes

Just wanted to throw my two cents into the discussion. At first I didn’t like the proposed changes, but the more I’ve read this thread and let it sit with me, the more it’s grown on me. I would like to express my support for two proposals others have made: I prefer bind over let, and I also think it should require parentheses. So to me, this is the most readable of all the proposed options:

for bind({x, y} = {0, 0}), i <- [1, 2, 3] do
  {i, {x + i, y + i + 1}}
end
4 Likes

Along with the required parentheses, an option to cover both map_reduce and reduce without a separate word could be to add something like :returning. It could default to :map_reduce, but be set to :reduce as needed.

Map Reduce

iex> for let(sum = 0), i <- [1, 2, 3] do
...>   sum = sum + i
...>   {i * 2, sum}
...> end
{[2, 4, 6], 6}

Reduce

iex> for let(sum = 0, returning: :reduce), i <- [1, 2, 3] do
...>   sum + i
...> end
6
1 Like

Is there a reason why they are slower?

Aren’t they implemented with the same core instructions?

Just for fun I re-implemented some of the examples in two drastically different ways.

For me, for example, Elixir is a great fit because I usually think in terms of pipelines and it gives me the tools to express what I want, almost bit by bit.

The second reason is that in Elixir the tooling has been there from day one to solve almost any problem; you just have to dig a little bit.

And in every release new utility functions have been added to make things easier.

So, for example, this

  Enum.flat_map_reduce(subs, sources, fn sub, sources ->
    sub_formatter = Path.join(sub, ".formatter.exs")

    if File.exists?(sub_formatter) do
      formatter_opts = eval_file_with_keyword_list(sub_formatter)

      {formatter_opts_and_subs, sources} =
        eval_deps_and_subdirectories(:in_memory, [sub], formatter_opts, sources)

      {[{sub, formatter_opts_and_subs}], sources}
    else
      {[], sources}
    end
  end)

can be rewritten as

subs
|> Enum.filter(&File.exists?(Path.join(&1, ".formatter.exs")))
|> Enum.flat_map_reduce(sources, fn sub, sources -> 
  formatter_opts = eval_file_with_keyword_list(Path.join(sub, ".formatter.exs"))

  {formatter_opts_and_subs, sources} =
    eval_deps_and_subdirectories(:in_memory, [sub], formatter_opts, sources)

  {[{sub, formatter_opts_and_subs}], sources}
end)

or this

def find_asset_info(notebook, hash) do
  Enum.find_value(notebook.sections, fn section ->
    Enum.find_value(section.cells, fn cell ->
      is_struct(cell, Cell.Elixir) &&
        Enum.find_value(cell.outputs, fn
          {:js_static, %{assets: %{hash: ^hash} = assets_info}, _data} -> assets_info
          {:js_dynamic, %{assets: %{hash: ^hash} = assets_info}, _pid} -> assets_info
          _ -> nil
        end)
    end)
  end)
end

as

def find_asset_info(notebook, hash) do
  # all/0, filter/1, key/1, and elem/1 below come from the Access module
  import Access

  filter = [
    :sections, all(), :cells,
    filter(&is_struct(&1, Cell.Elixir)),
    key(:outputs), all(), elem(1), :assets
  ]

  get_in(notebook, filter)
  |> List.flatten()
  |> Enum.find_value(fn
    %{hash: ^hash} = assets_info -> assets_info
    _ -> nil
  end)
end

admittedly an extreme example of some lesser-known Elixir facilities, but still pretty easy to follow

Nope, no reason :p. In all seriousness, a lot of it boils down to this: the for special form (AKA fancy macro) is given all of the relevant generators, filters, and so forth up front, so at compile time it can emit compact code that accomplishes more with fewer traversals. Enum.* calls are plain functions and only see the collection and the callback at runtime. Stream can compose stages, but has similar issues in that it all happens at runtime.

I say “issues” but that’s perhaps too strong; it’s a trade off.

For people objecting to for and a possible expansion of its functionality, have you used other functional languages? It’s extremely common in functional languages to have at least these core data structure manipulation mechanisms: raw recursion, basic functions, and then some sort of fancy list comprehension that ties a number of features together in an ergonomic way. Haskell, Clojure, Erlang, Scala (quasi-functional), F# all have these same three.

That isn’t to say that Elixir’s version is or needs to be exactly the same as the rest, but I think some of the points here basically argue that for itself is superfluous, and I think that’s a strange stance to take within the language ecosystem.

9 Likes

Your examples show why they are usually slower. You are doing two traversals and computing the Path.join/2 twice.

Path.join/2 is cheap, so it is fine, but if it was expensive, then changing the code to compute it once would be an extra pass and more noise, or you would have to revert to the original version. With for, you don’t have to juggle this. :slight_smile:

3 Likes
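The extra-pass point can be made concrete with a toy example (an editorial sketch, not from the thread): each Enum stage walks and allocates a full list, while the comprehension fuses the filter into the generated loop at compile time.

```elixir
# Two passes and an intermediate list: filter builds [2, 4, ..., 100],
# then map walks that list again.
evens_squared =
  1..100
  |> Enum.filter(&(rem(&1, 2) == 0))
  |> Enum.map(&(&1 * &1))

# One fused pass: the filter compiles into the loop itself,
# so no intermediate list is built.
^evens_squared = for i <- 1..100, rem(i, 2) == 0, do: i * i
```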

Firstly, I have to say how well written the guide is. I was reading it in between doing other stuff yesterday and kept thinking to myself that if that’s what the rest of the guides are like (I haven’t actually read them for a few years now), then anyone new coming to the language is in for a treat!!

With regards to the proposal, if this is still on the cards it’s my fave as well:

Because if…

…then it just makes sense and is more immediately obvious of what it is or might be (to me at least). However Elixir is my first functional language so I appreciate I may not be seeing things from the same perspective as others.

I also like set as mentioned by @mgwidmann.

Personally my favourite would be José’s version but the init at the end:

for x <- [1, 2, 3], init(sum = 0) do
  {x * 2, sum + x}
end

Because the primary focus of the for is the first part, the initialised variables are just something to be used with it.

Having said that:

On this I would personally support whatever José and the ECT think is best, because I am sure they will have a million things in their heads that I almost certainly won’t (including further possible extensions to the language). As I said, though, I don’t have the same kind of functional-language background as many of you, but I thought it was worth adding my thoughts, as my perspective might be representative of others in similar shoes.

8 Likes

I am new to Elixir and I followed the discussion of this proposal.

This version is also easy for me to read and understand:

for x <- [1, 2, 3], init(sum = 0) do
  {x * 2, sum + x}
end

2 Likes

I’ve tried to read the whole thread, but it was spread across a few days…

I prefer the term bind over let over init, as “binding” is already an Elixir concept, but let now has precedent in heex templates, so I can see it becoming “more normal”. Indeed, this proposal could just “make it normal”.

After playing around, I think the initial proposal is my favourite. No parens and we’re “just pattern matching”.

What would occur when I do:

{doubled, sum} =
  for let {sum, _} = {0, 10}, i <- [1, 2, 3] do # _ behaves "as normal" and _ is discarded?
    sum = sum + i
    {i * 2, sum}
  end

Honestly, let (or bind :slightly_smiling_face:) is fine. Both hide the eventual behaviour, but so does init, and accumulator is pretty wordy, if more precise. Maybe carry, as in “carry this value between calls”.

Perhaps I would ask, is let transportable as concept to other forms? Is bind? Is init? Can a user familiar with for reckon against what let/bind/init means in other contexts? If we added let to with or try, can it follow the same conceptual train?

Why was let chosen in heex? Does that solidify the semantics of the term, in Elixir?

# I know heex *is not* elixir, the language, but if we are intending to reduce the 
# burden on learning.
<.form let={f}>
  # "'let' means the value of f inside this 'block' comes from somewhere else",
  # pretty similar to for let x = 0 really.
</.form>
<.form bind={f}>
  # "I know form() 'returns' something and I want to bind that to f in this block"
  # honestly let feels more intuitive here.
</.form>
<.form init={f}>
  # "initialise .. ah, f with what comes from the form? or am I initialising the form with f?
  # (obviously i am biased :))
</.form>
try let {rescue, after} = {my_rescue_fn, my_after_fn} do
...
end
try init(rescue = my_rescue, after = my_after_fn) do # weirdddd
...
end

I find the init(sum = 0) structure odd, with its required parens. Nothing else in Elixir is really written like that, with a binding inside what looks like a function call, at least not that I’ve seen.

I don’t think there was a discussion on multiple inputs to init. Am I calling

for init({a, b} = {0, 10}), x <- 1..3 do
  {x, {a + x, b - x}}
end

or

# I assume this one
for init(a = 0, b = 10), x <- 1..3 do
  {x, {a + x, b - x}}
end

for init({a, b} = {0, 10}), ... reads as “do this pattern match and … call a function with … the result of the match? Is that true if they match or the values or …?”, init(a = 0, b = 10) is even worse IMO since it looks a lot like default arguments in other languages, used in the wrong context.

a = 99
for init(a = 0, b = 10), x <- 1..3 do
  # is a 99 or 0? A new user may wonder can I call init(a, b \\ 100)?
  {x, {a + x, b - x}}
end

“Why can’t I call var = init(y = 0)? What does that even mean? Why doesn’t DateTime.new(time = t, date = d) work? Can I provide my own init function?”

I get that you can explain why this function call isn’t a function call, but it feels like setting your own mouse traps in the dark.

Without the parens, it’s a bit less confusing, because init takes on the shape of a language keyword, even if it’s not (forget that you can call most special forms with parens… most users won’t know or will ignore this).

Anyway…

https://thumbs.gfycat.com/LastBrokenHorsefly-size_restricted.gif

5 Likes

I think I have also always been a bit uncomfortable with how for implicitly skips over stuff that doesn’t match… mentally I guess I’m expecting a MatchError. I have never been comfortable with how the subsequent predicates are implicitly filters without saying so, nor with how for implicitly goes into “reduce” mode with the forward arrows.

If I had a magic wand mapreduce would look like this:

for {i <- 1..20, bind: sum, init: 0}, filter: rem(i, 2) == 0 do
  {i, sum + i}
end #==> {[2, 4, 6...], 110}

And for reduce it would be:

for reduce: i <- 1..20, bind: sum, init: 0, filter: rem(i, 2) == 0 do
  sum + i
end # ==> 110

And the plain old for would have to be explicit about filter predicates:

for i <- 1..20, j <- 1..20, filter: i < j, filter: rem(i, 2) == 0 do
  {i, j}
end
1 Like

No one praised the error message so far? I think that’s the thing that makes it work :wink:

As for voting,

for init {sum, count}, ...

looks good. for reduce, not so much. returning: :reduce is good enough, though.