Introducing `for let` and `for reduce`

Thanks for putting together a detailed & well-presented proposal :pray:t6:

I'm for the most part happy with it, but I'd like to share a couple of aspects that felt a bit 'off' at first encounter, and I'd love to hear what others think:

  1. The implicit asymmetry between the return value of a single iteration and the return value of the whole comprehension, as shown in the screenshot below. The fact that the first element of the tuple accumulates via mapping while the second accumulates via reduction is not explicit, and it tripped me up.

  2. The case for `for reduce` doesn't seem very compelling. AFAIK the same functionality can be accomplished with the current `:reduce` option, and it's not clear to me what advantage the new syntax brings in this case.

  3. Initialization before the comprehension: as others previously pointed out, I can see this leading to some confusion. For example: seeing `for let lesson_counter, lesson <- section["lessons"] do` somewhere in the codebase without `lesson_counter`'s initialization co-located, when it's a variable that will potentially be updated in each iteration.

  4. One of the great things about Elixir is the focus on explicitness, and I'm a bit concerned we would be giving up some ground here with some aspects of the proposed solution, e.g. the use of a tuple as the return value, with implicit reliance on position within the tuple for things like error messages. The examples used reductions over simple values like integers, but would `ComprehensionError`, for instance, still work with values like nested tuples that could end up looking like valid output?
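
For context on point 2, here is a minimal sketch of what the `:reduce` option already supports in Elixir today (summing a list, no new syntax involved):

```elixir
# The :reduce option available in current Elixir: each clause in the
# do-block receives the accumulator and returns the next accumulator.
sum =
  for i <- [1, 2, 3], reduce: 0 do
    acc -> acc + i
  end

IO.inspect(sum)
# => 6
```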

2 Likes

What if `for` "magically" scooped up re-bound `let` variables and returned them at the end, without needing to explicitly return them inside the `for`?

for let count = 0, sum = 0, i <- [1, 2, 3] do
  sum = sum + i
  count = count + 1
  i * 2
end

# returns {[2, 4, 6], %{sum: 6, count: 3}}

You can re-bind the `let` variables in each iteration, and at the end `for` scoops up the final values and returns them in a map.

To me, the original proposal, where you have to return a tuple, is destined to be confusing. Folks are used to whatever you return inside `for` getting put in a list. It's bizarre to me that you'd do that and it would gather the first part of the tuple into a list, but "discard" the second part until the final iteration.

My proposal above involves "magic" (implicit) behavior, but to me it seems less confusing than the tuple convention.

If returning a map is controversial, maybe we can return a tuple instead:

{[2, 4, 6], {6, 3}}

If we introduce `for let`, then I think we should have `for reduce` for consistency and deprecate the `:reduce` option. The proposal explains why having it at the beginning allows more possibilities, but other than that they are quite close.

The tuple contract is how it works with `map_reduce`, and that is what we are trying to mirror here. We could try to steer away from it, but then we venture further into unknown territory, which was more generally disliked in previous proposals. The comprehension error won't catch false positives, though, unless we have a type system.
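
For readers who have not used it, `Enum.map_reduce/3` is the contract in question: the per-element function returns `{mapped_value, new_acc}`, and the call as a whole returns `{mapped_list, final_acc}`:

```elixir
# Enum.map_reduce/3 maps and reduces in a single pass over the list:
# here we double each element while summing the originals.
{doubled, sum} =
  Enum.map_reduce([1, 2, 3], 0, fn i, acc -> {i * 2, acc + i} end)

IO.inspect({doubled, sum})
# => {[2, 4, 6], 6}
```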

You can re-bind the `let` variables in each iteration, and at the end `for` scoops up the final values and returns them in a map.

This was actually my first proposal but it was generally disliked. You can see the elixir-lang-core mailing list for more discussion on that. I would link but I am currently on my phone.

---

My thoughts on this topic have changed several times, but right now I am closer to `for_let` and `for_reduce` than `for let`. The reason is simple: different return types should have different functions.

10 Likes

This is great to hear, @josevalim! One of the things I was reluctant about in the proposal is that this "modal" behavior doesn't exist in other parts of the language, as many others mentioned, so it makes total sense to have different expectations about the function in this context.

However, now I understand (based on your previous comment about the special forms) that the ideal solution would be adding no new keywords. So, after reading @soup's comment, I'm wondering if something like `try let` would make sense in Elixir. Does the proposed concept of 'qualifier' work in other parts of the language?

Another thing: have you played with the idea of combining the behavior of both for-map and for-reduce in one place? Functions like `Enum.chunk_while` and `Enum.group_by`, where you control how the return values are handled, keep coming to mind when I think about the problem at hand.
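
As a concrete point of reference (from the existing API, not the proposal), `Enum.chunk_while/4` already lets the caller control how each step's value feeds into the result via `{:cont, ...}` control tuples:

```elixir
# Enum.chunk_while/4: the chunk function returns {:cont, chunk, acc}
# to emit a chunk, or {:cont, acc} to keep accumulating; the after
# function flushes whatever is left at the end.
chunks =
  Enum.chunk_while(
    1..6,
    [],
    fn i, acc ->
      if rem(i, 2) == 0 do
        {:cont, Enum.reverse([i | acc]), []}
      else
        {:cont, [i | acc]}
      end
    end,
    fn
      [] -> {:cont, []}
      acc -> {:cont, Enum.reverse(acc), []}
    end
  )

IO.inspect(chunks)
# => [[1, 2], [3, 4], [5, 6]]
```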

PS: I think the point of Phoenix already using `let` in HEEx kinda starts validating the proposal for me (out of practicality, mostly).

The tuple contract is how it works with `map_reduce`, and that is what we are trying to mirror here.

Great point :+1: Makes sense to me.

for sum := 0, count := 0, i <- [1, 2, 3] do
  sum = sum + i
  count = count + 1
  {sum, count}
end

This keeps the shape of the `for`:

Before:

  • Generators are <-
  • Filters are = or function calls, or basically anything that returns a bool value

After:

  • Initializers are :=
  • Generators are <-
  • Filters are = or function calls, or basically anything that returns a bool value

Upside:

  • ā€œShape of thingsā€ is preserved
  • Each separate ā€œthingā€ has itā€™s own operator (:=, <-)

Downside:

  • probably less readable
  • introduces a new operator that doesnā€™t exist in other places in the language

(Another downside, which many have already mentioned: this stops being an `Enum.map` and becomes either a map or a reduce depending on whether or not we have initial values.)

5 Likes

Having separate names does seem to allow for a lot of freedom. It makes it easy to document the different types of comprehensions and easy to google them when you come across them for the first time.

I do like `for` alone, this being a single unusual special form under which all this functionality groups. Maybe it's odd to have options that alter the return form, but breaking them up into `for_map` and `for_reduce` may not add a stitch of clarity to the real problem.

The initial binding is the sticky bit, because it's also controlling the result shape. How about something explicit: `for returns {:acc, {sum = 0, count = 0}}, ...` and `for returns {sum = 0, count = 0}, ...`

2 Likes

Another +1 vote for using `init` in some way (with or without parens).

For many people coming from other languages, `let` has so much baggage around mutability that it may be more of a hindrance than a help.

A couple of additional options that feel directionally Elixir-ish:

Combining `for` and `with`
While I generally disagree with including additional forms like `for_reduce` or `for_let`, there could be an opportunity to mix `for` and `with` on this occasion. My mental framework has `with` reading as something like "assuming (the given conditions succeed), do this...". Combining `for` and `with` (either syntactically via punctuation/blocks or literally as `for_with`) would read something like "assuming we're able to initialize these variables, then execute this comprehension".

Leveraging guard-like syntax
We already have a way to say "perform this secondary check/action when doing this logic block": the `when` syntax does this with guards. Why not have something similar for comprehensions? Could also combine the above `with` logic, or use an aforementioned option as well (`init` remains my personal favorite):

for i <- [1,2,3] with {sum, count} <- {0,0} do...
for i <- [1,2,3] init {sum, count} <- {0,0} do...
for i <- [1,2,3] let {sum, count} <- {0,0} do...
for i <- [1,2,3] first {sum, count} <- {0,0} do...

Yes, there is some overloading here, but it's not so foreign.

1 Like

This is very interesting! I like how it keeps everything under the `for` namespace of special forms, plus it makes the return value very clear (and even customizable: if you want the "reduce" part of `map_reduce` before the "map" part, you can do that). It also leaves open future extensions, such as `async`, or even filters on accumulated values before generators are specified.
The one difference I suggest is that `:map` (or `:values`?) be used where you used `:acc`, since `sum` and `count` are accumulation variables (the reduce part of `map_reduce`).

I think the idea is good, but it suffers from the same "problems" as `for let` and `for reduce`: syntax that isn't available outside of this scope. If specifying the 'qualifiers' at the end were feasible, I'd rather leave it like the existing `:into` and `:reduce` options. Also, the similarities with the guard syntax seem only superficial to me.

This is also interesting, but if that were the case, I'd prefer if we could retain compatibility with the `Enum.chunk_while/4` return structure or something similar (by specifying how the results are going to be handled). However, I don't think supporting distinct result types in `for` would be a good idea if we care about ergonomics, so I'd rather have another keyword in that case (different expectations for different functions).

I think the idea is good, but it suffers from the same "problems" as `for let` and `for reduce`: syntax that isn't available outside of this scope.

There's no reason why an `init` "guard" couldn't be used in other scopes. I could certainly see it being useful outside of comprehensions.

I'll refrain from responding to the "superficial" statement.

1 Like

Just to expand on this a little more: I personally don't think that the correlation would be easier for beginners, because guard syntax is already well-defined in Elixir via the `when` keyword. Also, it is not usual to use guards with functions that don't return a boolean value (if I'm not mistaken, but I might be wrong on this). So it seems that instead of "similar to guards", it's just 'qualifiers' in a different position.

Could you elaborate on other use cases you've thought would be useful (similar to the proposed usage, at least)? One of the concerns brought up in previous discussions was that 'qualifiers' were strange and specific to `for`, and not commonly seen in other parts of the language, like I said.

Guards are a qualifier in the position that I'm referencing. That's the analog.

Could you elaborate on other use cases you've thought would be useful (similar to the proposed usage, at least)?

One example: when piping into conditionals like `case`/`cond`, it could be nice to have additional data there with the conditional itself, rather than "init-ing" values above the pipe chain. It could possibly also be used in the function head as a more explicit default for recursive functions.

That's something I've thought would be useful to add to `for` comprehensions (I actually hacked on a macro for this recently). In my mind it makes comprehensions more composable: you can pass the comprehension streams into additional comprehensions or `Enum`/`Stream` functions without many traversals of your list. I see stream comprehensions as the `Ecto.Query.from/2` macro, if the `Stream` module is the rest of `Ecto.Query` (hopefully that makes sense).

I think in this discussion folks are more looking for a way to reduce without leaving `for`, but I like the idea of being able to mix and match.

1 Like

Agreed on this a lot! I particularly don't like how Ecto's `Repo.transaction` changes behaviour based on whether you're passing a function or a `Multi`. Unless there's something I'm missing, there should be a `Multi.transaction(multi)` instead of `Repo.transaction(multi)` (or maybe `Multi.run` to mirror `Stream.run`, except that `Multi.run` is already a thing).

I'd prefer `for let`, and even `forlet`, instead of `for_let`. While those are all macros or special forms, there are some constructs that are treated as "keywords" (`defmodule`, `for`, `if`, etc.). Somehow `for_let` doesn't fit in there.

But another approach to solving this would be to declare the shape of the returned value in the `for`'s "header". Also, thinking about this more, "reduce" is not friendly to people coming from imperative languages. I think talking about the "returned value" is way more familiar, so I'm also proposing to get rid of `reduce` altogether in favour of `return`.

So a map-reduce would be:

for return {sum = 0, count = 0, i <- [1, 2, 3]} do
  sum = sum + i
  count = count + 1
  {sum, count, i}
end

and reduce:

for return {sum = 0, count = 0}, i <- [1, 2, 3] do
  sum = sum + i
  count = count + 1
  {sum, count}
end

3 Likes

I have considered this route, but I can't come up with any reasonable syntax. Putting the generator as part of the return type feels incorrect; the generator is not really part of the result, and you can have multiple generators, which would not make sense either. We would need a way to refer to the output, but all of them would feel magical.

2 Likes

I think that the following would make sense, except that I'm not sure what should be done with pattern matching.

for i <- [1, 2, 3], return {sum = 0, count = 0, i} do
  sum = sum + i
  count = count + 1
  {sum, count, i}
end

Flipping the order would maybe solve that by "enforcing" a bind to a variable:

for return {sum = 0, count = 0, i}, <some pattern> = i <- [1, 2, 3] do
  sum = sum + i
  count = count + 1
  {sum, count, i}
end

Did you consider a `return`-like macro inside the block? I know that some options have been dismissed on the mailing list due to "refactorability" of the code, but to me this seems similar to rolling back Ecto transactions: there is some minimal amount of plumbing/wiring needed:

Repo.transaction(fn ->
  # ...
  # do stuff building up `result`, but at some point:
  result
  |> case do
    {:ok, foo} -> foo
    {:error, reason} -> Repo.rollback(reason)
  end
end)

So how about something like:

for i <- [1, 2, 3], sum = 0, count = 0 do
  sum = sum + i
  count = count + 1
  continue {i, sum, count}
end

I guess I'm starting to lean towards something like `after` from the initial proposal, but maybe more of a single expression than imperative rebindings.

EDIT: `after` would work too, but `continue` would be best for people from an imperative background:

for i <- [1, 2, 3], sum = 0, count = 0 do
  sum = sum + i
  count = count + 1
after
  {i, sum, count}
end

If we had `continue`, I'd love to see `break` for early exits.

I feel like `for` is already an odd construct, because it behaves like an `Enum.map` (with nesting and filtering). Ideally, I think you'd want generators to be more integrated with the standard library, but that's hard to do right now.
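
To make the `Enum.map`-like behavior concrete, a comprehension with multiple generators and a filter behaves like nested `Enum.flat_map`/`Enum.filter`/`Enum.map` calls (this is a sketch of the observable equivalence, not of how `for` is actually compiled):

```elixir
# A comprehension with two generators and a filter...
pairs_for =
  for x <- [1, 2, 3], y <- [4, 5, 6], rem(x + y, 2) == 0, do: {x, y}

# ...produces the same result as nested flat_map/filter/map calls:
pairs_enum =
  Enum.flat_map([1, 2, 3], fn x ->
    [4, 5, 6]
    |> Enum.filter(fn y -> rem(x + y, 2) == 0 end)
    |> Enum.map(fn y -> {x, y} end)
  end)

IO.inspect(pairs_for == pairs_enum)
# => true
```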

To avoid confusion, I would let the `for` construct mirror the functions we know as much as possible, instead of making up new terms for the same functions we already have in `Enum`.

Something like this (`map` can be the default unless specified):

for map x <- [1,2,3], y <- [4,5,6] do
  {x, y}
end

for map_reduce acc = 0, x <- [1,2,3], y <- [4,5,6] do
  {{x, y}, acc + x + y}
end

for reduce acc = 0, x <- [1,2,3], y <- [4,5,6] do
  acc + x + y
end

This will also leave some room for possible future expansion for any other enumerable functions.

20 Likes

This is a neat idea, I like how explicit and intuitive to understand it is!