Local accumulators for cleaner comprehensions

Functions always start a new scope, and that would be the case here, so there is no shadowing because no module variable leaks in. The model is simpler than you are thinking. :slight_smile:

That’s typically a bad idea because Elixir doesn’t really make a distinction between runtime and compile-time scopes: runtime and compile time are the same. That’s why you can use if or any other function/macro inside module bodies, and you can also define modules inside functions. Even defmodule itself is just a macro, and there is not much special about it.

4 Likes

Thanks for clarifying all my questions and imagined edge cases, in particular the limit of @@ when used at module scope.

Aside from the potential for accidental shadowing and a preference for less cryptic-looking code, I could live with it, but I would still loathe it.

I am still not convinced it makes the language simpler to understand, or that the use case it serves is actually worth the trouble, and I also fear the mutable coding style it may foster.

Whilst this proposal may end up in the language, it won’t be something I will reach for. I would still be inclined to recommend against it in coding standards and enforce Credo checks, as it has the hallmarks of a code smell: it promotes an imperative, mutable style even though technically it’s not mutable, just a bit leaky.

4 Likes

My opinion as a beginner.

I really enjoy recursion and comprehensions, but I have also noticed that sometimes there is no elegant way to solve certain problems.

I like the solution proposed by José. It makes the code more readable while technically not compromising the functional underpinnings of Elixir.

However, I am afraid that andrewh may be right:

I do wonder what behaviour will follow over the years. If people actually start using if/else/for constructs in anger, with assignments in those blocks, this will promote more use of @@ (and visually cryptic-looking code).

Currently we think of @@ as the exception, but it doesn’t take much imagination to see where “normal coding standards” may evolve over the years.

I feel like the risk of this happening outweighs the benefits of the solution provided, despite nothing being wrong with the solution itself. That’s why ultimately I’m not in favor of the proposal in the current form.

Perhaps there is a way to limit “local accumulators” just to comprehensions, for example? Or maybe there is a different way without such drawback?

I’m afraid that I would personally end up abusing @@ if it existed, instead of being constrained to learn the Elixir way, which I have begun to love.

10 Likes

Phew… glad you’re not going to take the <~ operator away from me :slight_smile:

One question @josevalim: will these local accumulators work within cond? I sometimes miss the ability to have a variable set in an earlier condition and then reused below. Sure, we can use with instead, but conds are far cleaner and more readable than with true <- something || something_else.
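For context, here is a minimal sketch of the cond-vs-with situation described above. The something/something_else stubs are hypothetical stand-ins, not anything from the proposal:

```elixir
# Hypothetical stubs for illustration only.
something = fn -> nil end
something_else = fn -> 42 end

# With `cond`, a value bound in a clause's condition is visible in that
# clause's body, but not in later clauses:
cond_result =
  cond do
    value = something.() || something_else.() -> value * 2
    true -> :error
  end

# The `with` rewrite mentioned above binds the value once and reuses it:
with_result =
  with value when is_integer(value) <- something.() || something_else.() do
    value * 2
  else
    _ -> :error
  end
```

Both expressions evaluate to the same result here; the complaint is purely about readability of the `with true <- …` shape.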

Throwing an idea which might be unrealistic.

Based on the examples provided, and in general the places where I have needed them in other languages, it’s mostly in loops.

Could local accumulators be a feature of the for comprehension instead of a general language feature?

Generally, I would prefer not to have the mutability because, as mentioned above, this can easily get out of hand with people who are used to mutability.

We can’t ensure that newcomers won’t reach for those mutable variables most of the time, because that is what they are used to.

4 Likes

I just wanted to echo concerns around this type of thing becoming the norm:

@@foo = 0

if true, do: @@foo = 1

I guess it wouldn’t be the end of the world; it’s just that, as others have stated, I’ve come to really appreciate how things read as they are. This adds a new way to think about things.

It’s been a while since I went through the original threads and I never went through them meticulously as I came late and didn’t know as much about Elixir back then, but was anything like let do proposed?

let a = 0, b = 0 do
  a = 2
  b = 2
end

iex> a # 2
iex> b # 2

While it adds a level of indentation, it at least feels more Elixir-y to me and draws a very clear scope around where they may be “mutated.” I also still don’t fully understand what the problem with for let was—I liked that.

6 Likes

I totally agree!

I think that introducing some kind of DSL that can be packaged as a library (or even added to Elixir) is much more productive than creating a new language construct, because at the end of the day this highly specific feature will be abused countless times by people who are used to mutable state.

Functional languages are good because they have fewer features, which are easier to reason about in a codebase that iteratively gets more complex. Introducing these concepts will also encourage context leaks, which IMO is the biggest downfall of mutable languages:

@@magic_value = 20
Enum.map([1, 2, 3], fn el -> if el == :rand.uniform(3), do: @@magic_value = 21 end)

Even assuming things like the above cannot happen, I don’t see the justification for this being a language construct rather than a limited-scope DSL.

4 Likes

No idea if this was already proposed, but the problem with explicit else could be solved with a dedicated ‘conditional assignment’ syntax, for example

x = 4 if a > 10

that would resolve to

x =
  if a > 10 do
    4
  else
    x
  end

It’s still an expression and doesn’t look imperative. Then we’d “only” have the for problem left :smile:

P.S. and it’s quite Pythonic :smiley:

9 Likes

I am strongly against this proposal.

My points will stay mostly the same as they used to be in the previous post:

  1. Purposes of the proposal are unclear
  2. This feature will bring more confusion than benefits
  3. Existing solution is okay
  4. There are other solutions and approaches
  5. The problem is not important

This post restates the first two points, while the last three are available here: Koka-inspired local mutable variables for cleaner comprehensions - #67 by hissssst. (For the existing-solutions point, I’d also like to mention the Iteraptor solution and the update_and_reduce suggestion from @mudasobwa.)


Purpose is unclear

It is still unclear what problem it is solving. Comparing the Elixir and Python solutions leads me to think that this proposal aims at lowering the learning curve for devs coming from mutable languages, but that is just a guess. I am not the only one who doesn’t understand what this proposal solves in the first place (for example, @dimitarvp and @mudasobwa also stated that the original problem was unclear in the previous thread).

So let’s enumerate our guesses:

  • This proposal lowers learning curve for devs coming from mutable languages
  • This proposal makes writing and reading algorithms easier
  • This proposal improves performance of loops

In the next paragraph I’ll explain why this proposal won’t achieve any of these goals.

Scope is unclear

It is called “local accumulators for cleaner comprehensions”, but it is expected to be implemented in if, case, cond and receive too. These constructs have nothing to do with comprehensions. I sense that they were mentioned to add a feeling of consistency to the feature, but it will never be consistent, since most enumerable traversals are already implemented with higher-order functions. (This is a fact.)
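For reference, a minimal sketch of the higher-order traversal idiom referred to above, which already covers the accumulator use case without new syntax:

```elixir
list = [1, 2, 3, 4]

# Explicit accumulator threading with Enum.reduce/3
sum = Enum.reduce(list, 0, fn x, acc -> acc + x end)

# The same traversal using the comprehension's :reduce option (Elixir 1.8+)
sum2 =
  for x <- list, reduce: 0 do
    acc -> acc + x
  end

# Both bind the result explicitly; nothing escapes the expression's scope.
```

The point being made is that these traversals would never gain `@@` behavior, since they are plain function calls rather than special forms.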

This will bring more confusion

1. Context dependency
2. Unclear edge cases
3. It will be misused
4. Exotic performance penalties

Context dependency

This feature depends on the technical implementation of the traversal of an enumerable, while it is implemented at the level of the language semantics. If you have a solution with for and you want to translate it to Stream or Enum, you’ll have to change the semantics of the algorithm (one uses out-of-context accumulators (aka local mutable variables) and the other does not). (It is a fact.)

Inconsistency, where the same algorithm is solved with two functions doing exactly the same thing, but one version has compiler-built-in magic and the other does not, is an exception by definition, which negatively impacts the learning curve and the ease of writing and reading code. (It is a conclusion.)

Unclear edge-cases

  • Context puns. Breaking immutability in some cases leads to context-inheritance puns. Here is an example:

    list = [123]
    @@acc = 0
    for i <- list do
      [
        @@acc = @@acc + 100500,
        @@acc = @@acc + 1
      ]
    end
    

    It is unclear what it will return, and moreover, devs coming from OO languages will expect it to behave like regular mutability (which it is not), thus it will make the learning curve higher and it will make reading and writing this code harder. (This is a conclusion.)

  • Loop break. Here’s an example with an early return from the loop (which is a pretty common case for languages with this kind of mutability):

    @@acc = 0
    try do
      for i <- 1..100 do
        @@acc = @@acc + i
        if i >= 10, do: throw(:break)
      end
    catch
      :break -> @@acc
    end
    @@acc
    

    What will it return? It will return 0, but a developer would expect it to return 55.

Please note that the two examples above can’t be easily fixed with the current state of Elixir’s compiler, and would require an extremely large rewrite of the Elixir compiler in the places dealing with exception handling and the instantiation of lists and maps. (That’s my strong opinion, which I am 100% sure about, since I am an Elixir compiler developer.)

It will be misused

Developers coming from mutable languages will expect it to behave like regular mutability (which it is not), thus they will write code like the above expecting it to behave differently from how it will actually behave. The early-return and context-puns examples are applicable here. (This argument is a second conclusion from the examples above.)

Exotic performance penalties

Take a look at this example:

@@acc1 = 0
@@acc2 = 1
@@acc3 = 1
for item <- list do
  if extremely_rare_case(item) do
    @@acc1 = @@acc1 + item 
    @@acc2 = @@acc2 + item * item
    @@acc3 = @@acc3 + item ** 3
  end
end

Every developer will expect it to behave like a regular mutable variable, but depending on the frequency of a positive result from extremely_rare_case, it might even be faster to use the process dictionary to store the value instead of using this feature (which will introduce a tuple build and match on every iteration). Right now this example is optimized by the Erlang compiler, but we can add just a few remote function calls to make it completely impossible to analyze by Elixir’s and Erlang’s compilers. (It is a fact.)

This makes it extremely hard to reason about this feature in terms of performance in tight loops without knowing deep details of its implementation in the language. (It is my strong opinion.)
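A minimal sketch of the process-dictionary workaround alluded to above. Here extremely_rare_case/1 is a hypothetical stand-in predicate; this illustrates the technique, it is not a recommendation:

```elixir
# Hypothetical stand-in for the rarely-true predicate in the example above.
extremely_rare_case = fn item -> rem(item, 1_000) == 0 end

# Accumulate in the process dictionary instead of threading a tuple
# accumulator that would be built and matched on every iteration.
Process.put(:acc1, 0)

for item <- 1..10_000 do
  if extremely_rare_case.(item) do
    Process.put(:acc1, Process.get(:acc1) + item)
  end
end

acc1 = Process.get(:acc1)
# sums 1_000 + 2_000 + ... + 10_000
```

The process dictionary is per-process mutable state, which is exactly why it is usually discouraged; the point here is only that its cost model is easier to predict than the proposed feature's.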

16 Likes

I’d say it’s very Ruby-ish (in fact, it’s completely Ruby-ish) making it fit in quite nicely with Elixir.

1 Like

One thing is certain, we’ll never get everyone to agree here for a number of reasons and personal preferences… However, I’m very inclined to say that (IMHO), this is (comparatively) the best version of this proposal yet.

I see a lot of people worried about others abusing this feature, but honestly, you can argue about that using other aspects of the language as well. I would still rather live with the possible “misuse” of those features in the language than not having them at all.

IMHO, to be more objective about this: I think we should first evaluate whether we actually want to support this use case (and considering @josevalim’s drive to elaborate multiple proposals so far, I’d say the problem statement is legitimate) and then compare the current solution with previous proposals to see the pros/cons.

All in all, I think we will never reach the ideal solution, but from the possible solutions we already discussed, I personally think that “local accumulators” has the most potential so far.

4 Likes

The last time we talked about this, we were looking at changes just to the for special form. I was in favor of that at the time. This local accumulator approach has broader language implications, and I am less comfortable with how it might affect ease of learning and evolved style.

I wonder if we could go in some other direction such as better using the into option and a custom Collectable. Or, now that I think about it, what if into could also accept an anonymous or named function instead of needing a defimpl module? I need to spend more time thinking about this since it’s a fresh idea.
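For reference, a sketch of how :into delegates to the Collectable protocol today; the function-accepting variant floated above would be a new capability (the map and MapSet below are just stock collectables):

```elixir
# :into accepts any Collectable implementation, such as a map...
squares = for x <- 1..3, into: %{}, do: {x, x * x}

# ...or a MapSet:
evens = for x <- 1..6, rem(x, 2) == 0, into: MapSet.new(), do: x

# Passing a bare anonymous or named function as :into is not supported
# today; it would currently require a struct with its own
# `defimpl Collectable` module.
```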

9 Likes

Regardless of the merits of local accumulators, the @@ token is obscene.

7 Likes

Am I misunderstanding or would the proposal be equivalent to the following:

my_var = 0

my_var = 
  for i <- 1..3, reduce: my_var do
    acc -> acc + i
  end

my_var == 6 # true

The @@my_var syntax just eliminates the need to rebind the accumulator to the result of the comprehension. Am I missing something more impactful?

You’re missing the map_reduce semantics. Currently for behaves only in map or reduce mode, not map_reduce. Instead of adding this mode, the author proposes changing the language semantics.
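For reference, map_reduce semantics already exist as a library function, Enum.map_reduce/3; what is missing is a corresponding for option:

```elixir
list = [1, 2, 3]

# One pass over the list, returning {mapped_elements, final_accumulator}.
{squares, sum_of_squares} =
  Enum.map_reduce(list, 0, fn i, acc -> {i * i, acc + i * i} end)

# squares == [1, 4, 9]
# sum_of_squares == 14
```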

1 Like

Can you give an example of what map_reduce might look like with current comprehensions?

for i <- list, map_reduce: 0 do
  acc -> {i * i, acc + i * i}
end

UPD: I am not proposing this as a solution to the original problem, since the original problem is completely unclear to me. I am 100% comfortable with the existing Elixir solution to the lesson-counting task, and I don’t understand what the purpose of this proposal is.

5 Likes

There is a lot of prior discussion around a for-loop extension to solve this problem over here that you may be interested in—it directly influenced this one, FWIW.

2 Likes

So maybe more like:

my_var = 0
my_list = [1, 2, 3]

{my_list, my_var} =
  for i <- my_list, reduce: {[], my_var} do
    {list_acc, var_acc} -> {[i * i | list_acc], var_acc + i * i}
  end

my_var == 14 # true
my_list == [9, 4, 1] # true

Actually, what is wrong with this, if map_reduce is the main thing it’s accomplishing? Was this ever proposed before? I assume I should probably re-read the original thread, which may hold the answer, but this is certainly nice.