Local accumulators for cleaner comprehensions

thiagomajesk · December 4, 2023, 8:22pm

I agree that maybe gatekeeping is not the best word to use here, but to be fair, it does seem that some people actually think it’s going to be the end of the world if this proposal goes forward.

sezaru · December 4, 2023, 8:39pm

It is not the end of the world, but it can potentially make things harder.

As an ex-C++ developer, one of the things I appreciate the most about Elixir is its consistency. You have a really small set of rules, and once you know them you can understand any Elixir code.

I remember a project I was given in my first job as an Elixir developer. The person who created it wrote the code as if it were imperative; the code was a mess, and a lot of functions had more than 300 lines of code each (!).

But because Elixir’s rules are clear and we have immutability, even code like that was still easy to read without having to hold a lot of variable context in my mind.

Now I wonder how that code would look if we had had this feature at the time. I’m pretty sure it would have been way harder to reason about.

But that’s just my opinion ofc.

Nefcairon · December 4, 2023, 8:40pm

If I could have used @@… when I tried to grasp Elixir, I would probably never have understood what’s so positive about immutability.

Nicd · December 4, 2023, 9:10pm

I am against this proposal, due to the following:

Personal opinion: I feel that Elixir already has too much syntax. There are many ways to accomplish the same things (for vs Enum, but there’s also Map.new; we have various syntax sugars for keyword lists and maps and their updates and accesses; sigils; three ways to do charlists; pattern matching in function heads vs case inside function; moving catch et al to the top level of functions vs inside functions) and to new users there is already a considerable amount to learn. Not to mention all the macros in Phoenix and other popular projects. Elixir should be striving for less syntax, not more.
Personal opinion: I do not like @@, @ is already used for module attributes and EEx templates, and this will create confusion for new users.
Fear: This will confuse new users who already have a hard time with immutability, as they will expect it to do mutability the familiar way (and in some ways it does). Then they’ll try to extend that line of thinking and hit a wall where it won’t work anymore. In my opinion it would be better to teach them one way of working right away, instead of having to balance two ways. Eventually, as the user learns how immutability and scoping work, this syntax will be a burden as it now requires keeping in mind two different ways of working. And such a burden will be a burden forever, while learning new things is only a burden once.
Personal opinion: The solution to the original problem should be functions, not more special syntax. The code being noisy should be a signal to the author to split parts off to small defps to make it clearer. With that in mind, I don’t really agree with the initial premise of the proposal.
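
To make the scoping concern concrete, here is a minimal sketch of how rebinding inside a comprehension behaves in Elixir today. The variable names are purely illustrative and not part of the proposal:

```elixir
total = 0

results =
  for x <- [1, 2, 3] do
    # This rebinds `total` only inside the block; the outer binding is untouched,
    # so every iteration starts again from the outer value 0.
    total = total + x
    total
  end

# `results` is [1, 2, 3] and the outer `total` is still 0.
# Carrying state across iterations needs an explicit accumulator:
sum = Enum.reduce([1, 2, 3], 0, fn x, acc -> acc + x end)
```

Someone expecting imperative semantics would probably predict `[1, 3, 6]` and a final `total` of 6, which seems to be the kind of mismatch described above.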

We should definitely not add syntax to the language just to deal with Advent of Code and other nonrealistic coding challenges. All syntax is a burden, and I don’t see programming puzzles as a good enough reason to add more. There has to be additional impact. (This is only a response to the quoted part, not to the original post.)

kwando · December 4, 2023, 9:40pm

Not sure what to think about this. The only time I feel the need for something like this is when Advent of Code is active.

Just make the syntax ugly enough so we won’t be tempted to use it more than when absolutely necessary :)

josevalim · December 4, 2023, 9:46pm

Because there is no monopoly or unified way in how we teach people the language, nor in how people learn. We can’t assume Enum will be taught and, even if it is, different people will take different amounts of time to grasp it. Is it really worth saying: you will be unable to solve certain problems unless you fully grasp this concept, while elsewhere it is considerably easier to tackle it?

Look, I completely understand those who don’t like the solution, consider it may bloat the language, or don’t like the proposed syntax. But I was honestly hoping for more empathy towards the problem statement. Telling people “oh, you can use map_reduce” to me is somewhat equivalent to the famous joke:

A monad is just a monoid in the category of endofunctors, what’s the problem?

It is not even about learning. There is one assumption and one personal preference here:

I assume that something like local accumulators will provide several users a better on-boarding ramp
I personally prefer the Python (and other imperative language) solutions to this problem

And this is not about Python. Pick any imperative language and their solution will be clearer. Python just happens to be the most concise one. As I said in the other thread, give any Elixir developer both the Python and the Elixir solution, and I believe the majority will most likely understand what is happening on the Python one in less time, because of the amount of boilerplate in the Elixir one.
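
For reference, a rough sketch of what the Elixir side of that comparison tends to look like with `Enum.map_reduce/3`. The module name `Positions` is hypothetical, and the map keys (`"lessons"`, `"position"`, `"reset_lesson_position"`) are assumed from the section/lesson numbering example discussed in this thread:

```elixir
defmodule Positions do
  # Hypothetical helper: numbers sections and their lessons.
  # The accumulator is a {section_counter, lesson_counter} tuple that must be
  # threaded through every step by hand.
  def number(sections) do
    {sections, _acc} =
      Enum.map_reduce(sections, {0, 0}, fn section, {section_counter, lesson_counter} ->
        # Restart lesson numbering when the section asks for it.
        lesson_counter =
          if section["reset_lesson_position"], do: 0, else: lesson_counter

        section_counter = section_counter + 1

        # The inner traversal needs its own map_reduce to carry the lesson counter.
        {lessons, lesson_counter} =
          Enum.map_reduce(section["lessons"], lesson_counter, fn lesson, counter ->
            counter = counter + 1
            {Map.put(lesson, "position", counter), counter}
          end)

        section =
          section
          |> Map.put("lessons", lessons)
          |> Map.put("position", section_counter)

        {section, {section_counter, lesson_counter}}
      end)

    sections
  end
end
```

Most of the lines exist only to pack and unpack the accumulator tuple, which is the boilerplate being referred to.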

There is a meme that is applicable here. Some of you may have seen a slide about design patterns in OO and FP languages:

Yet we have this here:

| FP | Imperative |
| --- | --- |
| `map` | loop |
| `reduce` | loop |
| `flat_map` | loop |
| `flat_map_reduce` | loop |
| `scan` | loop |
| `count_by` | Oh my, loop |
| `count_until` | loop+break |
| `take_while` | loop+break |
| `split_while` | loop+break |
| `drop_while` | loop+break |
| `reduce_while` | loop+break |
| … | … |

And it goes on and on.
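
As one illustration of the "loop+break" rows, here is a small sketch using `Enum.reduce_while/3`. The `sum_until` function and its arguments are made up for this example:

```elixir
# Sum numbers until adding the next one would exceed `limit`.
# {:cont, acc} plays the role of "keep looping", {:halt, acc} the role of "break".
sum_until = fn numbers, limit ->
  Enum.reduce_while(numbers, 0, fn n, total ->
    if total + n > limit do
      {:halt, total}
    else
      {:cont, total + n}
    end
  end)
end

partial = sum_until.([1, 2, 3, 4, 5], 7)
```

Here `partial` is 6: the running total goes 1, 3, 6, and adding 4 would exceed the limit, so the traversal stops early.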

Look, I love the Enum module. It provides a great shared vocabulary. But we need to acknowledge it is a sizeable step in our learning curve. Once again, I don’t think it means we need to accept this, but I do think the status quo can be improved considerably.

josevalim · December 4, 2023, 9:54pm

You are right and this has effectively turned me off this proposal. We have considerably less syntax than the other languages mentioned here (Python, JS, etc) but the syntax is also different and part of the learning curve. Those learning get confused with if true do ... end vs if true, do: ... and I can see this adding to the same style of confusion.
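
For readers who have not hit that confusion yet, a minimal sketch of the forms in question. They are the same call, since `if` is a macro that receives a keyword list; `flag` is just an illustrative variable:

```elixir
flag = Enum.empty?([])

# Block form.
a =
  if flag do
    :yes
  else
    :no
  end

# Keyword form: the same call written on one line.
b = if flag, do: :yes, else: :no

# Fully explicit form, with the keyword list spelled out.
c = if(flag, [do: :yes, else: :no])
```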

Perhaps the option is to either introduce accum session_counter = 0 (similar to the previous one by using another keyword) or introduce a new block construct accumulate session_counter <- 0, lesson_counter <- 0 do.

I believe we are making progress… it just takes a while.

hst337 · December 4, 2023, 10:10pm

That’s a great goal. But as I’ve described, this solution won’t make it easier for Python devs to onboard, because it won’t behave like a mutable variable in any sense. It will just mislead them into thinking that it is, and then they’ll learn from their mistakes that it is actually not. (This is a counterargument based on my detailed explanations in my previous post.)

Most likely it will be taught. Elixir is a mature language with a mature ecosystem. Every new-to-the-language developer will occasionally come across old code with Enum, and every developer will have to learn what Enum does and how to use it. (This is my strong opinion, and I think it can be perceived as a fact.)

I feel it, and I agree that Python has a cleaner solution. But it is clearly a tradeoff. Python has clean solutions for AoC because of mutability, and Elixir has codebases which are easier to maintain because of immutability. And the proposed solution with local semi-mutability will bring confusion for both Python and Elixir devs, since it has nothing to do with mutability and it has really strange runtime implications. (This is a counterargument based on my detailed explanations in my previous post.)

Yes, I agree. But in the end, these implications bring more good than bad. The feature you’re suggesting will lead to an incorrect understanding of the performance and of how it actually works, which will lead to wrong (in my example with loop+break) or inefficient (in my final example) code. (Here I just restated my points about this proposal lowering the learning curve.)

And as I’ve stated before, developers would still have to learn how Enum works, since it is present in literally every Elixir project. (it is a fact). So you’re not suggesting learning for with accumulators instead of Enum, you’re actually suggesting learning for with accumulators and Enum. (this is a conclusion)

thiagomajesk · December 4, 2023, 10:13pm

I feel that no matter where we go with this proposal we will always have people in favor and people against it (same as before) so don’t sweat over it.

That being said, I still think that we landed in a better position with this proposal (with the current syntax) than with the others before it. I personally still think that something like @@acc deviates a lot less from standard language constructs than accum acc or mut acc and would cause less confusion.

To me at least, it looks a little bit like some kind of special one-way message passing and kinda reminds me of Svelte’s $ variables, but I totally understand why some people are getting hung up on the idea of mutability… And now I’ll be forever curious what other use cases we would have made possible by allowing this (especially in Phoenix/LiveView).

I was re-reading the thread and saw this and wanted to acknowledge how beautiful this looks… I have never worked with Ruby professionally, but I do love its expressiveness. If this were possible in Elixir as well it would be haha.

edisonywh · December 4, 2023, 10:21pm

What about an extension to for instead like Greg suggested (or maybe I misunderstood)?

list = [1, 2]
for val <- list, binding: [a: 0, b: 1] do
  a = a + val
  b = b + val
end

$ Kernel.binding()
# => [a: 3, b: 4]

Something along that line, makes it look less like a mutable variable?

EDIT: edited to make it more obvious that it’s accumulating values
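
For comparison, a sketch of how the same accumulation can be written today with the existing `:reduce` option of `for`. This is not the proposed `binding:` option, only the closest current equivalent:

```elixir
list = [1, 2]

result =
  for val <- list, reduce: [a: 0, b: 1] do
    # The accumulator is matched and rebuilt explicitly on every iteration.
    [a: a, b: b] -> [a: a + val, b: b + val]
  end
```

With this input `result` is `[a: 3, b: 4]`, the same values the example above expects from `binding()`.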

AstonJ · December 4, 2023, 10:22pm

I haven’t been able to read all of the thread/s, so I apologise in advance if this is no longer possible or it has since been agreed not to pursue it (feel free to ignore this comment in that case), but I just want to comment on this from the original thread:

I personally liked José’s original proposal without the explicit syntax as it’s the least jarring and looks the cleanest and easiest to read to me:

mut section_counter = 0
mut lesson_counter = 0

for section <- sections do
  if section["reset_lesson_position"] do
    lesson_counter = 0
  end

  section_counter = section_counter + 1

  lessons =
    for lesson <- section["lessons"] do
      lesson_counter = lesson_counter + 1
      Map.put(lesson, "position", lesson_counter)
    end

  section
  |> Map.put("lessons", lessons)
  |> Map.put("position", section_counter)
end

It also lets people use their own signifier:

mut m_section_counter = 0
mut m_lesson_counter = 0

for section <- sections do
  if section["reset_lesson_position"] do
    m_lesson_counter = 0
  end

  m_section_counter = m_section_counter + 1

  lessons =
    for lesson <- section["lessons"] do
      m_lesson_counter = m_lesson_counter + 1
      Map.put(lesson, "position", m_lesson_counter)
    end

  section
  |> Map.put("lessons", lessons)
  |> Map.put("position", m_section_counter)
end

or:

mut acc_section_counter = 0
mut acc_lesson_counter = 0

for section <- sections do
  if section["reset_lesson_position"] do
    acc_lesson_counter = 0
  end

  acc_section_counter = acc_section_counter + 1

  lessons =
    for lesson <- section["lessons"] do
      acc_lesson_counter = acc_lesson_counter + 1
      Map.put(lesson, "position", acc_lesson_counter)
    end

  section
  |> Map.put("lessons", lessons)
  |> Map.put("position", acc_section_counter)
end

Personally I would use the original. Declaring them, and seeing how they’re used in code is enough for me to differentiate them.

josevalim · December 4, 2023, 10:24pm

@hissssst, my whole reply was about the problem statement. I understand the cons of the current solution, but the cons of the current solution aren’t a dismissal of the problem statement itself. Those are two separate things.

Your argument is based on corner cases of how the same variable is reassigned within the same list. I honestly can’t recall seeing these examples in either Elixir or Python/Ruby/JS code. I can be convinced that it can be confusing, but that argument ain’t it.

Sure, but is this really a problem?

Elixir doesn’t strictly need the Task module. You can do everything it does with GenServer or regular process abstractions. But it is an extremely useful component for both the Elixir onboarding experience and actual applications.

Imagine if every time we wanted to do async/await, we told people to write this:

list
|> Enum.map(fn item ->
  ref = make_ref()
  parent = self()
  spawn_link(fn -> send(parent, {ref, ...}) end)
  ref
end)
|> Enum.map(fn ref ->
  receive do
    {^ref, reply} -> reply
  end
end)

Which is akin to how Erlang developers would express it.
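
By contrast, a minimal sketch of the same shape with the Task module. The input list and the `item * 2` body are placeholders for real work:

```elixir
results =
  [1, 2, 3]
  # Start one linked process per item; each returns a %Task{} handle.
  |> Enum.map(fn item -> Task.async(fn -> item * 2 end) end)
  # Wait for each reply in order (default timeout of 5 seconds per task).
  |> Enum.map(&Task.await/1)
```

The refs, the `self()` capture and the `receive` block are all still there, just hidden behind `Task.async/1` and `Task.await/2`.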

We often try to make common patterns in Elixir more accessible. And the duality you mention exists pretty much in Ruby, Python, JS, Java, Rust, etc. Once again, I don’t think we should do it because those languages do it, but the costs of having two approaches to these problems are minimal (and common).

To be clear: I am not speaking about this particular solution, just the problem statement in itself and the assumption that “Enum is fine”.

thiagomajesk · December 4, 2023, 10:28pm

I might be wrong, but I think @josevalim drafted this new proposal because we ended up agreeing that mut or other keywords would not make it obvious where the accumulators are being defined and used, in contrast with @@, which makes it extremely explicit in both cases.

josevalim · December 4, 2023, 10:34pm

A compromise would be a block:

accum session_counter <- 0, lesson_counter <- 0 do
  ...
end

It has benefits from both approaches:

The block helps limit the scope of where the variables are used, so we worry less about large functions
No additional syntax noise such as @@ (although that can be a con, as usage is less clear)

It also helps keep the theme that all of for, with, and then accum are actually monads (but let’s not call them that).

Of course, the other option is to introduce the accumulators directly into comprehensions, as others suggested:

list = [1, 2]
for val <- list, binding: [a: 0, b: 1] do
  a = a + val
  b = b + val
end

But the comprehension syntax is already quite overloaded, unfortunately, and different enough to add to the confusion (imo)

hst337 · December 4, 2023, 10:37pm

That’s where we disagree. Having Task module containing only functions to make one-off processes simpler is not the same as having a special syntax, special semantics and compiler-magic for list traversals. I am not against having two ways of doing things instead of one. This makes sense, if these two ways cover different tradeoffs (performance vs expression, etc).

However, it is not like there are more cons than pros. This proposal has zero pros. It won’t lower the learning barrier, because:

One more thing to learn
Works differently from Python, Elixir, Ruby and every other popular language.
Has extremely difficult to understand cases (breaking the loop, for example) due to possible internal implementation (because I think that you won’t rewrite the whole compiler just for this feature)

(I’ve explained all of the points in detail in my first post in this thread)

AstonJ · December 4, 2023, 10:40pm

I did see a couple of comments to that effect and wondered if there was another reason. If there isn’t then I would be inclined to respectfully disagree. I think the declaring keyword (whether it is mut or acc or anything else) may actually be enough, and if anyone thought it wasn’t and wanted something more immediately explicit then they could either use their own prefix as per my example, or (perhaps after a lot of feedback) the Elixir Core Team can later introduce a special character and use deprecation notices to encourage people to update their code.

josevalim:

A compromise would be a block:
accum session_counter <- 0, lesson_counter <- 0 do
  ...
end
It has benefits from both approaches:

The block helps limit the scope of where the variables are used, so we worry less about large functions

But without introducing a notation such as @@

It also helps keep the theme that all of for, with, and then accum are actually monads (but let’s not call them that).

I like that too!

josevalim · December 4, 2023, 10:41pm

Yes, I thought about this as well, but I struggle on both:

how to make it efficient
how to make it work in this case

For example, one of my ideas is to allow a tuple to be given to into, so we can collect multiple things at once:

for i <- [1, 2, 3], into: {[], ""} do
  {i * 2, i + ?0}
end
#=> {[2, 4, 6], "123"}

And we could optimize it quite well. However, the issue with into is that it assumes we know how the value is updated. So if we take an integer as the initial value for into, we have two options:

Assume it will always be a sum, which does not help directly in this case
Assume it will always be “last value wins” but then it mismatches the list semantics

At the end of the day, the into bit is more about the map part of map_reduce than the reduce one, which is why I don’t think it is a good fit, unfortunately.
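
For context, a sketch of how collecting two results at once can be expressed today with the existing `:reduce` option, where the update rule for each element of the tuple is written out by hand. This is not the tuple `into:` idea itself, only an assumed present-day equivalent:

```elixir
{doubled, digits} =
  for i <- [1, 2, 3], reduce: {[], ""} do
    # The clause states how each part is updated: prepend to the list,
    # append one byte to the binary.
    {list, string} -> {[i * 2 | list], string <> <<i + ?0>>}
  end

# Prepending builds the list in reverse, so restore the original order.
doubled = Enum.reverse(doubled)
```

This yields `{[2, 4, 6], "123"}` across `doubled` and `digits`. The explicit clause is the part that `into` cannot infer for an arbitrary initial value such as an integer.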

sodapopcan · December 4, 2023, 10:47pm

I really like accum with a block (which of course I do because I suggested something similar above and likely wasn’t the first). Even if lots of people “abuse it,” it’s very clear what the scope of the accumulators is. It also looks like other cases in Elixir where variables appear to “mutate.”

thiagomajesk · December 4, 2023, 10:49pm

There’s no doubt one version is more explicit than the other, as we would have an exclusive identifier for that purpose (it’s right in your face), but I think you are disagreeing with the idea of needing it to be more explicit, right!? In that case, I still think that being explicit about it would be the way to go, just to avoid any possible speck of confusion like we discussed before (so, respectfully disagreeing with your disagreement haha).

dorgan · December 4, 2023, 10:58pm

Code.set_rebinding_scope(:foo, :lax) # Defaults to :strict

foo = 1

if do_it? do
  foo = 2
end

(?