Elixir 1.7.4 backwards incompatibility

OvermindDL1 · November 14, 2018, 5:15pm

As reported in: What would you remove from Elixir? - #77 by OvermindDL1

Does anyone know what’s up? Elixir 1.7.4 broke my comprehension code when it worked fine in 1.6.6.

Here is the bulk of the linked post:

I have my own project that replaces for with a macro comprehension. ^.^

Mine decorates what you pass in with types (or ‘Access’ if not otherwise known) so it can generate optimal code for the specific types being used, meaning it’s faster than for. It’s in one of my playground libraries and I should probably pull it out into its own project as it is quite functional…

My fairly trivial benchmark:

defmodule Helpers do
  use ExCore.Comprehension

  # map * 2

  def elixir_0(l) do
    for\
      x <- l,
      do: x * 2
  end

  def ex_core_0(l) do
    comp do
      x <- list l
      x * 2
    end
  end

  # Into map value to value*2 after adding 1

  def elixir_1(l) do
    for\
      x <- l,
      y = x + 1,
      into: %{},
      do: {x, y * 2}
  end

  def ex_core_1(l) do
    comp do
      x <- list l
      y = x + 1
      {x, y * 2} -> %{} # line 35
    end
  end
end

inputs = %{
  "List - 10000 - map*2" => {:lists.seq(0, 10000), &Helpers.elixir_0/1, &Helpers.ex_core_0/1},
  "List - 10000 - into map +1 even *2" => {:lists.seq(0, 10000), &Helpers.elixir_1/1, &Helpers.ex_core_1/1},
}


actions = %{
  "Elixir.for"  => fn {input, elx, _core} -> elx.(input) end,
  "ExCore.comp" => fn {input, _elx, core} -> core.(input) end,
}


Benchee.run actions, inputs: inputs, time: 5, warmup: 5, print: %{fast_warning: false}

And the results locally right now:

Operating System: Linux
CPU Information: Blah
Number of Available Cores: 6
Available memory: 16.430148 GB
Elixir 1.6.6
Erlang 21.1.1
Benchmark suite executing with the following configuration:
warmup: 5.00 s
time: 5.00 s
parallel: 1
inputs: List - 10000 - into map +1 even *2, List - 10000 - map*2
Estimated total run time: 40.00 s



Benchmarking with input List - 10000 - into map +1 even *2:
Benchmarking Elixir.for...
Benchmarking ExCore.comp...

Benchmarking with input List - 10000 - map*2:
Benchmarking Elixir.for...
Benchmarking ExCore.comp...

##### With input List - 10000 - into map +1 even *2 #####
Name                  ips        average  deviation         median
ExCore.comp        370.81        2.70 ms     ±2.76%        2.67 ms
Elixir.for         245.68        4.07 ms    ±21.72%        3.90 ms

Comparison: 
ExCore.comp        370.81
Elixir.for         245.68 - 1.51x slower

##### With input List - 10000 - map*2 #####
Name                  ips        average  deviation         median
ExCore.comp        2.50 K      399.55 μs     ±9.28%      405.00 μs
Elixir.for         1.92 K      521.94 μs     ±7.26%      535.00 μs

Comparison: 
ExCore.comp        2.50 K
Elixir.for         1.92 K - 1.31x slower

Interestingly it won’t run on Elixir 1.7.4; I get a syntax error. @josevalim did something change in a backwards incompatible way with Elixir 1.7.4? o.O

╰─➤  mix bench comprehension
** (SyntaxError) bench/comprehension_bench.exs:35: unexpected operator ->. If you want to define multiple clauses, the first expression must use ->. Syntax error before: '->'
    (elixir) lib/code.ex:767: Code.require_file/2
    (mix) lib/mix/tasks/run.ex:146: Mix.Tasks.Run.run/5
    (mix) lib/mix/tasks/run.ex:85: Mix.Tasks.Run.run/1
    (elixir) lib/enum.ex:1314: Enum."-map/2-lists^map/1-0-"/2
    (mix) lib/mix/task.ex:355: Mix.Task.run_alias/3
    (mix) lib/mix/task.ex:279: Mix.Task.run/2

I marked line 35 with # line 35 in the above benchmark source. Feel free to clone it and try: git clone https://github.com/OvermindDL1/ex_core.git && cd ex_core && mix deps.get && mix compile && mix bench comprehension
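For readers who want to see the behaviour without cloning the repository, here is a minimal sketch of the parse check on Elixir 1.7 or later. The module name StabParseSketch and its functions are hypothetical and are not part of ex_core; only Code.string_to_quoted/1 is a real standard-library call.

```elixir
defmodule StabParseSketch do
  # Hypothetical helper module, for illustration only.

  # A do block whose first expression is a plain expression and whose
  # later expression uses `->`, like the benchmark's `comp` block.
  def mixed_block do
    """
    comp do
      y = x + 1
      {x, y * 2} -> %{}
    end
    """
  end

  # A do block in which every expression is a `->` clause.
  def clause_only_block do
    """
    comp do
      x -> 1
      y -> 2
    end
    """
  end

  # True when the source parses to an AST. Nothing is expanded or
  # evaluated, so `comp`, `x` and `y` do not need to exist.
  def parses?(source) do
    match?({:ok, _ast}, Code.string_to_quoted(source))
  end
end

IO.inspect(StabParseSketch.parses?(StabParseSketch.mixed_block()), label: "mixed")
IO.inspect(StabParseSketch.parses?(StabParseSketch.clause_only_block()), label: "clauses only")
```

Because only the parser is involved here, this isolates the syntax question from anything the comp macro itself does.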

OvermindDL1 · November 14, 2018, 5:16pm

Just tested Elixir 1.7.0 and it is broken there as well, 1.6.6 works fine.

Shouldn’t a backwards incompatible syntax change necessitate an Elixir 2.* version?

Should I report this to the git tracker?

OvermindDL1 · November 14, 2018, 5:24pm

Went ahead and reported it as it really does seem to be a legit bug now that I have a minimal example: Elixir 1.7.* syntax backwards incompatability with 1.6* and earlier · Issue #8386 · elixir-lang/elixir · GitHub

LostKobrakai · November 14, 2018, 5:34pm

Was this ever officially supported syntax though? Probably a tough question, but I seem to remember that fixing bugs late is still not considered a breaking change by the core team.

OvermindDL1 · November 14, 2018, 5:35pm

If it was usable by macros though then that means it is fully usable syntax, not just a ‘quirk’.

OvermindDL1 · November 14, 2018, 5:37pm

I can of course entirely see deprecating it for use ‘inside’ something like ‘case’, but as that wouldn’t have compiled previously anyway then it’s still not a backwards incompatible change.

tmbb · November 14, 2018, 5:40pm

I don’t think so. I’ve found bugs in the parser that @josevalim has identified as legitimate bugs (even though I could use the “wrong” syntax in macros). Since there isn’t a formal spec for the Elixir language anywhere, I think that what is a bug or not depends on the core team’s judgement.

OvermindDL1 · November 14, 2018, 5:42pm

That’s one of the major problems of Elixir though: it has no spec. It is still flabbergasting to me that a language can exist without a syntax spec, and that is why they get all these little bugs in to begin with. Regardless, this syntax worked in macros before, and now it doesn’t. That is still a backwards incompatibility issue that should necessitate a major version bump, as it does indeed very much break an existing library in a way that is not easily fixable.

tmbb · November 14, 2018, 5:47pm

Unrelated: I think you should wrap the for special form in a macro defined by you. For example, something like:

import ExCore.Comprehensions, only: [comprehension: 1]

comprehension(
  for a <- list as, b <- map bs, into: ..., do: f(a, b)
)
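A rough sketch of how such a wrapper might look, assuming the type hints are written as list(...) and map(...) calls around each generator source. The module ComprehensionSketch and the helper strip_hint/1 are hypothetical; this is not ExCore's actual API, and the sketch simply discards the hints and falls back to plain for, where a real implementation would use them to pick specialised code.

```elixir
defmodule ComprehensionSketch do
  # Hypothetical wrapper: receives the AST of a `for` call, removes the
  # `list(...)` / `map(...)` hints from its generators, and returns an
  # ordinary `for` for the compiler to expand.
  defmacro comprehension({:for, meta, args}) do
    {:for, meta, Enum.map(args, &strip_hint/1)}
  end

  # `pattern <- list(source)` becomes `pattern <- source`.
  defp strip_hint({:<-, meta, [pattern, {hint, _, [source]}]}) when hint in [:list, :map] do
    {:<-, meta, [pattern, source]}
  end

  # Filters and the trailing keyword options (`into:`, `do:`) pass through.
  defp strip_hint(other), do: other
end

defmodule ComprehensionSketchDemo do
  require ComprehensionSketch

  def run(as, bs) do
    ComprehensionSketch.comprehension(
      for a <- list(as), {_key, b} <- map(bs), do: a + b
    )
  end
end

IO.inspect(ComprehensionSketchDemo.run([1, 2], %{x: 10}))
```

Since the whole for expression is ordinary Elixir syntax, this approach never puts -> in a position the parser rejects.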

josevalim · November 14, 2018, 5:58pm

Not really. Even with specs, an implementation can have flaws, especially when we are talking about things like parsers, where it is hard to foresee how all of the rules interact. So having more implementations, such as the one in IntelliJ by Luke or the one in Makeup by @tmbb, helps iron those bugs out.

We have been improving our syntax specification over the years and still things can definitely be better. Hopefully you can channel all of this flabbergasting into a more complete specification, which would be very welcome.

tmbb · November 14, 2018, 6:04pm

Makeup is emphatically not a correct Elixir parser and shouldn’t be treated as such. It’s a more or less correct Elixir lexer. With a little care it could probably be converted into a full lexer, though.

The bug I found in the parser has nothing to do with Makeup. I found it accidentally when “stress testing” the Elixir compiler while trying to develop a DSL for a new pattern matching library for maps.

tmbb · November 14, 2018, 6:10pm

Yeah, it’s hard to test a parser because of that. If the language is context free, I think it might be possible to generate random files that satisfy the grammar and test if the parser parses them as valid.

The main thing that might make this hard in Elixir is that there seem to be some constructs which I think should be rejected by the parser and are instead rejected later in the compilation pipeline. That way, you can get a compile error at runtime because the compilation error is only detected when the code is run.

Another problem with the parser is that I think it takes some shortcuts that are not strictly semantically correct. I have to look into it.

OvermindDL1 · November 14, 2018, 6:13pm

Eh that still has all the little comma-droppings everywhere though, and I really exceptionally hate those as it feels SO out of place with Elixir (this isn’t Erlang after all!).

But with a spec then we ‘know’ what’s a bug and what’s not then.

Optimally I love the C++ spec, it is SO complete, but C++ is so large that it does not make a great example to point at… ^.^;

OCaml, on the other hand, is both a fairly complex syntactical language and has a beautiful spec, like it’s absolutely gorgeous, and it maps to its parser precisely (Elixir’s parser is a bit scattered with weird multiple passes and all, and slower than OCaml’s even in interpreted mode interestingly…). Like this is what I imagine for Elixir’s spec (different theme of course): OCaml - The OCaml language
It has everything from a simple definition of the language that the spec itself is written in (a form of extended BNF), to definitions, and you get beauty like this (remember I invert my web colors):

To beautiful things like that that even define complex OCaml constructs:

And you can click on any BNF name to jump directly to that location in the spec to see its definition and it is just so sublime to use and navigate. You can quite literally implement all of OCaml-the-language via just this spec (and it even defines the standard library in another section too!).

This is really the kind of thing (maybe not this ‘fancy’, but the existence of it) that should exist before a to-be-publicly-used language reaches 1.0.0. Rust has a wonderful spec as well though not as easy to use and navigate as OCaml’s. C++ has an anally detailed spec that is…less than easy to navigate. ^.^;

And being able to use -> as an operator in macros is so useful, I’m really saddened by this backwards incompatibility (which it really is since there was no spec defining proper allowed syntax). My only other real option is to define the output type as an argument to the comp itself, which removes it from the place of usage and separates the concerns, which makes it harder and uglier to use… (EDIT: Hmm, or maybe wrap the output expression with a collector expression somehow…)

Yeah, Elixir’s compiler is a bit odd in that it has some ‘passes’/stages broken up that are traditionally not broken up in most compilers I’ve worked inside of…

Shortcuts should not be used unless they are defined by the spec. Like take OCaml’s spec, its BNF syntax is almost a direct copy from its actual syntax definition (though with OCaml code added).

tmbb · November 14, 2018, 6:27pm

For maximum MONADIC POWER!!! (haskell-style) you could have something like:

comp do
  a1 <- x1
  a2 <- x2
  ...
  return f(a1, a2, ...), as: %{}
end
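As a rough illustration of how a return ..., as: ... form could be turned into an ordinary for, here is a minimal sketch. The module ReturnCompSketch is hypothetical and is not ExCore's implementation; it assumes the block ends with a return call carrying an as: option, and it ignores the per-type hints such as list.

```elixir
defmodule ReturnCompSketch do
  # Hypothetical macro: every expression except the last becomes a
  # generator or filter of `for`; the final `return expr, as: coll`
  # supplies the body and the `into:` collectable.
  defmacro comp(do: {:__block__, _meta, exprs}) do
    {clauses, [{:return, _, [body, opts]}]} = Enum.split(exprs, -1)
    # `opts` is a keyword list of quoted values; default to a list.
    into = Keyword.get(opts, :as, [])

    quote do
      for(unquote_splicing(clauses), into: unquote(into), do: unquote(body))
    end
  end
end

defmodule ReturnCompDemo do
  require ReturnCompSketch

  def run(xs) do
    ReturnCompSketch.comp do
      x <- xs
      y = x + 1
      return {x, y * 2}, as: %{}
    end
  end
end

IO.inspect(ReturnCompDemo.run([1, 2]))
```

Note that y = x + 1 ends up as a for filter here, so an element is skipped if the right-hand side evaluates to nil or false, just as it would be in a hand-written for.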

josevalim · November 14, 2018, 6:29pm

Right. I am aware. I did not mean to say the bug report was a direct result from Makeup but rather from having more people “messing around”. But I did think you were messing around because of Makeup, thanks for correcting me.

tmbb · November 14, 2018, 6:31pm

But what are the new rules for the -> operator? It can only be used in case statements and anonymous functions now?

tmbb · November 14, 2018, 6:38pm

For future reference, the post where I describe the bug: Proposal: Add field puns/map shorthand to Elixir

And the post where you (@josevalim) say it’s actually a bug: Proposal: Add field puns/map shorthand to Elixir

OvermindDL1 · November 14, 2018, 6:56pm

Lol, I did something like that originally! Problem is that the return expression often ended up having to be wrapped up in parentheses. It might be worth changing it back to that though…

It can only be used when it is the only operator type in a body. You can have many of them, just not any other AST elements.

tmbb · November 14, 2018, 6:58pm

This makes it a syntax node whose validity depends on its siblings (and not on the children or parent)… I don’t think we have anything like this anywhere else in the language, and honestly I don’t see the point.

What’s the advantage of this restriction? It seems to make the language more complex for little benefit. Pinging @josevalim for a canonical explanation

OvermindDL1 · November 14, 2018, 7:09pm

Exactly! Now you understand my confusion when I hit this bug! (yes I fully consider it a bug that it no longer works) It is another language inconsistency!

So this is a valid AST:

{:__block__, [],
 [
   {:+, [context: Elixir, import: Kernel], [1, 1]},
   {:+, [context: Elixir, import: Kernel], [2, 2]}
 ]}

But this is now not when it was before:

{:__block__, [],
 [
   {:+, [context: Elixir, import: Kernel], [1, 1]},
   {:->, [], [2, 2]}
 ]}

And it’s definitely not failing at the parsing stage but is rather failing in one of those weird later passes. If the purpose was just to enforce that in a case do block all top-level expressions were ->'s, then the equivalent of Enum.all?(block_list, fn {:->, _, _} -> true; _ -> false end) || throw %WhateverCaseException{} would be enough, rather than doing it in one of those weird passes.
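The check described above could be sketched like this, as a plain function over a list of quoted expressions. The module StabCheckSketch is hypothetical and is not how the Elixir compiler is organised. It uses Enum.all?/2, since the goal is that every top-level expression is a clause. One detail worth noting: in the AST that quote produces, the left side of -> is a list of arguments, so a clause 2 -> 2 appears as {:->, meta, [[2], 2]}.

```elixir
defmodule StabCheckSketch do
  # Hypothetical helpers, for illustration only.

  # A `->` clause in quoted form: {:->, meta, [args, body]},
  # where args is a list of patterns.
  def clause?({:->, _meta, [args, _body]}) when is_list(args), do: true
  def clause?(_other), do: false

  # True when every top-level expression is a `->` clause.
  def all_clauses?(exprs) when is_list(exprs), do: Enum.all?(exprs, &clause?/1)

  # Real clauses taken from a quoted anonymous function.
  def example_clauses do
    {:fn, _meta, clauses} = quote(do: fn 1 -> :one; _ -> :other end)
    clauses
  end
end

clauses = StabCheckSketch.example_clauses()
IO.inspect(StabCheckSketch.all_clauses?(clauses), label: "clauses only")
IO.inspect(StabCheckSketch.all_clauses?([quote(do: 1 + 1) | clauses]), label: "mixed")
```

A macro such as case could run a check of this shape on its own block and raise its own error, which is the placement being argued for here.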

Honestly the part of Elixir that exists in Erlang seems way too much. It needs to be reduced substantially to basically the syntax and interpretation with a very restricted pure-function language spec, and then macros in the language would define everything from def (takes the ast, massages it, passes it to one of those low level function calls), defmodule, case, etc… etc… It would make for a MUCH more uniform language, much more expressive from user code, etc… etc… Plus a lot more of the language would be implemented in the language itself.