Elixir 1.7.4 backwards incompatibility

As reported in: What would you remove from Elixir?

Does anyone know what’s up? Elixir 1.7.4 broke my comprehension code when it worked fine in 1.6.6.

Here is the bulk of the linked post:

I have my own project that replaces for with a macro comprehension. ^.^

Mine decorates what you pass in with types (or ‘Access’ if not otherwise known) so it can generate optimal code for the specific types being used, meaning it’s faster than for. It’s in one of my playground libraries and I should probably pull it out into its own project as it is quite functional…

My fairly trivial benchmark:

defmodule Helpers do
  use ExCore.Comprehension

  # map * 2

  def elixir_0(l) do
    for(
      x <- l,
      do: x * 2
    )
  end

  def ex_core_0(l) do
    comp do
      x <- list l
      x * 2
    end
  end

  # Into map value to value*2 after adding 1

  def elixir_1(l) do
    for(
      x <- l,
      y = x + 1,
      into: %{},
      do: {x, y * 2}
    )
  end

  def ex_core_1(l) do
    comp do
      x <- list l
      y = x + 1
      {x, y * 2} -> %{} # line 35
    end
  end
end

inputs = %{
  "List - 10000 - map*2" => {:lists.seq(0, 10000), &Helpers.elixir_0/1, &Helpers.ex_core_0/1},
  "List - 10000 - into map +1 even *2" => {:lists.seq(0, 10000), &Helpers.elixir_1/1, &Helpers.ex_core_1/1}
}

actions = %{
  "Elixir.for"  => fn {input, elx, _core} -> elx.(input) end,
  "ExCore.comp" => fn {input, _elx, core} -> core.(input) end
}

Benchee.run actions, inputs: inputs, time: 5, warmup: 5, print: %{fast_warning: false}

And the results locally right now:

Operating System: Linux
CPU Information: Blah
Number of Available Cores: 6
Available memory: 16.430148 GB
Elixir 1.6.6
Erlang 21.1.1
Benchmark suite executing with the following configuration:
warmup: 5.00 s
time: 5.00 s
parallel: 1
inputs: List - 10000 - into map +1 even *2, List - 10000 - map*2
Estimated total run time: 40.00 s

Benchmarking with input List - 10000 - into map +1 even *2:
Benchmarking Elixir.for...
Benchmarking ExCore.comp...

Benchmarking with input List - 10000 - map*2:
Benchmarking Elixir.for...
Benchmarking ExCore.comp...

##### With input List - 10000 - into map +1 even *2 #####
Name                  ips        average  deviation         median
ExCore.comp        370.81        2.70 ms     ±2.76%        2.67 ms
Elixir.for         245.68        4.07 ms    ±21.72%        3.90 ms

ExCore.comp        370.81
Elixir.for         245.68 - 1.51x slower

##### With input List - 10000 - map*2 #####
Name                  ips        average  deviation         median
ExCore.comp        2.50 K      399.55 μs     ±9.28%      405.00 μs
Elixir.for         1.92 K      521.94 μs     ±7.26%      535.00 μs

ExCore.comp        2.50 K
Elixir.for         1.92 K - 1.31x slower

Interestingly, it won’t run on Elixir 1.7.4; I get a syntax error. @josevalim, did something change in a backwards incompatible way with Elixir 1.7.4? o.O

╰─➤  mix bench comprehension
** (SyntaxError) bench/comprehension_bench.exs:35: unexpected operator ->. If you want to define multiple clauses, the first expression must use ->. Syntax error before: '->'
    (elixir) lib/code.ex:767: Code.require_file/2
    (mix) lib/mix/tasks/run.ex:146: Mix.Tasks.Run.run/5
    (mix) lib/mix/tasks/run.ex:85: Mix.Tasks.Run.run/1
    (elixir) lib/enum.ex:1314: Enum."-map/2-lists^map/1-0-"/2
    (mix) lib/mix/task.ex:355: Mix.Task.run_alias/3
    (mix) lib/mix/task.ex:279: Mix.Task.run/2

I marked line 35 as # line 35 in the benchmark source above. Feel free to clone and run it: git clone https://github.com/OvermindDL1/ex_core.git && cd ex_core && mix deps.get && mix compile && mix bench comprehension

Just tested Elixir 1.7.0 and it is broken there as well, 1.6.6 works fine.

Shouldn’t a backwards incompatible syntax change necessitate an Elixir 2.* version?

Should I report this to the git tracker?

Went ahead and reported it as it really does seem to be a legit bug now that I have a minimal example: https://github.com/elixir-lang/elixir/issues/8386

Was this ever officially supported syntax though? Probably a tough question, but I seem to remember that fixing bugs late is still not considered a breaking change by the core team.

If it was usable by macros, though, then that means it is fully usable syntax, not just a ‘quirk’.

I can of course entirely see deprecating it for use ‘inside’ something like ‘case’, but since that wouldn’t have compiled previously anyway, that would still not be a backwards incompatible change.

I don’t think so. I’ve found bugs in the parser that @josevalim has identified as legitimate bugs (even though I could use the “wrong” syntax in macros). Since there isn’t a formal spec for the Elixir language anywhere, I think that what is a bug or not depends on the core team’s judgement.

That’s one of the major problems of Elixir though: it has no spec. It is still flabbergasting to me that a language can exist without a syntax spec, and that is why they get all these little bugs in to begin with. Regardless, this syntax worked in macros before, and now it doesn’t. That is still a backwards incompatibility issue that should necessitate a major version bump, as it does indeed very much break an existing library in a way that is not easily fixable.

Unrelated: I think you should wrap the for special form in a macro defined by you. For example, something like:

import ExCore.Comprehensions, only: [comprehension: 1]

comprehension(
  for a <- list as, b <- map bs, into: ..., do: f(a, b)
)

Not really. Even with specs, an implementation can have flaws, especially when we are talking about things like parsers, where it is hard to foresee how all of the rules interplay. So having more implementations, such as the one in IntelliJ by Luke or the one in Makeup by @tmbb, helps iron those bugs out.

We have been improving our syntax specification over the years and still things can definitely be better. Hopefully you can channel all of this flabbergasting into a more complete specification, which would be very welcome. :slight_smile:

Makeup is emphatically not a correct Elixir parser and shouldn’t be treated as such. It’s a more or less correct Elixir lexer. With a little care it could probably be converted into a full lexer, though.

The bug I found in the parser has nothing to do with Makeup. I found it accidentally when “stress testing” the Elixir compiler while trying to develop a DSL for a new pattern matching library for maps.

Yeah, it’s hard to test a parser because of that. If the language is context free, I think it might be possible to generate random files that satisfy the grammar and test whether the parser accepts them as valid.

The main thing that might make this hard in Elixir is that there seem to be some constructs which I think should be rejected by the parser and are instead rejected later in the compilation pipeline. That way, you can get a compile error at runtime, because the compilation error is only detected when the code is run.

Another problem with the parser is that I think it takes some shortcuts that are not strictly semantically correct. I have to look into it.

Eh that still has all the little comma-droppings everywhere though, and I really exceptionally hate those as it feels SO out of place with Elixir (this isn’t Erlang after all!).

But with a spec we would ‘know’ what’s a bug and what’s not.

Optimally I love the C++ spec, it is SO complete, but C++ is so large that it does not make a great example to point at… ^.^;

OCaml, on the other hand, is both a fairly complex syntactical language and has a beautiful spec, like it’s absolutely gorgeous, and it maps to its parser precisely (Elixir’s parser is a bit scattered with weird multiple passes and all, and slower than OCaml’s even in interpreted mode interestingly…). Like this is what I imagine for Elixir’s spec (different theme of course): https://caml.inria.fr/pub/docs/manual-ocaml/language.html
It has everything from a simple definition of the language that the spec itself is written in (a form of extended BNF), to definitions, and you get beauty like this (remember I invert my web colors):
To beautiful things like this that even define complex OCaml constructs:

And you can click on any BNF name to jump directly to that location in the spec to see its definition and it is just so sublime to use and navigate. You can quite literally implement all of OCaml-the-language via just this spec (and it even defines the standard library in another section too!). :slight_smile:

This is really the kind of thing (maybe not this ‘fancy’, but the existence of it) that should exist before a to-be-publicly-used language reaches 1.0.0. Rust has a wonderful spec as well though not as easy to use and navigate as OCaml’s. C++ has an anally detailed spec that is…less than easy to navigate. ^.^;

And being able to use -> as an operator in macros is so useful, I’m really saddened by this backwards incompatibility (which it really is since there was no spec defining proper allowed syntax). My only other real option is to define the output type as an argument to the comp itself, which removes it from the place of usage and separates the concerns, which makes it harder and more ugly to use… (EDIT: Hmm, or maybe wrap the output expression with a collector expression somehow…)

Yeah, Elixir’s compiler is a bit odd in that it has some ‘passes’/stages broken up that are traditionally not broken up in most compilers I’ve worked inside of…

Shortcuts should not be used unless they are defined by the spec. Like take OCaml’s spec, its BNF syntax is almost a direct copy from its actual syntax definition (though with OCaml code added).

For maximum MONADIC POWER!!! (haskell-style) you could have something like:

comp do
  a1 <- x1
  a2 <- x2
  return f(a1, a2, ...), as: %{}
end

Right. I am aware. :slight_smile: I did not mean to say the bug report was a direct result from Makeup but rather from having more people “messing around”. But I did think you were messing around because of Makeup, thanks for correcting me.

But what are the new rules for the -> operator? Can it only be used in case statements and anonymous functions now?

For future reference, the post where I describe the bug: Proposal: Add field puns/map shorthand to Elixir

And the post where you (@josevalim) say it’s actually a bug: Proposal: Add field puns/map shorthand to Elixir

Lol, I did something like that originally! Problem is that the return expression often ended up having to be wrapped up in parentheses. It might be worth changing it back to that though…

It can only be used when it is the only operator type at the top level of a body. You can have many of them, just not any other AST elements alongside them.
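To see that rule in action, here is a minimal sketch using Code.string_to_quoted/1, which only parses a string and does not expand or compile anything. The call name foo is a placeholder and does not need to exist; the results shown are what I would expect on 1.7 and later, where the mixed form is rejected.

```elixir
# A body made up only of `->` clauses.
only_clauses = """
foo do
  x -> x * 2
  y -> y + 1
end
"""

# A plain expression first, followed by a `->` clause.
mixed = """
foo do
  1 + 1
  2 -> 2
end
"""

# The clause-only body parses; the mixed body is a syntax error.
{:ok, _ast} = Code.string_to_quoted(only_clauses)
{:error, _reason} = Code.string_to_quoted(mixed)
```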

This makes it a syntax node whose validity depends on its siblings (and not on the children or parent)… I don’t think we have anything like this anywhere else in the language, and honestly I don’t see the point.

What’s the advantage of this restriction? It seems to make the language more complex for little benefit. Pinging @josevalim for a canonical explanation

Exactly! Now you understand my confusion when I hit this bug! (yes I fully consider it a bug that it no longer works) It is another language inconsistency!

So this is a valid AST:

{:__block__, [],
 [
   {:+, [context: Elixir, import: Kernel], [1, 1]},
   {:+, [context: Elixir, import: Kernel], [2, 2]}
 ]}

But this is now not when it was before:

{:__block__, [],
 [
   {:+, [context: Elixir, import: Kernel], [1, 1]},
   {:->, [], [2, 2]}
 ]}
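For reference, a small sketch of how these node shapes can be inspected with quote. The metadata is matched with _ because its contents vary between Elixir versions. Note that in the AST the parser produces, the left-hand side of -> is wrapped in a list.

```elixir
block =
  quote do
    1 + 1
    2 + 2
  end

# The third element of a `__block__` node is a list of expressions.
{:__block__, _meta, [{:+, _, [1, 1]}, {:+, _, [2, 2]}]} = block

fun =
  quote do
    fn 2 -> 2 end
  end

# A `->` node holds [list_of_left_hand_patterns, body].
{:fn, _, [{:->, _, [[2], 2]}]} = fun
```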

And it’s definitely not failing at the parsing stage but is rather failing in one of those weird later passes. If the purpose was just to enforce that all top-level expressions in a case do block were ->'s, then the equivalent of Enum.all?(block_list, fn {:->, _, _} -> true; _ -> false end) || throw(%WhateverCaseException{}) inside case itself would do, not a check in one of those weird passes.
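The kind of check described above could be sketched roughly like this. StabCheck and all_clauses?/1 are hypothetical names for illustration, not anything that exists in Elixir; a construct like case could run such a check on its own body and raise if it fails.

```elixir
defmodule StabCheck do
  # Hypothetical helper: true when every top-level expression
  # in the given list is a `->` clause node.
  def all_clauses?(exprs) when is_list(exprs) do
    Enum.all?(exprs, &match?({:->, _, _}, &1))
  end
end

clauses = [{:->, [], [[1], :one]}, {:->, [], [[2], :two]}]
mixed = [{:+, [], [1, 1]}, {:->, [], [[2], 2]}]

true = StabCheck.all_clauses?(clauses)
false = StabCheck.all_clauses?(mixed)
```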

Honestly the part of Elixir that exists in Erlang seems way too much. It needs to be reduced substantially to basically the syntax and interpretation with a very restricted pure-function language spec, then implement macros in the language to define everything from def (takes the AST, massages it, passes it to one of those low level function calls), defmodule, case, etc… etc… It would make for a MUCH more uniform language, much more expressive from user code, etc… etc… Plus a lot more of the language would be implemented in the language itself.