The Elixir language is mature, and no breaking or heavy changes are expected. Therefore “now” is the right time to start thinking about creating some kind of specification of the language. This proposal is meant to be the place to discuss the necessity, usefulness and implementation of a language specification.
Language and compiler specification
What is it?
A language specification is a somewhat formal definition and description of how a language runtime must/should/can/shouldn’t behave to be a valid Elixir runtime. It may also define properties of the standard library (like “Kernel.+ is always a function”), etc.
Compiler specification is also a somewhat formal definition and description
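As a rough illustration of the kind of standard-library property mentioned above, here is a minimal sketch that checks, against the current reference implementation, whether a Kernel name is a function or a macro. The `SpecProbe` module and its `kind/2` function are hypothetical names used only for this example; `function_exported?/3` and `macro_exported?/3` are regular Kernel functions.

```elixir
defmodule SpecProbe do
  # Hypothetical helper (not part of Elixir): classifies a Kernel name/arity
  # as :function, :macro or :undefined using Kernel's reflection functions.
  def kind(name, arity) do
    cond do
      function_exported?(Kernel, name, arity) -> :function
      macro_exported?(Kernel, name, arity) -> :macro
      true -> :undefined
    end
  end
end

# Kernel.+/2 is an ordinary function, while Kernel.if/2 is a macro.
IO.inspect(SpecProbe.kind(:+, 2), label: "Kernel.+/2")
IO.inspect(SpecProbe.kind(:if, 2), label: "Kernel.if/2")
```

A specification could turn an observation like this into a guarantee, so that an alternative compiler would not be free to implement `Kernel.+` as a macro.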
Why should we need one?
In case someone (like me) takes the opportunity to implement a compatible compiler, a language runtime, or both. In any case, for a language that is mature enough, it is only a question of time before third-party compilers and runtimes appear, and implementations that follow a reference implementation rather than a specification are almost always doomed to be incompatible in some edge cases. This leads to undefined behavior (UB) across different compilers and runtimes, which is bad.
A specification prevents exploits and incorrect usage of undocumented or unexpected behavior. I can provide some examples of undocumented things that can be exploited to hack module compilation order, create weak inter-module dependencies, detect function overrides and so on (see Hyrum's Law: https://www.hyrumslaw.com/).
A specification defines a clearer path for language improvement, so that users can expect that some parts of the language may change in the future. For example, stating that protocol functions (like Enumerable.reduce) may never be called if the compiler can derive the type of the first argument creates room for type-inference-oriented compile-time optimizations.
The process of developing a specification is also a process of exploring possibilities for improvement.
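To make the protocol example above more concrete, here is a minimal sketch of the difference between dynamic protocol dispatch and a direct call to the implementation module. The `Size` protocol is a hypothetical stand-in for something like Enumerable, and the "direct" call only imitates what a type-inferring compiler might emit; it is not a description of what any existing compiler does.

```elixir
defprotocol Size do
  @doc "Returns the number of elements in `data`."
  def size(data)
end

defimpl Size, for: List do
  def size(list), do: length(list)
end

defimpl Size, for: Map do
  def size(map), do: map_size(map)
end

# Dynamic dispatch: Size.size/1 looks up the implementation for the argument.
dynamic = Size.size([1, 2, 3])

# Direct call: if the compiler knows the argument is a list, it could call
# the implementation module and never touch Size.size/1 at all.
direct = Size.List.size([1, 2, 3])

IO.inspect({dynamic, direct})
```

Both calls return the same value, but anything that observes calls to `Size.size/1` itself (tracing, mocking) would see only the first one. That is the kind of difference a specification would have to declare reliable or unreliable.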
Why can’t we just document this stuff?
Documentation is intended for users of the language, not for compiler or runtime developers. Therefore, some topics just do not belong in documentation, while others just do not need deep explanation.
Won’t it take a ton of time?
Specifications may (and definitely will) evolve over time. It is not necessary to define every aspect of the language beforehand; the specification can be ad hoc and reactive, defining just the most important points.
Can’t we just discuss language details here?
Yes, we can, but these discussions can get lost, become outdated, or simply be answered incorrectly.
What do we need to do to create the specification?
Define a process for developing the specification. For example, “contributors create an RFC (whose format is defined here), and the core team approves it”
Define the specification format and rules. For example, “MUST means that every Elixir compiler must behave this way to be considered compatible”
Apart from the Elixir core team and myself, there are several third-party projects that might benefit from having a language specification and
Elixir linters could detect code that relies on unreliable or exploitable behavior (cc @rrrene)
LSP servers could offer better completion based on what can be called and where (cc @axelson)
Tools leveraging reflection and Elixir metaprogramming approaches (like sourceror (cc @dorgan) and patch (cc @mdnowack)) could have a better understanding of how Elixir modules must work, how the AST must behave, how variable contexts behave and so on.
ERTS developers could have a better understanding of how the BEAM code generated from Elixir programs is supposed to be executed
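For readers unfamiliar with the AST and variable-context details that such tools depend on, here is a small sketch of how the current reference implementation represents them. `ContextDemo` is a hypothetical module used only for illustration; the pattern matches double as checks and raise if the shape differs.

```elixir
# Parsing source text yields a variable node whose context is nil.
{:x, _meta, nil} = Code.string_to_quoted!("x")

defmodule ContextDemo do
  # Hypothetical module: `quote` inside it tags the variable with the
  # module name as its context.
  def quoted_var, do: quote(do: x)
end

{:x, _meta, ContextDemo} = ContextDemo.quoted_var()

# Calls are three-element tuples as well: {name, metadata, arguments}.
{:+, _meta, [1, 2]} = quote(do: 1 + 2)

IO.puts(Macro.to_string(quote(do: 1 + 2)))
```

The same variable name carries a different context depending on where the AST came from, and the metadata list varies between Elixir versions, which is why the sketch ignores it.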
Personally I think that this is not worth the time. You should understand that Elixir without Erlang is pretty much an empty shell, because some of the more advanced features, like pattern matching, are part of Erlang, not Elixir.
Shouldn’t it be the other way round? BEAM code being the standard (which has defined specs for ERTS to follow), and Elixir, just like other BEAM languages, being a consumer of that spec?
This most probably will never happen; the only exception is Lumen, and who knows how many years it is from completion.
The other thing is that there is talk of Elixir 2.0, so there might be a lot of breaking changes.
In general I think that having specifications is a good thing; however, I don’t see how the effort invested in making a specification for Elixir will pay off.
I just don’t see an already overworked core team working on this ever. And the community, while very practical and of high quality, is not big enough and does not have enough free bandwidth to tackle this. We seem to mostly address functional stuff that helps us do our jobs. Not too much foundational work (exceptions exist of course, but they kind of prove the rule).
I do agree with the rationale btw, I am just not seeing this idea happen.
One thing I’d explore if I had all the time and energy in the world: an alternative BEAM VM implementation without hot code reloading. Maybe skipping that will allow for both more performance optimizations and a way to spec the whole thing better. Maybe.
…Or maybe we can collaboratively agree on e.g. AST format for code-generating libraries as a start. The value of that is up in the air though.
It works both ways. ERTS developers are interested in optimizing the runtime for frequently used patterns, and Elixir developers are interested in generating efficient code. The specification would help to define which features Elixir relies on the most and which features are not that necessary.
There are also Tria and the Eir project, and a lot of companies are interested in the development of statically typed Elixir compilers (@josevalim gave several talks about it)
It is neither scheduled nor expected to happen yet.
That is actually a good point, thanks. I think that there are several approaches to this problem. As I’ve said, it is not required to have a 100% complete spec of the language. And I think that it should not be the job of the core team to handle the specification requirements, the specification format, the specification implementation and the specification of the language itself
I would suggest something like this:
Specification proposals are gathered as RFCs, each of which can be accepted, declined, or sent back with an update request. The core team should not be responsible for writing these RFCs.
Once a year (perhaps in sync with an Elixir release) every RFC must be accepted, declined, or sent back with an update request. Deciding whether to accept or decline an RFC can be the only job the core team has to do.
Accepted RFCs and the current specification together form the new version of the language specification
Yeah, like this talk, and the conclusion of that talk is that having static types that are resolved at compile time doesn’t solve much, and why Elixir will never be typed like languages such as Java.
This seems to be a big overstatement, but it is no news that people want their classic OOP language, either because they are too lazy to learn a new paradigm or because that is the only way they know how to architect and write software. A good example is this discussion.
I don’t know if it is an overstatement or not, but several companies have already switched away from Elixir because of the lack of static typing. In the video you’ve linked above, Jose states that it is an elephant in the room (an important topic, in other words), and dashbitco has already been paid to implement type checking in the compiler (and that’s why it is present there).
So the language is already moving towards compile-time type inference: there are research groups working on set-theoretic type inference analysis, there is me, there is Facebook with eqwalizer, etc. And I think that setting up a language and compiler specification process is a good step in this direction, which would make compiler development more coordinated and define the compatibility boundaries of compilers and runtime capabilities.
That’s not the conclusion. And even if this is the conclusion, I can argue about points like “static typing won’t improve performance”, because I already have an existing compiler which is capable of performing compile-time optimizations based on very-very-very basic type inference in Enum pipelines, and I have drafts of compile-time protocol resolution.
The problem is that, while these optimizations are compatible with the reference compiler in 99.999999999% of cases and really useful, some non-obvious use cases will be broken (like using the Patch library to check whether a function was called). So a specification is required to define what is reliable behavior and what is not and may vary from version to version, or from compiler to compiler
Or maybe because their engineers are incapable, who knows. There are a lot of stories like these about other languages and frameworks too and none of them are technically based.
This is a great thing; however, there is a big difference between having inference and having static typing at compile time, and I would go with the former, as it gives a lot more freedom, and this is kind of the point that talk was making.
I never talked about performance. If, let’s say, you improve the performance of the code by 20% but at the same time increase the complexity of tooling and development, I will instantly drop that optimization, as performance nowadays is overrated and not the deciding factor in a lot of cases.
Hats off to you if you want to improve this area, as it is one of the most complex ones, and only a few can tackle it.
My opinion is that you will invest a lot of effort into this and have little to no gain, after all the practical aspects of the language are the most important ones (development/refactor speed, how happy you are writing code in the language, the libraries and tooling around them).
I appreciate your opinion, but the things you’re talking about are arguable and completely irrelevant to the topic of this thread. If you want to have a discussion about the usefulness of optimizing compilers and how static typing is not a question of being capable or not, but really a question of the applicability of static typing as a tool to solve problems, we can create a separate thread or continue in private messages.
There is a half-finished Erlang Language Specification that anyone who has the time could start to contribute to. As stated in this thread, Erlang is the basis of Elixir, so if you have a spec for Erlang much of what Elixir is would already be defined in the Erlang spec.
Thanks for the link to the Erlang spec, but the purpose of this topic is to discuss an Elixir compiler and runtime specification. While Elixir is built on top of Erlang, there are things important to compiler developers that are not covered and can’t be covered by the Erlang spec. These include, for example, module compilation order, inter-module dependency tracking, and the possibility of compile-time evaluation of some runtime code.
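A minimal sketch of what compile-time evaluation and inter-module dependencies look like on the Elixir side, assuming two hypothetical modules, `Defaults` and `Limits`, defined in the same file:

```elixir
defmodule Defaults do
  def limit, do: 10
end

defmodule Limits do
  # The module body runs while Limits is being compiled, so this call is
  # evaluated at compile time and Defaults must already be available:
  # a compile-time dependency that fixes the compilation order.
  @limit Defaults.limit()

  # Returns the value that was captured at compile time.
  def compile_time_limit, do: @limit

  # Calls Defaults every time it runs: only a runtime dependency.
  def runtime_limit, do: Defaults.limit()
end

IO.inspect({Limits.compile_time_limit(), Limits.runtime_limit()})
```

Both functions return the same value here, but if `Defaults` were later recompiled with a different limit, only `runtime_limit/0` would see the change unless `Limits` were recompiled too. None of this is visible at the Erlang level, since the attribute is already a literal by the time BEAM code is produced.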