Typechecker for Elixir

ityonemo · September 20, 2020, 12:20am

I just quit my job and after some soul-searching am deciding on what project I want to pursue, and thought it would be a good idea to throw my hat into the ring and work on some static typechecking for elixir. Looking at what’s available now and how I’d like to differentiate from existing projects are:

a custom type-system for the beam that is not based on any theoretical typesystems, but is based on experience with the BEAM.
not a DSL in the elixir language (no macros in the checked code)
Hooks into the elixir compiler tracer, and has a way of invocation that works even if it should require tracer message that doesn’t exist yet.
Pluggable.

I think I have a way forward for this and will almost certainly work on it full-time over the next month and a half, but I wanted to 1) gauge the community’s interest on the matter and 2) take opinions on features you’d like to see 3) take opinions about specifics of what you do and/or don’t like about existing typechecker solutions.

Dusty · September 20, 2020, 2:50am

I think proper compile-time type checking for Elixir would be a game changer. Here are my thoughts:

Thanks for doing this! I know that not everyone in the Elixir community wants static types, but I also think that some individuals and orgs are skipping Elixir altogether for exactly this reason. Since the trend in PLs more broadly seems to be towards static types, I think a good static type checker would go a long way towards ensuring Elixir’s longevity.
How does FB’s work on static Erlang affect your work? Can you piggyback off of that work at all? If so, would there be any benefit to waiting for that release?
Does the core team have any position on these efforts? It’s impossible to please all of the people all of the time, but I do think there is benefit to getting as much input as possible from key players before forging ahead. Thus far, attempts to deal with Elixir types feel a bit fragmented, as opposed to a unified effort.

As for my own input, I would be thinking about the following:

Compile-time checking.
Potential for integration with ElixirLS and IDE support.
Separate type declarations. Elixir function heads are already a busy space, with lots of pattern matching and such. I think inline types would make this much worse. I don’t think the type declarations have to look like @spec currently does, but I do think they should get their own line.
Gradual addition to existing codebases. This isn’t particularly important to me personally, but I think it’s probably essential to wider adoption.
Can we leverage behaviours or another technique to achieve generics/polymorphism?
Have you considered how you want the community to support you? How much collaboration do you want? Given that you are doing this full time, will you be setting up Github sponsors or a Patreon?

ityonemo · September 20, 2020, 3:26am

static erlang: I have no idea. For personal reasons (having quit my job), I’m not really interested in waiting though. The idea of the project is that it will use dialyzer typespecs in its default plugin, though what it does with it is a bit opinionated. As these will be pluggable, I think for the ecosystem a good strategy is to make a diversity of options available and let natural selection shake out the winners. And in the case of what I’m doing, if FB comes up with something dramatically different than what we have now, it should be pluggable.
core team: My understanding is that the core team considered working on this and then abandoned it. And in any case I can’t guarantee that what I’m trying will work, so, might be worth waiting, until I can prove that I can walk the walk and not just talk the talk! I asked about a specific compiler tracer option during ElixirConf and Jose provided a great suggestion (parse the post-compiled module binary) that I think I’m going to run with.

yes!
With my experience from Zigler, I think this should be a natural consequence of raising CompileError; I presume there’s a natural way to do warnings, too.
There should be no type declarations necessary, by my plans, it will do inference on the (compiled) functions, overridden by the dialyzer typespecs (if any), or finally by user-provided shims.
Should be possible. In the first iteration it’s an @after_compile hook, and if the core team provides an after_compile compiler tracer in the “obvious way” then it will work as a directly provided tracer.
Maybe. You read my mind, I really really want this. Actually checking behaviours and polymorphism was one of the motivating factors for me to think about even wanting this months ago. There is a hacky way that this can be done (and I’ll probably implement it) as a plugin. Ideally there would be an elixir (or erlang) builtin type module([behaviour_modules]).
I have sufficient runway =D and I plan on doing “something else” full time by february (I will of course keep maintaining this project), but maybe I should set up a github sponsors if someone wants to help extend my runway “as a tip jar”! I’ll set that up when I sync my main repo this weekend.

Sanjibukai · September 20, 2020, 5:38am

They can be many rationale for wanting a typechecker… But in my case using them as a spec guiding rules (useful when doing DDD) like what’s can be done in F# is the only thing I think it’s missing in Elixir (in addition to the lack of executable standalone binaries).

Good luck!

I don’t think I have enough knowledge of Elixir internals but I would like to contribute if I can!

wolf4earth · September 20, 2020, 7:36am

You probably are already aware of it but there is the gradualizer project which tries to introduce gradual typing by making use of the dialyzer type specs. So in that regard it’s similar to what you’re proposing here.

Maybe it’s a potential source of inspiration.

ityonemo · September 20, 2020, 8:04am

Indeed, I have taken some inspriation from gradualizer.

Since erlang is getting its own static type analysis framework, I feel freer to not have to support erlang and can use some of elixir’s coding conveniences (prototypes and operator overloading) to make the the tests on the type analysis bits easier on the eyes and brain (I’m working on that as a detachable library here: https://github.com/ityonemo/mavis, but pardon, master is pretty far behind where I am currently)

There are two outstanding issues i see with gradualizer. The first is the choice to unify term/0 with any/0, and the second is that it uses strict subtyping as its core method of analysis; but I’m increasingly convinced that lambdas and maps have to be treated using something akin to ‘liquid types’ (I just learned about these today). I don’t want to be bound by a formalized type theory framework at the moment, but I’m working right now on a system that has a separate ‘usable_as’ descriptor that is distinct from strict subtyping. Finally, development of gradualizer seems to have stalled.

As an example of the sort of thing I want: Let’s say we have the following code:

@type d20permute :: (1..20 -> 1..20)

@spec functor1(1..20 -> integer()) :: integer()
@spec functor2(integer() -> 1..20) :: integer()
@spec functor3(1..20 -> 1..20) :: integer()
@spec functor4(1..10 -> 1..10) :: integer() 
@spec functor5(1..10 -> 1..20) :: integer()
@spec functor6(1..20 -> atom()) :: integer()

a d20permute function passed into functor1 should be :ok, :maybe into functor2, :ok into functor, :maybe into functor4, and :ok into functor5, and :error into functor6.

opsb · September 21, 2020, 8:22am

Static type checking really feels like the missing piece with Elixir at the moment. There’s so much to like about Elixir but I feel the lack of static typing every day haven gotten used to it in other languages. It also tends to come up when I’m pitching Elixir to folks, the lack of static typing is a bit of a turn off as more and more languages are headed in that direction.

mbuhot · September 21, 2020, 11:13am

How do you plan on typing enumerables? Many of the Enum module function operate on Enumerable.t which doesn’t specify a type for the elements of the enumerable.

I think the problem extends to protocols more broadly. I’d like to declare protocols with type parameters and use those types in type specs on the protocols functions.

josevalim · September 21, 2020, 11:29am

I will like to expand on this a little bit. What we have tried about 3 years ago was an idea that I knew it would fail: perform type inference of an Elixir codebase using a well defined typing-discipline (HM with intersection types). The goal was mostly to assess how much we would have to change the language to provide proper inference. The answer was a lot.

Since then our focus is to use the typing information we already have in our Elixir codebases (which is already considerable, given pattern matching, guards, protocols, etc) and emit warnings based on that. The recent data constructors warnings in v1.11 are built on top of this foundation and we will continue adding more. We don’t want to add any new syntax nor explicit types for this to work.

Also, as a personal parenthesis, I think adding types via type annotations (like we have today) is a missed opportunity since types allow us to express some things a bit more cleanly (like enums, which we don’t have). I did wish some languages that added types after the fact provided a more integrated experience. This is not something we are exploring, but more of a general thought that I have on the topic.

ityonemo · September 21, 2020, 4:57pm

I don’t have a plan for typing enumerables, but I think there is a way forward, where one can provide a “typing function” that overrides the typespec and can be more dynamic than a static override. Some sort of override is going to be necessary anyways, since it’s possible that a library has a competely incorrect typespec and you might not be able to get a patch in the library done. If type errors raise CompileError, you’ll need to unblock it. Quinn was helpful and pointed me to some resources that are fantastic, although I don’t want to be theorybound yet, there’s good theory on what class of functions you would or would not want in that scenario.

ityonemo · September 21, 2020, 5:02pm

Thanks for you historical insight! To me it’s important that a typechecking library not change the language. I’ve actually played around with that idea on a couple of personal projects that have since gone into the garbage bin, because, let’s say I hire someone, I don’t want them learning nonstandard elixir for their career.

Also, thanks for your suggestion to parse the binaries. I am loving the opportunity to spelunk in the low level recesses of how the BEAM works.

stevensonmt · September 21, 2020, 6:01pm

Not sure how relevant it is to your project, but you might look at gleam, which is statically typed and compiles to Erlang.

ityonemo · September 22, 2020, 12:24am

Alright, I registered for Github sponsors, and pushed the basic outline of my strategy: https://github.com/ityonemo/selectrix

vans163 · July 7, 2021, 2:57pm

I like this a lot and was thinking about this constantly at one point. A way to perform type checks without changing the language.

Dialyzer is not good at all it gets lost way too easily on simple things. For example inside reductions or nested functions.

ityonemo · July 7, 2021, 4:49pm

Yes! Sorry, I’m getting back to this, I promise! Just moved halfway across a continent, and just as I settled, got a work project for which I’m spending all of my free time writing a JSONschema compiler (which I will make public)!