How to make Dialyzer more strict?

After reading the discussion here, I have roughly the same question as @tx8 – are there any actionable low-hanging fruits that are foreseeably doable without the doers hating their lives?

I’d love a statically typed Elixir but I understand there are a lot of practical hurdles and it’s most likely not on the roadmap. In the meantime, if we Elixir devs want to level up our game, what options do we have?

1 Like

In my very uninformed opinion, no. There are no low-hanging fruits here. You either go full static typing from the beginning or you’ll never get decent type inference for your program. Type inference is an all-or-nothing affair: you either have full decidability (e.g. Haskell or OCaml) or you have something like Dialyzer. You can’t go from Dialyzer to Haskell by incrementally adding more and more rules.

First, you need actual parametrized types, which I think Dialyzer doesn’t support. You can’t have type signatures like the one from the List.delete/2 function: delete(list(), any()) :: list(), which expands into delete([any()], any()) :: [any()]. In this type signature, there is no relationship between the three types that appear. You’d need something like (pseudocode) delete(['a], 'a) :: ['a], which guarantees that the input and output lists have the same element type.

At the same time, there is a big problem with macros. Macros operate on the AST. If Elixir were to be statically typed, the AST would have to be typed as well. The most widely used decidable type system is probably the Hindley-Milner (HM) type system. Elixir’s AST can’t be represented in its current form in that type system, because HM doesn’t support heterogeneous collections. You’d need to define an AST type with several constructors for the different node types, but in such a system, writing macros is a completely different experience. It wouldn’t feel like writing Elixir as it is today. Again, such a change might be worth it, but it won’t look like Elixir.

Write another language that compiles to the BEAM with easy (typed) FFI for Elixir and Erlang, uses mix/rebar as a build system and writes the function docs to the new documentation BEAM chunks. For now forget about reusing Elixir’s macros. If you need a macro system, write your own.

Alternatively, there is some new research on type checking algorithms for types that include untagged unions and intersections, which might allow you to have static types without the need for named constructors. The main problem is that type inference is much harder within this system.

1 Like

This is delete(list(a), a) :: list(a) when a: var in elixir specs.
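For illustration, here’s how that parametrized spec could look on a thin wrapper around List.delete/2 (the MyList module name is made up for the example):

```elixir
defmodule MyList do
  # `when a: var` ties the element type of the input list, the element
  # being deleted, and the element type of the output list together.
  @spec delete(list(a), a) :: list(a) when a: var
  def delete(list, item), do: List.delete(list, item)
end
```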

2 Likes

Sorry, I should have reread the typespec docs before posting. In that case, I think some typespecs should be updated to use this.

Not that it would make much difference in practice.

I love types, but here

Jose says the future might be Clojure Spec

https://clojure.org/about/spec

which seems sensible to me.

Yet again, Elixir and Clojure show their likeness…

1 Like

Did you try the overspecs/underspecs flags? Erlang -- dialyzer

Dialyzer is fine for the trade-offs it chose to make. Having something more strict most likely means rejecting code that is completely valid today. Or in the best case scenario, forcing types to be explicit and not relying on inference.

@tmbb is correct here. There aren’t low hanging fruits when it comes to types. Adding a type system involves many compromises and the compromises you are willing to make change a lot when the language already exists.

If you want to have type inference, then many Elixir idioms will no longer be supported. Elixir code is based on unions (it returns {:ok, _} | :error) and intersections (different clauses of a function) and those are very expensive for an inference engine.
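That union-returning idiom is everywhere in Elixir; for example, Map.fetch/2 has exactly this shape of spec. A minimal hand-rolled version (the Settings module here is hypothetical, modeled on Map.fetch/2):

```elixir
defmodule Settings do
  # The return type is an untagged union: success and failure
  # have completely different shapes.
  @spec get(map(), atom()) :: {:ok, term()} | :error
  def get(map, key) do
    case map do
      %{^key => value} -> {:ok, value}
      _ -> :error
    end
  end
end
```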

We could add requirements, such as obligatory typing of inputs, but that doesn’t solve the problem that we have both typed and non-typed code and we should likely check the values at the boundaries, adding runtime cost.

If you want to keep the language the same and support all of the existing idioms and also not worry about conversion costs: that’s Dialyzer.

Long story short: we could have other type systems but you need to start from scratch and you will have to compromise in many ways.

Apologies but I did not mean that Clojure Spec is the future. It is unlikely we will have something like Clojure Spec in Elixir per-se, especially because we have typespecs and guards and adding a third option is going to bring only marginal gains. If somebody wants to explore something like Clojure Spec as a package, then please do!

7 Likes

Would it be possible, through a module attribute or macro, to have the compiler insert additional pattern matching / guard clauses to check that parameters and return values respect the typespecs at runtime?

Something like a combination of dialyzer and expat would be great.
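As a sketch of what such a macro might expand to, here’s a hand-written version with the checks inlined for a `@spec hello(integer()) :: map()`. The Checked module and the exact error message are hypothetical; a real implementation would generate this code from the spec:

```elixir
defmodule Checked do
  # Guard checks the parameter type; the body checks the return type.
  def hello(int) when is_integer(int) do
    result = do_hello(int)

    if is_map(result) do
      result
    else
      raise ArgumentError, "return value #{inspect(result)} violates spec"
    end
  end

  defp do_hello(1), do: %{hello: "world"}
  defp do_hello(_), do: :wtf
end
```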

Yes, I tried; :underspecs seems to have no effect at all on my example.

Dialyzer with the :overspecs flag generated a weird warning unrelated to the problem in the example:

lib/hello.ex:1: Type specification 'Elixir.Hello':'__info__'('attributes' | 'compile' | 'functions' | 'macros' | 'md5' | 'module' | 'deprecated') -> atom() | [{atom(),any()} | {atom(),byte(),integer()}] is a subtype of the success typing: 'Elixir.Hello':'__info__'('attributes' | 'compile' | 'deprecated' | 'functions' | 'macros' | 'md5' | 'module') -> any()
done (warnings were emitted)

I also tried other flags, but they don’t resolve the problem.

I had a similar question in this post and got a nice answer from @peerreynders.

You could do

  @type good_hello() :: map() 
  @type bad_hello() :: map() 

  @spec hello(integer()) :: good_hello() | bad_hello()
  def hello(int) do
    case int do
      1 -> good()
      _ -> bad()
    end
  end

  @spec good() :: good_hello()
  def good(), do: %{hello: "world"}

  @spec bad() :: bad_hello()
  def bad(), do: :wtf

This way you would catch the spec error with dialyzer.

5 Likes

Thanks, so the short answer is to not use literals for structs, maps, lists and other terms, but to use a function wrapper with its own typespec, which returns the term as an expression without any branching inside?

1 Like

I was trying to port example from [Functional Programming] Domain Modeling Made Functional (Pragprog) (F#) to Elixir. Of course F# is not Elixir, but the example remains interesting.

Thanks, so the short answer is to not use literals for structs, maps, lists and other terms, but to use a function wrapper with its own typespec, which returns the term as an expression without any branching inside?

Yes, and then compose into bigger types, and bigger functions. Maybe like this.

  @type good_hello() :: map() 
  @type bad_hello() :: map() 
  @type any_hello() :: good_hello() | bad_hello()

  @spec hello(integer()) :: any_hello()

Yes, it will probably work, but I think it looks creepy :smiley:
And of course, if you introduce a new type and function every time you need branching, you end up with 3x the amount of code, plus runtime overhead from the extra function calls.

Personally, I would have written it like this :slight_smile:

  @spec hello(integer()) :: map()
  def hello(1), do: %{hello: "world"}
  def hello(_), do: :wtf

Conditional and branching are easily replaced by pattern matching :slight_smile:

Yes, of course I like pattern matching - it makes the code shorter etc.,
but it doesn’t solve the problem with typespecs and Dialyzer here - try to compile it and run the checks; Dialyzer will be happy with it.

Actually it does. :slight_smile:

@type my_dict(key, value) :: [{key, value}]

I’d think you could?

@spec delete([t], t) :: [t]

Although I’m unsure how well dialyzer checks that (wow the syntax highlighting is bad on ::)? But I’m like 80% sure Dialyzer matches based on unbound names for handling passed in types.

Good thing the AST is so simple, just some primitive types, lists, 2-tuples, and 3-tuples. :slight_smile:
/me wishes it were just 3-tuples and primitives-inside-special-3-tuples

There are ways though, and since each form of an AST node is distinct then it could be a headless sum type safely (or heads without internal named distinction, which is easy to do). :slight_smile:

Ah yeah, the when might be necessary, unsure?

I don’t. It is runtime based, not entirely compile-time, which defeats the point.

Yep yep, it is a great positive typer, it was not designed to be an actual complete typing system.

It’s not too bad actually, especially if there is a ‘dynamic’ type that must be matched out for use in statically-strongly typed sections (deft as a typed def, or something that is module-wide?), then you can pretty well incrementally improve it knowing that the only unchecked areas are areas that are not typed. :slight_smile:

At least until you define such an Option.t type or so and bake in a few common patterns.

Not any more than current code already has to do, the dynamic->typed boundaries would still use matchers and guards to safely determine what it is, which is what you have to do anyway to determine the types of something. However inside the typed area then direct calls could be made that entirely bypass guards and other such checks for known paths.

Yeah, I really wish Elixir had started as a typed language. Erlang being dynamically typed is one of the things I’ve hated most about it; I only stand it because of Dialyzer. Elixir starting from the wrong side of typing, even though it was made anew, always seemed broken to me…

Actually I think it was @vic that made a library or two along that path. ^.^

Yep, it definitely would, though that would add overhead that the static type checking means wouldn’t need to exist anyway (proper static typing can remove a lot of checks and can generate a lot better code in a few areas).

Yeah, overspecs are way too noisy, because the specs in the core and in essentially all libraries are not written quite right in a lot of places (especially once you take macros into account).

Like in a real full typing system! ^.^

Typespecs don’t generate code though.

/me really wants to work more on the TypedElixir experiment… blehg lack of time…

1 Like

Yes, but in an HM typesystem you need named constructors to write your macro, even if the internal implementation strips them… This is not a disaster, though, and you can work around it.

1 Like

I don’t think it is too creepy, one liner functions, more specs and types than code, and…

… no local variables :slight_smile:

  @type good_hello() :: map() 
  @type bad_hello() :: map() 
  @type any_hello() :: good_hello() | bad_hello()

  @spec hello(integer()) :: any_hello()
  def hello(1), do: good()
  def hello(_), do: bad()

  @spec good() :: good_hello()
  def good(), do: %{hello: "world"}

  @spec bad() :: bad_hello()
  def bad(), do: :wtf
2 Likes

That’s true for basic types but pretty much a nightmare for everything else. I can easily assert that something is a list. It is expensive to assert that something is a list of only As and Bs. Other times it is just plain impossible (guarantee that X is a stream capable of emitting only Y). Or assert module Z coming from runtime implements a certain behaviour or protocol.
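A quick illustration of that cost difference (the Check module is made up for the example): the outer shape is a constant-time guard, but checking the element types requires traversing the whole list:

```elixir
defmodule Check do
  # O(1): only the outer shape is inspected.
  def list?(term), do: is_list(term)

  # O(n): every element must be visited.
  def int_list?(term) do
    is_list(term) and Enum.all?(term, &is_integer/1)
  end
end
```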

1 Like

That would be assuming recursiveness; rather, it should be fully dynamic internally, so a list would only contain a list of dynamics, each of which you’d match out (which would be very simple if they only expect one thing). Typing to a fully resolved type is one of those things I keep saying is not efficient or good to do on dynamic/typed interaction boundaries. ALL integration points should be entirely dynamically typed, thus, OCaml-ized, something like:

let sum_ints bb =
  let open BeamTypes in
  match bb with
  | List lst ->
    lst
    |> List.fold_left
         (fun acc v ->
           match v with
           | Integer i -> i + acc
           (* See below *))
         0
  (* Passing in an unhandled type will crash at runtime in OCaml so I will leave it,
     though you also get a compile warning...
  *)

Thus every single dynamic location has to be matched out only to the BeamTypes types, so a BeamTypes.List gets you a list of more beam types. This is really the only safe way to do it on something like the BEAM VM. You can of course make some kind of macro (or in OCaml terms, a PPX) that generates conversion and check functions if you don’t mind the runtime checking cost, but it’s not any more expensive than checking them in the BEAM anyway. And if you only handle a single type in each match, like in the above example, then the codegen can use the value directly without any checks, since the usage confirms what the type must be (i.e. it would crash on an invalid type anyway). Or the user could define a default case and handle the errors any way they want (which would always put a guard test in at codegen).

Only atomically built units of modules could be fully properly typed. Message boundaries and so forth absolutely could not be. And manually hot-code loading non-atomic module-sections is of course ‘undefined behaviour’ (but a rare case even nowadays). ^.^

1 Like

Right, that would require the type-safe code to do conversions and inline code up to the traversal point, which violates module boundaries. Or am I missing something? To me, violating module semantics is a no-no.