Dialyzer cannot recognize error in function using polymorphic types

Fl4m3Ph03n1x · February 16, 2022, 3:22pm

Background

I am trying out polymorphic typing with dialyzer. As an example I am using the famous Option type (aka, Maybe Monad) that is now prevalent in many other languages these days.

defmodule Test do
  @type option(t) :: some(t) | nothing
  @type some(t) :: [{:some, t}]
  @type nothing :: []

  @spec validate_name(String.t()) :: option(String.t())
  def validate_name(name) do
    if String.length(name) > 0 do
      [{:some, name}]
    else
      nil
    end
  end
end

As you can see, the function validate_name should return (by spec definition) [{:some, String.t}] | [].

The issue here is that in reality, the function is returning [{:some, String.t}] | nil. nil is not the same as empty list [].

Problem

Given this issue, I would expect dialyzer to complain. However it gladly accepts this erroneous spec:

$ mix dialyzer
Compiling 1 file (.ex)
Finding suitable PLTs
Checking PLT...
[:compiler, :currying, :elixir, :gradient, :gradualizer, :kernel, :logger, :stdlib, :syntax_tools]
PLT is up to date!
No :ignore_warnings opt specified in mix.exs and default does not exist.

Starting Dialyzer
[
  check_plt: false,
  init_plt: '/home/user/Workplace/fl4m3/grokking_fp/_build/dev/dialyxir_erlang-24.2.1_elixir-1.13.2_deps-dev.plt',
  files: ['/home/user/Workplace/fl4m3/grokking_fp/_build/dev/lib/grokking_fp/ebin/Elixir.Book.beam',
   '/home/user/Workplace/fl4m3/grokking_fp/_build/dev/lib/grokking_fp/ebin/Elixir.DealingWithListsOfLists.beam',
   '/home/user/Workplace/fl4m3/grokking_fp/_build/dev/lib/grokking_fp/ebin/Elixir.Event.beam',
   '/home/user/Workplace/fl4m3/grokking_fp/_build/dev/lib/grokking_fp/ebin/Elixir.FlatMapsVSForComprehensions.beam',
   '/home/user/Workplace/fl4m3/grokking_fp/_build/dev/lib/grokking_fp/ebin/Elixir.ImmutableValues.beam',
   ...],
  warnings: [:unknown]
]
Total errors: 0, Skipped: 0, Unnecessary Skips: 0
done in 0m1.09s
done (passed successfully)

Furthermore, no matter what I put in the else branch, the result is always a “happy dialyzer”.

Question

At this point, the only logical solution I can think of is that dialyzer is only concerned with the happy path. Meaning, it will ignore my else branch.

If dialzyer is only ever concerned with happy paths, then this would explain the issue (it is called success typing after all) but it also means it will totally miss a bunch of errors in my code.

Is my assumption about dialyzer correct?
Is there a way to make it more precise in finding errors, or is this a limitation of the algorithm used by dialyzer ? (and therefore no fix is possible)

LostKobrakai · February 17, 2022, 8:09am

You can try reversing the conditional and you’ll see it’s not about happy path or not (how would dialyzer even know what a happy path is). Dialyzer only takes issue if it there is no way your function will ever work.

For your validate_name/1 there’s plenty of inputs, which are ok. Basically anything, which is not "" will produce a return value aligning with your typespec. If you remove the conditional and return nil for every case and suddenly dialyzer is no longer as happy. It detected that the only possible return is nil, which doesn’t align with your typespec.

You can dig deeper by adding another function to your module:

  @spec test :: nothing
  def test do
    validate_name("")
  end

This one dialyzer will complain about.

lib/test_dialyzer.ex:34:invalid_contract
The @spec for the function does not match the success typing of the function.

Function:
Test.test/0

Success typing:
@spec test() :: [{:some, binary()}, ...]

As you can see dialyzer never considered the nothing branch of your union a return value of validate_name.

To my knowledge this is because dialyzer inferred validate_name to return nil | [{:some, binary}, ...], but the typespec tells it that it should be [] | [{:some, binary}, ...]. The overlap between both is just [{:some, binary}, ...]. That’s enough for dialyzer to be fine in regards to validate_name/1. Adding test/0 however adds a function, where the typespec and the success typeing have no possible overlap anymore. Therefore you get an error.

The thing to understand here is that success typing is not all all concerned with exhaustive checks (at least with the default flags). It doesn’t care if your typespec has parts, which are off, for as long as there’s some “correct” parts to it. Correct meaning matching to the inferred type of dialyzer. So thinking your typespec is a 100% match just because dialyzer doesn’t complain is an incorrect expectation.

You can make dialyzer catch the issue by enabling the :underspecs (there’s also :overspecs) flag. This will make dialyzer complain if parts of its inferred type is not covered by the typespec in code.

Fl4m3Ph03n1x · February 17, 2022, 11:07am

So, if I understand correctly, as long as 1 path in my code leads to a successful execution that is compatible with the specs I provide, dialyzer will not complain.

There may be many paths that break, but as long as 1 works, dialzyer is happy.

In my specific case, because I have a branch that returns [{:some, String.t}] dialyzer does not complain, because I have 1 branch that succeeds.

Thanks!

I tried this and didn’t get the same result:

mix.exs

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      elixir: "~> 1.13",
      start_permanent: Mix.env() == :prod,
      deps: deps(),
      dialyzer: [
        flags: ["-Wunmatched_returns", :error_handling, :race_conditions, :underspecs]
      ]
    ]
  end

test.ex

defmodule Test do

  @type option(t) :: some(t) | nothing
  @type some(t) :: [t]
  @type nothing :: []

  @spec validate_name(String.t()) :: option(String.t())
  def validate_name(name) do
    if String.length(name) > 0 do
      [name]
    else
      nil
    end
  end
end

dialyzer output

$ mix dialyzer
Compiling 1 file (.ex)
Finding suitable PLTs
Checking PLT...
[:compiler, :currying, :elixir, :gradient, :gradualizer, :kernel, :logger, :stdlib, :syntax_tools]
PLT is up to date!
No :ignore_warnings opt specified in mix.exs and default does not exist.

Starting Dialyzer
[
  check_plt: false,
  init_plt: '/home/user/Workplace/fl4m3/grokking_fp/_build/dev/dialyxir_erlang-24.2.1_elixir-1.13.2_deps-dev.plt',
  files: [...],
  warnings: [:unmatched_returns, :error_handling, :race_conditions, :underspecs,
   ...]
]
Total errors: 0, Skipped: 0, Unnecessary Skips: 0
done in 0m1.15s
done (passed successfully)

Given this example is slightly different (no more tuples inside the list) my understanding is that underspecs should still complain. I will however say that the only resource I could find on the matter was this mail:

[erlang-questions] Dialyzer underspecs and overspecs options

From it:

Let SpecIn be a set of @spec inputs and RealIn be a set of inputs as inferred by Dialyzer from real code, then:

The Input Contract is satisfied when SpecIn <= RealIn (where <= is a non-strict subset operation). See over_in in demo code below.

The Input Contract violation is detected by -Wunderspecs option when SpecIn > RealIn. See under_in below.

It is easy to see in the code:

It’s OK for over_in to declare that it only accepts :a and :b while it also happens to accept :c. Maybe suboptimal, but fine.

It’s NOT OK for under_in to claim that it accepts :a, :b and :c and break if :c is passed. Rejecting :c would break the caller.

Let SpecOut be a set of @spec outputs and RealOut be a set of outputs as inferred by Dialyzer from real code, then:

The Output Contract is satisfied when SpecOut >= RealOut (where >= is a non-strict superset operation). See under_out below.

The Output Contract violation is detected by -Woverspecs option when SpecOut < RealOut. See over_out below.

It is easy to see in the code:

It’s OK for under_out to declare that it returns :a, :b and :c, while currently it only returns :a and :b. Maybe future implementations will return :c as well.

It’s NOT OK for over_out to declare that it returns :a and :b, but to also return :c sometimes. Returning :c would break the caller.

So, in my case, if I understand:

SpecOut is [] | [Any]
RealOut is [Any]

And in this case, my SpecOut >= RealOut. So :underspec will not complain ( it doesnt).

I may very well be reading this all wrong. Keep that in mind.
I am just trying to understand why underspec wont work in my sample.

LostKobrakai · February 17, 2022, 11:13am

I guess this is your problem: An empty list is a valid value for both some(t) and nothing. [t] means any list (even emtpy), but elements of the list need to be of type t. You likely want [t, ...], which means any non empty list, where elements are of type t.

Fl4m3Ph03n1x · February 17, 2022, 11:44am

Still passes, even with :underspec. Reverting back to the tuple {:some, t} didn’t make any difference.
While the email specifies in mathematical detail what underspec is supposed to do, I still cannot find a real life scenario where I would want to use it.

I do think you are correct in the sense that [] | [t, ...] is the issue. I don’t see how though, since [] means empty list, and [t, ....] means a list with at least 1 element.

In my code, nil is neither of those cases.

The following code does raise a warning though:

  @type option(t) :: some(t) | nothing
  @type some(t) :: [t, ...]
  @type nothing :: nil

  @spec validate_name(String.t()) :: option(String.t())
  def validate_name(name) do
    if String.length(name) > 0 do
      [name]
    else
      0
    end
  end

Warning

Extra type:
nil

Success typing:
[binary(), ...] | 0

But this is confusing. There is 1 path that leads to success. Dialyzer should not complain.
Yet, here it does because nil !== 0.

LostKobrakai · February 17, 2022, 11:58am

Seems like :overspecs is the one you want, not :underspec (I tried both this morning and probably got them mixed up).

With that flag I get this:

Starting Dialyzer
[
  check_plt: false,
  init_plt: '/Users/benni/Temp/test_dialyzer/_build/dev/dialyxir_erlang-24.0.4_elixir-1.13.0_deps-dev.plt',
  files: ['/Users/benni/Temp/test_dialyzer/_build/dev/lib/test_dialyzer/ebin/Elixir.PracticingCurrying.beam',
   '/Users/benni/Temp/test_dialyzer/_build/dev/lib/test_dialyzer/ebin/Elixir.Test.beam',
   '/Users/benni/Temp/test_dialyzer/_build/dev/lib/test_dialyzer/ebin/Elixir.TestDialyzer.beam'],
  warnings: [:overspecs, :unknown]
]
Total errors: 1, Skipped: 0, Unnecessary Skips: 0
done in 0m0.88s
lib/test.ex:6:missing_range
The type specification is missing types returned by function.

Function:
Test.validate_name/1

Type specification return types:
[{:some, binary()}]

Missing from spec:
nil

________________________________________________________________________________
done (warnings were emitted)
Halting VM with exit status 2

Fl4m3Ph03n1x · February 17, 2022, 1:35pm

I see. Using overspecs I fall into the clause:

The Output Contract violation is detected by -Woverspecs option when SpecOut < RealOut. See over_out below.

Where RealOut is [] | [t] | nil and SpecOut is [] | [t].
Therefore, a contract violation is detected.

Now, the one final thing I don’t get is:

Why did I need to use this overspec flag in the first place?

If having 1 successful run was all it took dialyzer to be happy, then the code from post:

Should not have thrown a warning, it should have passed.

LostKobrakai · February 17, 2022, 1:42pm

Which flags was this run with?

Fl4m3Ph03n1x · February 17, 2022, 1:43pm

Ignore me, it passes.
I had overspecs enabled (by mistake).

This truly does give me the impression the else branch is being ignored by the default algorithm (with no flags), because I have case where it passes.

To reiterate:

@type option(t) :: some(t) | nothing
  @type some(t) :: [t, ...]
  @type nothing :: nil

  @spec validate_name(String.t()) :: option(String.t())
  def validate_name(name) do
    if String.length(name) > 0 do
      [name]
    else
      0
    end
  end

Makes a happy dialyzer if dialyzer has no flags.

Fl4m3Ph03n1x · February 18, 2022, 1:55pm

Summary

After talking with many folks I have come to understand that as long as 1 path in my code leads to a successful execution that is compatible with the specs I provide, dialyzer will not complain.

There may be many paths that break, but as long as 1 works, dialzyer is happy.

In my specific case, because I have a branch that returns [{:some, String.t}] dialyzer does not complain, because I have 1 branch that succeeds.

This can be better summarized in the quote from Type Specifications and Eralng:

The other option is then to have a type system that will not prove the absence of errors, but will do a best effort at detecting whatever it can. You can make such detection really good, but it will never be perfect. It’s a tradeoff to be made.

In fact, the article mentioned above, has an example case very similar to my own:

main() ->
    X = case fetch() of
        1 -> some_atom;
        2 -> 3.14
    end,
    convert(X).

convert(X) when is_atom(X) -> {atom, X}.

Which also does not trigger dialyzer. According to the article:

From our point of view, it appears that at some point in time, the call to convert/1 will fail.
(…)
Dialyzer doesn’t think so. (…) because there is the possibility that the function call to convert/1 succeeds at some point, Dialyzer will keep silent. No type error is reported in this case.

This is quite enlightening and I am convinced this is what is happening in my case.

Will dialyzer ever understand what is going on?

To be fair, this error can be caught by dialyzer if we use some flags, namelly --overspecs (for this case) and its sister --underspecs (which we don’t need for this case).

After some research I was able to find a mailing list that details the behaviour of these flags in a mathematical format:

[erlang-questions] Dialyzer underspecs and overspecs options

From it:

Let SpecIn be a set of @spec inputs and RealIn be a set of inputs as inferred by Dialyzer from real code, then:

The Input Contract is satisfied when SpecIn <= RealIn (where <= is a non-strict subset operation). See over_in in demo code below.

The Input Contract violation is detected by -Wunderspecs option when SpecIn > RealIn. See under_in below.

It is easy to see in the code:

It’s OK for over_in to declare that it only accepts :a and :b while it also happens to accept :c. Maybe suboptimal, but fine.

It’s NOT OK for under_in to claim that it accepts :a, :b and :c and break if :c is passed. Rejecting :c would break the caller.

Let SpecOut be a set of @spec outputs and RealOut be a set of outputs as inferred by Dialyzer from real code, then:

The Output Contract is satisfied when SpecOut >= RealOut (where >= is a non-strict superset operation). See under_out below.

The Output Contract violation is detected by -Woverspecs option when SpecOut < RealOut. See over_out below.

It is easy to see in the code:

It’s OK for under_out to declare that it returns :a, :b and :c, while currently it only returns :a and :b. Maybe future implementations will return :c as well.

It’s NOT OK for over_out to declare that it returns :a and :b, but to also return :c sometimes. Returning :c would break the caller.

And indeed if I run mix dialyzer --overspecs with this sample, dialzyer does complain becasue:

The Output Contract violation is detected by -Woverspecs option when SpecOut < RealOut. See over_out below.

Where RealOut is [] | [t] | nil and SpecOut is [] | [t].
Therefore, a contract violation is detected.

Error shown:

lib/test.ex:6:missing_range
The type specification is missing types returned by function.

Function:
Test.validate_name/1

Type specification return types:
[{:some, binary()}]

Missing from spec:
nil

This was a wild ride through Dialyzer, and a revision I quite honestly needed. During the whole ordeal I learned about a couple of dialyzer flags and I refreshed my memory on success typing (which I absolutely needed to).

Thank you everyone for participating !

dimitarvp · February 18, 2022, 2:26pm

Quite interesting. I’m definitely going to use the --overspecs flag then because (a) it is more strict and (b) seems to give more readable error messages.

Thanks for sharing!