What is wrong with my auto-generated typespec?

dvcrn · March 5, 2023, 3:24am

Trying to auto-generate a typespec for a longer opts keyword list parameters. Reduce function:

ts = args
|> Enum.map(&{String.to_atom(&1.name), MyMod.type_to_spec(&1.type)})
|> Enum.reduce(fn item, acc -> {:|, [], [item, acc]} end)

Generating 2 types, one hard-coded, one through my generated above:

      @type test_1 :: unquote(ts)
      @type test_2 :: {:user, String.t()} | {:top_p, float()} | {:temperature, float()}

Then using it in a function @spec definition like usual with [test_1] or [test_2].

Checking with t:

iex(65)> t MyMod.test_1
@type test_1() ::
        {:user, String.t()}
        | {:top_p, float()}
        | {:temperature, float()}
        | {:stream, boolean()}
        | {:stop, any()}
        | {:presence_penalty, float()}
        | {:n, integer()}
        | {:max_tokens, integer()}
        | {:logit_bias, map()}
        | {:frequency_penalty, float()}

iex(65)> t MyMod.test_2
@type test_2() ::
        {:user, String.t()} | {:top_p, float()} | {:temperature, float()}

So far so good, they look identical to me, with test_1 being longer

However, dialyzer/LSP are not catching the spec violation when using type_1, but are reporting incorrect usage (as they should) with type_2.

Tried to dig deeper and extracted the types with Code.Typespec.fetch_types:

   {:type, {:test_1,
    {:type, 0, :union,
     [
       {:type, 0, :tuple, [{:atom, 0, :user}, {:remote_type, 0, [{:atom, 0, String}, {:atom, 0, :t}, []]} ]},
       {:type, 0, :tuple, [{:atom, 0, :top_p}, {:type, 0, :float, []}]},
       {:type, 0, :tuple, [{:atom, 0, :temperature}, {:type, 0, :float, []}]},
       {:type, 0, :tuple, [{:atom, 0, :stream}, {:type, 0, :boolean, []}]},
       {:type, 0, :tuple, [{:atom, 0, :stop}, {:type, 0, :any, []}]},
       {:type, 0, :tuple, [{:atom, 0, :presence_penalty}, {:type, 0, :float, []}]},
       {:type, 0, :tuple, [{:atom, 0, :n}, {:type, 0, :integer, []}]},
       {:type, 0, :tuple, [{:atom, 0, :max_tokens}, {:type, 0, :integer, []}]},
       {:type, 0, :tuple, [{:atom, 0, :logit_bias}, {:type, 0, :map, :any}]},
       {:type, 0, :tuple, [{:atom, 0, :frequency_penalty}, {:type, 0, :float, []}]}
     ]}, []}}

   {:type, {:test_2,
    {:type, 610, :union,
     [
       {:type, 0, :tuple, [{:atom, 0, :user}, {:remote_type, 610, [{:atom, 0, String}, {:atom, 0, :t}, []]} ]},
       {:type, 0, :tuple, [{:atom, 0, :top_p}, {:type, 610, :float, []}]},
       {:type, 0, :tuple, [{:atom, 0, :temperature}, {:type, 610, :float, []}]}
     ]}, []}}

Looks also identical to me, with the difference that type_2 has line number annotations

Lastly converting back to quoted form to compare:

iex(67)> Code.Typespec.type_to_quoted x
{:"::", [],
 [
   {:test_1, [], []},
   {:|, [line: 0],
    [
      {:{}, [line: 0], [:user, {{:., [line: 0], [String, :t]}, [line: 0], []}]},
      {:|, [line: 0],
       [
         {:{}, [line: 0], [:top_p, {:float, [line: 0], []}]},
         {:|, [line: 0],
          [
            {:{}, [line: 0], [:temperature, {:float, [line: 0], []}]},
            {:|, [line: 0],
             [
               {:{}, [line: 0], [:stream, {:boolean, [line: 0], []}]},
               {:|, [line: 0],
                [
                  {:{}, [line: 0], [:stop, {:any, [line: 0], []}]},
                  {:|, [line: 0],
                   [
                     {:{}, [line: 0],
                      [:presence_penalty, {:float, [line: 0], []}]},
                     {:|, [line: 0],
                      [
                        {:{}, [line: 0], [:n, {:integer, [line: 0], []}]},
                        {:|, [line: 0],
                         [
                           {:{}, [line: 0], [:max_tokens, {:integer, ...}]},
                           {:|, [line: 0], [{:{}, ...}, {...}]}
                         ]}
                      ]}
                   ]}
                ]}
             ]}
          ]}
       ]}
    ]}
 ]}


iex(68)> Code.Typespec.type_to_quoted y
{:"::", [],
 [
   {:test_2, [], []},
   {:|, [line: 610],
    [
      {:{}, [line: 0],
       [:user, {{:., [line: 610], [String, :t]}, [line: 610], []}]},
      {:|, [line: 610],
       [
         {:{}, [line: 0], [:top_p, {:float, [line: 610], []}]},
         {:{}, [line: 0], [:temperature, {:float, [line: 610], []}]}
       ]}
    ]}
 ]}

Still don’t see why test_2 is working but test_1 isn’t. Is the construct just too long so dialyzer is choking on it?

/EDIT: Usage of the types like this:

      @type test_1 :: unquote(optional_args)
      @type test_2 :: {:user, String.t()} | {:top_p, float()} | {:temperature, float()}

      # test_1 or test_2 here
      @spec unquote(name)(unquote_splicing(spec), [test_1]) :: {:ok, unquote(response_spec)} | {:error, any()}

I’ve been staring at this a bit too long. Any ideas?

LostKobrakai · March 5, 2023, 8:38am

It might not be choking, but dialyzer does simplify types if they become to complex. Can’t answer when exactly that happens though.

dvcrn · March 5, 2023, 9:10am

I did notice something. When I call my function like this:

  def foo do
    case do_the_fx([], "foo", temperature: "abc", user: 123) do
      {:ok, res} -> IO.inspect(res)
      {:error, reason} -> IO.inspect(reason)
    end
  end

If I use keys that are not in the typespec, I’m getting no_return Function foo/0 has no local return. , but not the usual “xxx breaks contract” error that I usually get when passing invalid types
If I use the keys that are in the typespec but use any type (for example “abc” when float() is expected, it reports no errors
If I hardcode all types to be String.t(), then use 123 instead of “abc”, I’m getting the “has no local return.” error again

So it looks like dialyzer does something with it…

LostKobrakai · March 5, 2023, 9:14am

The simplification I mentioned before might be {:user | :top_p | …, term}, which would explain the behaviour.

dvcrn · March 8, 2023, 1:41pm

Noone an idea what could be going on here?

jhogberg · March 8, 2023, 2:13pm

@LostKobrakai is right, you’ve got too many alternatives in the union so dialyzer stops keeping the individual tuples apart, flattening it to {:user | :top_p | :temperature | ... etc, String.t() | float() | boolean() | ... etc}.

This is a rather hairy part of dialyzer. It’s important to understand that dialyzer is not a type checker, that “type specs” were glued on after the fact, and that as far as dialyzer is concerned they’re just the same constraints it figures out on its own when analyzing code.

The latter is why this happens: if things get too complicated when analyzing your code, it simplifies the constraints to prevent analysis from taking forever. Consider the following code (apologies for the Erlang):

foo(0) -> zero;
foo(1) -> one;
foo(2) -> two;
... and so on ...
foo(9000000000000000000000000) -> nineseptillion.

Having a union of nine septillion different atoms is not going to perform very well, so it flattens them all to atom() after a certain point (similar things are done for integers, tuples, maps, etc). If things are especially bad, it will even flatten to any().

When dialyzer takes a look at your type spec it applies the same logic.

dvcrn · March 8, 2023, 10:55pm

Thanks for the explanation!

So I guess to continue having stricter checks for this, we need to purposely make the spec more complex to trick dialyzer into not optimizing/flattening it away? The amount of keys here is 10 which I hoped wouldn’t be considered as “too many”