How do you check that a MapSet is a MapSet?

I have a function where I do:

def recomputeCapacity(%MapSet{} = mCallIds) do
    cu =
      mCallIds
      |> MapSet.to_list()

And Dialyzer says that the function can never complete (and all the functions calling it…) because:

The call 'Elixir.MapSet':to_list
         (_mCallIds@1 :: #{'__struct__' := 'Elixir.MapSet', _ => _}) does not have an opaque term of type 
          'Elixir.MapSet':t(_) as 1st argument (ElixirLS Dialyzer)

Now, if mCallIds was an integer or something I’d use a guard is_integer(), but how do I make sure that only a MapSet can match this function?

I could have a

@spec recomputeCapacity( MapSet.t(any) ) :: number

But it would not prevent me calling the function by mistake, and if I leave the %MapSet{} = mCallIds Dialyzer keeps complaining.

I know that MapSet is opaque so I should not pattern-match on it (though - annoyingly - it works…) but what is the alternative?

This is a good question. The pattern match you are doing, on %MapSet{}, is indeed the correct way to ensure that only MapSet structs can be passed to the function. It is OK to pattern-match on it. It is only wrong to match on any particular fields the struct may or may not have (as these may change in future versions without warning).
This is the same for all opaque structs (and in this case mentioned explicitly at the end of the module’s documentation as well.)

Furthermore, your spec, using MapSet.t(any), is correct. It is an opaque type (so you’re not supposed to depend on its details), but internally it also is the same as a %MapSet{}-struct.

I’m not immediately seeing where Dialyzer’s problem comes from. What you could try is to use the type MapSet.t() instead of MapSet.t(any). It means the same, but maybe it confuses Dialyzer less in this situation?

It looks like it’s the match itself that Dialyzer doesn’t like, not just the match in the function signature. This doesn’t work either:

  @spec recomputeCapacity(MapSet.t()) :: number
  def recomputeCapacity( mCallIds)  do

    %MapSet{} = mCallIds

    current_capacity_used =
      mCallIds
      |> MapSet.to_list()

Is this a bug? Do I just switch the checks off for now?

I wonder if using when is_struct(mCallIds, MapSet) results in the same warning?

It seems like this is a known limitation of Dialyzer.

What I said about pattern-matching on a struct that is defined as an opaque type is correct in theory, but in practice it seems that Dialyzer does not follow this logic correctly in certain situations.

To be precise: an opaque type hides even whether something is a struct or not. (The Erlang/Dialyzer type system, which Elixir inherited, does not treat structs differently from maps.) Thus the pattern match is not strictly allowed. This makes the MapSet.t type impossible to use in combination with any pattern match.

Personally I’d keep the runtime check (because it is enforced rather than opt-in), and change the @spec to use %MapSet{optional(any) => any()} instead.

Very much this. Opaque types are meant to be treated as completely opaque. It could be a struct today and a tuple tomorrow. Therefore even checking for the struct type is a violation of the opaqueness. It’s unlikely for such a change to happen for MapSet, but that’s beside the point of what Dialyzer tries to assert.

Just for an example: ets at some point changed the underlying datatype for :ets.tid. So changes like switching even the underlying datatype do indeed happen sometimes.

I agree - but shouldn’t an opaque type offer an is_me?() method, so that you can know that it is itself?

Yes it does.

So you’d simply switch off the warning?

Maybe. The problem with MapSet imo is less how opaque types work though, it’s that MapSet is not actually meant to be a full opaque type.

MapSet being a struct is a stable contract. Only the other keys on the map are considered implementation details. That’s a use case Dialyzer is not able to handle: partial opaqueness in maps, per key. If MapSet were an opaque data structure in the sense Dialyzer treats it, then you’d never see any documentation matching on it being a struct in the first place, or any other reference to its underlying datatype.

E.g. I’d never check if a tid is a valid ets tid. I don’t even know what to check it by. I just use it as one, and if it blows up it might not have been one.

No, I’d rewrite the @spec to use a type different from MapSet.t() to circumvent the limitation in Dialyzer.

So instead of using the opaque type

@spec recomputeCapacity( MapSet.t(any) ) :: number

, using the non-opaque type

@spec recomputeCapacity(%MapSet{optional(any) => any}) :: number

Depends on what you mean by “calling”: the call to MapSet.to_list/1 would fail on exactly the same kind of MatchError that the pattern match in your function would, so the result would be the same.

Like @LostKobrakai said, sometimes you don’t need to check every argument exactly at every layer. Let it crash.

However, a MatchError will look like a mistake in the code of the module, whereas an ArgumentError or FunctionClauseError (usually) indicates a mistake in the calling code.
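To make that difference concrete, here is a small sketch. The module and function names (ErrorKinds, in_head/1, in_body/1, error_kind/1) are made up for illustration. A struct pattern in the function head rejects a bad argument with a FunctionClauseError, while the same pattern used with the match operator in the body raises a MatchError.

```elixir
defmodule ErrorKinds do
  # Hypothetical names, for illustration only.

  # Struct check in the function head: a bad argument fails clause selection.
  def in_head(%MapSet{} = set), do: MapSet.size(set)

  # Struct check inside the body: a bad argument fails the match operator.
  def in_body(set) do
    %MapSet{} = set
    MapSet.size(set)
  end

  # Returns the exception module raised by calling `fun`, or :ok if none.
  def error_kind(fun) do
    fun.()
    :ok
  rescue
    e -> e.__struct__
  end
end

ErrorKinds.error_kind(fn -> ErrorKinds.in_head([1, 2]) end)
# FunctionClauseError
ErrorKinds.error_kind(fn -> ErrorKinds.in_body([1, 2]) end)
# MatchError
```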

I understand the nuanced details about opaque types in Dialyzer, but I don’t understand the paranoia. Yes, the author will have to update code when the version of Elixir rolls, but that is often necessary anyway.

First, there should be a way to disable these warnings from dialyzer, even for guards. You will get compilation errors when the internals change, so no need for a double warning.

Yes, there should be guards provided by every significant type in the language…

Precursor, we need an Elixir guard for Erlang map_get. Not sure why it’s not there already. Perhaps there was discussion on Elixir lang group?

For example, Set could be deprecated, and MapSet needs guards equivalent to:

  defguard is_set(s) when is_struct(s, MapSet)

  defguard set_size(s) when map_size(:erlang.map_get(:map, s))

  defguard is_set_empty(s) when is_set(s) and set_size(s) == 0

  defguard is_set_nonempty(s) when is_set(s) and set_size(s) != 0

  defguard is_set_member(s,x) when is_set(s) and is_map_key(:erlang.map_get(:map, s), x )

Obviously is_set_empty and is_set_nonempty could be implemented using equality comparison with the empty set, without using set_size. But note that both are needed, because is_set_nonempty is not necessarily equal to not is_set_empty. They can both be false if you pass an int by mistake.
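As a rough illustration of that last point, the sketch below collects the guards in a hypothetical module (SetGuards) and uses them from a second hypothetical module (SetGuardsDemo). It assumes MapSet keeps its elements in an internal :map field, which is an implementation detail and not a documented contract.

```elixir
defmodule SetGuards do
  # The guards proposed above, gathered in a hypothetical module.
  # They read MapSet's internal :map field, which is an implementation
  # detail and may change between Elixir versions.
  defguard is_set(s) when is_struct(s, MapSet)
  defguard set_size(s) when map_size(:erlang.map_get(:map, s))
  defguard is_set_empty(s) when is_set(s) and set_size(s) == 0
  defguard is_set_nonempty(s) when is_set(s) and set_size(s) != 0
end

defmodule SetGuardsDemo do
  import SetGuards

  # Both set guards are false for a non-set, so it reaches the last clause.
  def classify(s) when is_set_empty(s), do: :empty
  def classify(s) when is_set_nonempty(s), do: :nonempty
  def classify(_), do: :not_a_set
end

SetGuardsDemo.classify(MapSet.new())     # :empty
SetGuardsDemo.classify(MapSet.new([1]))  # :nonempty
SetGuardsDemo.classify(42)               # :not_a_set
```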

Finally, and speculatively, it would be nice to have a better literal format for sets (MapSet). Most obvious delimiters are already taken ([] lists, {} tuples, <> binaries) so it would have to be a compound delimiter, like maps %{}. I don’t have a strong preference, but something like (| ... |) might work - I know bananas, lenses, envelopes and barbed wire : )

P.S. Some of those extensions would help the case when you mistype MapSet as Map, but currently get no warning, which leads to mysterious bugs.

For example: this is a silent mistake, but would benefit from an is_set guard on the mutating function:

s = MapSet.new([:foo, :bar])
# ... much later in a typo galaxy far, far away ...
t = s |> Map.delete(:foo) |> Map.filter(& &1 == :bar) 
# ... consider and try to guess what you will see ...
IO.inspect(s)
IO.inspect(t)
IO.inspect(map_size(s))
IO.inspect(map_size(t))
IO.inspect(set_size(s))
IO.inspect(set_size(t))
IO.inspect(MapSet.to_list(s))
IO.inspect(Map.to_list(s))
IO.inspect(Map.to_list(t))

You can just write map.key in your guard clause in elixir, no guard function needed.

Ah, yes, I rarely use the . notation, so did not occur to me (Erlang habits : )

P.S. Also fixed my typo in the first comment, should be map_get not map_key.

For the record, here are the guards using dot notation.

It’s possible some people (including me) would prefer a consistent prefix in the guards, so is_set_empty and is_set_nonempty, updated accordingly:


  defguard is_set(s) when is_struct(s, MapSet)

  defguard set_size(s) when map_size(s.map)

  defguard is_set_empty(s) when is_set(s) and set_size(s) == 0

  defguard is_set_nonempty(s) when is_set(s) and set_size(s) != 0

  defguard is_set_member(s,x) when is_set(s) and is_map_key(s.map, x )

I think . notation can only be used for constant known keys?
In this case, the key .map is known, because MapSet is a fixed struct.

But in general, the . notation is not a complete replacement for Map.fetch!.

So the map_get Elixir guard is still needed when the key is a variable (from context, or an argument to the guarded function).
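A minimal sketch of the two cases, with hypothetical module and function names (KeyGuards, named?/1, value_is?/3). It calls :erlang.map_get/2 directly in the guard, since there is no Kernel wrapper for it.

```elixir
defmodule KeyGuards do
  # Hypothetical module; contrasts a literal key with a runtime key in guards.

  # `m.name` works because the key is a literal atom written in the source.
  def named?(m) when m.name == "x", do: true
  def named?(_), do: false

  # For a key held in a variable, :erlang.map_get/2 is called in the guard.
  # If the key is missing or `m` is not a map, the guard simply fails.
  def value_is?(m, key, expected) when :erlang.map_get(key, m) == expected, do: true
  def value_is?(_, _, _), do: false
end

KeyGuards.named?(%{name: "x"})           # true
KeyGuards.value_is?(%{a: 1}, :a, 1)      # true
KeyGuards.value_is?(%{a: 1}, :b, 1)      # false
```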

(| ... |) would not work, cannot reuse function parens ().

Maybe ${ ... }