How to think about pattern matching vs. type-checking

fireproofsocks · March 13, 2019, 6:19pm

This is more of a general question, but I’m wondering how other people in the community think about the pattern matching in function signatures.

Pattern matching is fairly straightforward when you match on simple values, e.g.
def something([]), do: "Empty!"

I start to get some mental friction when I look at type-hinting when there are functions like this:

def something(c = %Plug.Conn{}), do: "Something with the conn"

At first glance, I would think that c contains the empty struct, but of course, it will have the FULL value of whatever was passed to the function so long as the input was of the proper type. In other words, it’s not really a pattern-match at all, it’s a type hint.

Granted, my confusion here is probably the baggage of seeing that syntax used not for type-hinting, but for supplying a default value in many other languages (e.g. PHP, Ruby, Python).

The pattern matching/type-hinting gets a bit stranger for me when it gets nested inside tuples. Consider the following example:

my_tuple = MyContext.get_resource_as_tuple()

case my_tuple do
  {:ok, resource = %{status_id: "valid"}} -> resource
  {:ok, %{status_id: status}} -> "Boo. Status #{status} is not valid."
  {:error, msg}  -> "Error: #{msg}"
end

Again, the resource = %{status_id: "valid"} looks more like an assignment, and I have to remind myself how it actually works. Go, for example, omits the equals sign and puts the type after the variable when it is used as part of a type-check. PHP puts the variable type in front of the variable when it’s used as part of a type-check.

How do others think about this when they’re walking through code?

kokolegorille · March 13, 2019, 6:59pm

I use the other way around…

{:ok, %{status_id: "valid"} = resource}

It reminds me of JS destructuring, and I got used to this form. It does not look like an assignment.

dimitarvp · March 13, 2019, 7:10pm

I just had to get used to it. As @kokolegorille mentioned, this is a destructuring statement; the assignments inside the pattern are giving you partial matches on the exact piece of the data inside the bigger piece of data.

IMO Erlang/Elixir pattern matching isn’t type checking. It’s more like asserting the shape of the data itself. Any type checking along the way is a nice bonus.

(As an example, you can use pattern matching with map syntax in your function head and it will happily accept both a map and a struct, if the struct has the exact same keys that your pattern matching expression requires.)
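A minimal sketch of that last point, assuming a hypothetical `Account` struct and `Shape.name_of/1` function (neither appears in this thread): a map pattern in a function head accepts a plain map and a struct alike, as long as the key the pattern requires is present.

```elixir
defmodule Account do
  # Hypothetical struct, defined only for this illustration.
  defstruct [:name, :email]
end

defmodule Shape do
  # The map pattern only asserts that a :name key exists;
  # it says nothing about which struct (if any) the value is.
  def name_of(%{name: name}), do: name
end

IO.inspect(Shape.name_of(%{name: "Ada", role: :admin}))  # => "Ada"
IO.inspect(Shape.name_of(%Account{name: "Grace"}))       # => "Grace"
```

Changing the head to `%Account{name: name}` would make the first call fail, which is the "shape plus a bit of type checking" trade-off described above.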

kokolegorille · March 13, 2019, 7:11pm

It is destructuring in JS, but a pattern match in Elixir.

axelson · March 13, 2019, 7:43pm

Also, the reason that %{} matches any map is that otherwise pattern matching on maps wouldn’t be very useful: you would never be able to do a partial match, and you’d always have to define all the keys even if you aren’t interested in them. I.e., if map patterns had to match exactly, this would give a MatchError:

%{result: result} = %{result: 42, errors: []}
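For contrast, here is a small sketch of how map matching actually behaves today (the `data` and `outcome` names are just for illustration): the partial match succeeds, and a MatchError only shows up when the pattern asks for a key the map does not have.

```elixir
data = %{result: 42, errors: []}

# Succeeds: the pattern only requires the :result key to be present.
%{result: result} = data
IO.inspect(result)  # => 42

# Fails: the pattern requires a key that is missing from the map.
outcome =
  try do
    %{missing: value} = data
    {:matched, value}
  rescue
    MatchError -> :no_match
  end

IO.inspect(outcome)  # => :no_match
```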

stefanchrobot · March 13, 2019, 8:22pm

Actually, it is a pattern match, because if you try to pass a plain map it will fail to match (in a function head that means a FunctionClauseError). If you define a struct, you can match on the struct type or you can work with any maps:

defmodule User do
  defstruct [:id, :name]

  # more (runtime) type-checking and safety
  def say_hello(%User{name: name}), do: "my name is #{name}"

  # works with any map that has the :name key
  # more extensible, but then - should this function live here or elsewhere?
  def say_hello(%{name: name}), do: "my name is #{name}"
end

So it’s really up to you to decide which approach works best for what you’re trying to achieve.

As a side note, even though def something(c = %Plug.Conn{}) and def something(%Plug.Conn{} = c) are technically the same, I always strongly push for the latter, since it’s more intuitive and more in line with pattern matching inside of a function (from right to left):

def some_func(...) do
  %Plug.Conn{} = c  # pattern match c: it has to be a Plug.Conn
  c = %Plug.Conn{}  # rebinding c to empty Plug.Conn
end

rvirding · March 14, 2019, 9:00am

I quite agree with the opinion that writing %Plug.Conn{} = c feels much better in a pattern match, it is how you would write the match in code. Though some prefer the other way as they see it as first matching then binding the variable. But they are wrong.

Also, I just want to point out that you can use the = alias in any pattern, anywhere, so you can write patterns like {a, b, c} = t and [%Plug.Conn{} = c | rest]. You can have your cake and eat it.

I do just want to stress that both ways result in the same code so there is no “better” choice wrt efficiency.
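As a rough illustration of using the `=` alias at several levels of one pattern, here is a sketch with a hypothetical `Nested.describe/1` function (the module and the sample data are assumptions, not something from this thread):

```elixir
defmodule Nested do
  # `= whole` binds the complete tuple, `= first` binds the first list
  # element, and the map pattern pulls :id out of that same element.
  def describe({:ok, [%{id: id} = first | rest]} = whole) do
    %{id: id, first: first, rest: rest, whole: whole}
  end
end

IO.inspect(Nested.describe({:ok, [%{id: 1, name: "a"}, %{id: 2, name: "b"}]}))
```

Every binding refers to a piece of the same argument, so nothing is copied or re-matched later in the function body.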

Qqwy · March 14, 2019, 9:06am

As a side note: Elixir has a syntax for default values for function arguments as well, it’s \\:

def foo(required, optional \\ 42) do
  IO.inspect({required, optional})
end

tcoopman · March 14, 2019, 1:26pm

Something that I found a bit confusing in the beginning was that %{} matches any map, but [] matches an empty list.

NobbZ · March 14, 2019, 1:31pm

It’s even worse once you use them in types vs match…

[foo] in a match means a list with exactly one element, in a type though it means a list of items of type foo, this list can be empty or have arbitrarily many elements.

%{} in a pattern match means any map, empty or not, as a type though it means the empty map, literally.

I got used to it, but still sometimes fall into this pit…
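To put the two readings side by side, here is a small sketch using a hypothetical `Brackets` module (all names are made up for illustration). The typespec uses `[integer()]` in the "list of any length" sense, while the function heads use `[_x]` and `%{}` in the pattern sense.

```elixir
defmodule Brackets do
  # In a typespec, [integer()] means a list of integers of any length,
  # including the empty list.
  @spec sum([integer()]) :: integer()
  def sum(list), do: Enum.sum(list)

  # In a pattern, [_x] means a list with exactly one element.
  def single?([_x]), do: true
  def single?(list) when is_list(list), do: false

  # In a pattern, %{} matches any map, empty or not.
  def map?(%{}), do: true
  def map?(_other), do: false
end

IO.inspect(Brackets.sum([]))           # => 0
IO.inspect(Brackets.single?([1, 2]))   # => false
IO.inspect(Brackets.map?(%{a: 1}))     # => true
```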

OvermindDL1 · March 14, 2019, 2:22pm

That’s because lists have a construct to match non-empty lists, namely [_|_]; there is no such syntax for maps. If there were, I could see maps operating like lists, with something borrowed from another language, perhaps %{_ => _}. However, needing to match only the empty map is extremely rare in Elixir, if it comes up at all, so having %{} match any map seems more useful, unlike lists, where matching the empty list is extremely common.
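For the rare case where only the empty map should match, one option is a guard with `map_size/1`. The sketch below uses a hypothetical `Emptiness` module to show the list and map variants next to each other.

```elixir
defmodule Emptiness do
  # Lists: [] matches only the empty list, [_ | _] only non-empty ones.
  def list_kind([]), do: :empty_list
  def list_kind([_ | _]), do: :non_empty_list

  # Maps: %{} alone would match every map, so the empty case needs a guard.
  def map_kind(map) when map_size(map) == 0, do: :empty_map
  def map_kind(%{}), do: :non_empty_map
end

IO.inspect(Emptiness.list_kind([1]))      # => :non_empty_list
IO.inspect(Emptiness.map_kind(%{}))       # => :empty_map
IO.inspect(Emptiness.map_kind(%{a: 1}))   # => :non_empty_map
```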

fireproofsocks · March 14, 2019, 3:53pm

What would make more sense to me would be to drop the equals sign in the cases where we’re doing a kind of type-hint.

E.g. if we omitted the equals sign:

def foo(bar %Plug.Conn{}), do: "matched when input is a plug"
def foo(bar %Ecto.Changeset{}), do: "matched when input is a changeset"
def foo(bar %{status: "valid"}), do: "matched when input is a map with a status key with a value of valid"
# ... etc...

In all cases, bar gets the full input, it just happened to be filtered according to the type – conceptually something like a guard clause.

That would be more similar to how other languages do the type-hinting (e.g. Go and PHP, although PHP puts the type to the left of the variable).

There still is pattern matching going on there, but it’s not happening directly in the function; it’s happening immediately before, when the runtime is choosing which function signature matches. This only comes up in cases where you need to get the full value of a variable, but you need to do some pre-emptive filtering.

easco · March 18, 2019, 5:00pm

It’s not a kind of type hint. You’re actually defining a head on the function that ONLY matches when passed data that matches the given pattern. If no function head matches that pattern then the system will throw an exception. To my way of thinking that’s much stronger than a “hint”.

The pattern match can also create bindings:

def foo(%{status: status} = whole_thing), do: "The status given was #{inspect status} and the whole thing is #{inspect whole_thing}"

There’s a lot more going on in that case than just providing a hint about what type should be used.
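A short sketch of both effects, assuming a hypothetical `Heads` module: a matching call gets the extracted binding and the whole value, and a non-matching call raises a FunctionClauseError instead of running the body.

```elixir
defmodule Heads do
  # Only arguments that are maps with a :status key reach this body.
  def foo(%{status: status} = whole_thing), do: {status, whole_thing}
end

IO.inspect(Heads.foo(%{status: "valid", id: 7}))

# No head matches a map without :status, so the call itself raises.
outcome =
  try do
    Heads.foo(%{id: 7})
  rescue
    FunctionClauseError -> :no_matching_head
  end

IO.inspect(outcome)  # => :no_matching_head
```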

fireproofsocks · March 18, 2019, 6:02pm

Yes, I know this is more than a type-hint. I’m just trying to make sense of what I feel is a confusing syntax. When you start having to teach this stuff to coworkers and/or students, you become really sensitive to anything like this that creates mental friction and slows down understanding.

gregvaughn · March 18, 2019, 6:12pm

Pattern matching should be a new mind-bending feature to most new students of Elixir. One of the things I appreciated about the Programming Elixir book is that it covers that clearly and comprehensively right near the beginning. If you oversimplify that when teaching, then the student will never really feel comfortable with Elixir.

fireproofsocks · March 25, 2019, 6:46pm

I had a bit of an epiphany re the syntax involved here. If I write out a series of matches like this:

iex(4)> m = %{foo: foo} = %{foo: "bar"}
%{foo: "bar"}
iex(5)> foo
"bar"

then it becomes more obvious how the matching works right to left. You can see it will fail when the input (on the right) fails to be matched to the structures on its left.

If I squint, I can imagine that the right-most match is what happens when Elixir is figuring out which function to call (i.e. which function definition matches the value being passed).

# myfunc(%{foo: "bar"})  <--- matches
# myfunc(%{fizz: "buzz"})  <--- does not match
def myfunc(m = %{foo: foo}), do: "My match!"

I’m probably just late to the party, but I thought I’d share.

LostKobrakai · March 26, 2019, 8:35am

There’s no “right-most” match for function heads. The better mental model is what the compiler actually does with many function heads: move them into one case statement.

def myfunc(param_1) do
  case {param_1} do
    {m = %{foo: foo}} -> "My match!"
    […]
  end
end

It’s even more apparent if you do some more esoteric matches like %{foo: foo} = %{fizz: fizz} in the function’s head. The only difference to actually writing it like that is a slightly different error if there’s no match, afaik.

The only part where location actually matters is for inline matches:

%{foo: foo} = %{fizz: fizz} = %{foo: "test_1", fizz: "test_2"}
# foo = "test_1", fizz = "test_2"
%{foo: foo} = %{fizz: fizz} = %{foo: "test_1"}
# Fails on match

Here the right-most data is matched to everything on the left.

rvirding · March 27, 2019, 12:04am

Yes, the meaning of the = is different in patterns and in the inline use.

In the inline use the = actually has the syntax pattern = expression where the expression on the RHS is first evaluated and then the value of the expression is matched against the pattern. So it has a very strictly defined right-to-left semantics.

However, in a pattern it is an alias where both sides are pattern matched and both matches must succeed. All the variables in both patterns are bound if the matches succeed. It is commonly used for the case where you want to match and extract parts of a structure and have a reference to the whole, like in your examples with maps and structs. It lets you have your cake and eat it.

Using the same operator perhaps wasn’t the smartest thing but it was inherited from Erlang so you can blame them (me).
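A minimal sketch of the alias behaviour inside patterns, using a hypothetical `PatternAlias` module: the order of the two sides does not matter, and both sides may themselves be patterns that must each match the same argument.

```elixir
defmodule PatternAlias do
  # Same result whichever side the variable is written on.
  def left(%{x: a} = whole), do: {a, whole}
  def right(whole = %{x: a}), do: {a, whole}

  # Both sides are patterns; the argument must satisfy both.
  def both(%{x: a} = %{y: b}), do: {a, b}
end

arg = %{x: 1, y: 2}
IO.inspect(PatternAlias.left(arg) == PatternAlias.right(arg))  # => true
IO.inspect(PatternAlias.both(arg))                             # => {1, 2}
```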

stefanchrobot · March 27, 2019, 11:24pm

Since the right-to-left semantics only apply to the rightmost term, these two are equivalent:

iex> %{x: a} = %{} = %{x: 1, y: 2}
iex> %{} = %{x: a} = %{x: 1, y: 2}

So given the following function:

def f(%{x: a} = %{}) do
  a + 1
end

b = f(%{x: 1, y: 2})

if we were to inline it, it’s all consistent and makes sense:

%{x: a} = %{} = %{x: 1, y: 2}
b = a + 1

NobbZ · March 28, 2019, 8:32am

They apply to the whole term.

BUT, %{} = %{x: 1, y: 2} is an expression, returning the value given on the right.

This is why you observed the behaviour you observed.
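A small sketch to make that concrete (the variable names are only for illustration): a match is an expression whose value is the right-hand side, and `=` groups to the right, so the chained form is the same as nesting the inner match in parentheses.

```elixir
# The match expression evaluates to the value on its right.
value = (%{} = %{x: 1, y: 2})
IO.inspect(value == %{x: 1, y: 2})  # => true

# The chained form ...
%{x: a} = %{} = %{x: 1, y: 2}

# ... is equivalent to matching against the result of the inner match.
%{x: b} = (%{} = %{x: 1, y: 2})

IO.inspect({a, b})  # => {1, 1}
```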