Idiomatic Pattern Matching: function `def` vs. `case`

kredati · June 11, 2017, 8:21pm

Hello! I’m Scott. I’m new to Elixir, and to the forums.

I have a question I’ve been searching for here and on StackOverflow, and I haven’t managed to find anything on it.

I’m wondering what’s the best/most idiomatic location/method for pattern matching: using multiple pattern-matched function definitions, or using a single function with a case statement? To take an example from my solution to the exercism.io Elixir stream’s Rotational Cipher problem:

defp shift_by(char, amount) when char in @lower_case do
  char |> shift_from_base(?a, amount)
end

defp shift_by(char, amount) when char in @upper_case do
  char |> shift_from_base(?A, amount)
end

defp shift_by(char, _), do: char

But this could just as easily be written:

defp shift_by(char, amount) do
  case char do
    char when char in @lower_case -> char |> shift_from_base(?a, amount)
    char when char in @upper_case -> char |> shift_from_base(?A, amount)
    _ -> char
  end
end

Ignoring any infelicities or good abstractions I’ve failed to muster (and the actual details of the implementation), I’m wondering which is the more idiomatic way of writing this? Are there advantages to using pattern matching in function definitions instead of case statements, or vice versa?

To my eye, multiple function definitions read better, but I’ve noticed that on exercism, people tend to put their pattern matching in a case. Maybe my preference for putting pattern matching in function defs is just my JavaScript prejudice against switch blocks showing?

Thanks!

kokolegorille · June 11, 2017, 8:33pm

Have a look at this one

http://learnyousomeerlang.com/syntax-in-functions

and this one

Often the solution comes from Erlang…

In short, it says that a guard can have multiple clauses while a case has a single clause.

cmkarlsson · June 11, 2017, 8:38pm

Generally it doesn’t matter much, and you should use whichever shows intent best or makes the code clearer.

One big advantage of using function headers is that you can trace them. Elixir/Erlang tracing is a very powerful tool, both in development and in production.

As you are new to Elixir it may not be the first thing you jump into, but as you progress it is something you should add to your toolbox. In fact, I hardly ever use debug logging any longer, in favour of just tracing when needed.
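
As a rough illustration of the tracing point, here is a minimal sketch using Erlang’s `:dbg` module from an Elixir script or IEx session. The `TraceDemo` module and its simplified `shift_by/2` are made up for this example (the alphabet wrap-around is left out), and it assumes the `runtime_tools` application that ships with Erlang/OTP is available. Each clause is reached through a real function call, so the tracer can report the call with its arguments and return value. A `case` buried in the middle of a larger function body has no call of its own to trace.

```elixir
defmodule TraceDemo do
  # Hypothetical, simplified stand-in for the cipher example (no wrap-around).
  def shift_by(char, amount) when char in ?a..?z, do: char + amount
  def shift_by(char, amount) when char in ?A..?Z, do: char + amount
  def shift_by(char, _amount), do: char
end

# Start the default tracer, which prints trace messages to the console.
:dbg.tracer()

# Enable call tracing (:c) for all processes.
:dbg.p(:all, :c)

# Set a trace pattern on shift_by (any arity, local calls included);
# the :x shorthand also reports return values and exceptions.
:dbg.tpl(TraceDemo, :shift_by, :x)

TraceDemo.shift_by(?a, 1)
TraceDemo.shift_by(?!, 1)

# Stop tracing again.
:dbg.stop()
```

The trace messages are delivered asynchronously, so in a short script they may appear after the traced calls have already returned.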

kredati · June 11, 2017, 8:54pm

Thanks so much to you both! TIL: not definitions, but clauses; searching for related Erlang problems is useful; tracing is something I should know about. Yay!

I’m especially keen to look into tracing. Coming from JS, I am used to an absolutely terrible debugging experience. (“Cannot find property foo of undefined.”) (I got into functional JS, and wanted to learn a “real” functional language, hence Elixir, which I am loving.)

sudostack · June 12, 2017, 8:12am

Another thing to note is that the version you have with case would (IMHO) be more clear using just cond.
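
For comparison, a sketch of what the `cond` variant might look like. The module name, the public `rotate/2` wrapper, the `@lower_case` / `@upper_case` attributes and the body of `shift_from_base/3` are assumptions filled in to make the example self-contained; the original post does not show them. It also assumes a non-negative `amount`.

```elixir
defmodule RotationalCipher do
  # Assumed definitions; the original post does not show these attributes.
  @lower_case ?a..?z
  @upper_case ?A..?Z

  # Assumed public entry point, added so the sketch can be run.
  def rotate(text, amount) do
    text
    |> String.to_charlist()
    |> Enum.map(&shift_by(&1, amount))
    |> List.to_string()
  end

  # cond evaluates each condition in order and runs the first truthy branch.
  defp shift_by(char, amount) do
    cond do
      char in @lower_case -> shift_from_base(char, ?a, amount)
      char in @upper_case -> shift_from_base(char, ?A, amount)
      true -> char
    end
  end

  # Assumed implementation: rotate within the 26-letter alphabet starting at base.
  defp shift_from_base(char, base, amount) do
    base + rem(char - base + amount, 26)
  end
end
```

With these assumptions, `RotationalCipher.rotate("abc", 13)` returns `"nop"`. Since there is no value to destructure here, only boolean tests on `char`, `cond` avoids the redundant `case char do char when ...` rebinding.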

NobbZ · June 12, 2017, 8:24am

The question on SO you linked is totally unrelated to the question here. The questions on SO can’t be translated to Elixir directly, but come close to “Shall I use when or if?”.

The question asked in the OP is whether one should prefer pattern matching, with or without guards, in a function clause or in a separate case inside the function’s body.

The answer to this question is not easy. But I tend to use them in a function clause, especially when the case would be the only “thing” in the function’s body anyway. I simply try to avoid unnecessary nesting.

If there is something else going on in the function body, I often start with a case to make it work quickly, gathering the different possibilities that I expect and discover by testing in a single place, alongside the code that leads to them. But later on, when I have the feeling that everything is well tested and works as I’d expect, I often pull those cases out into defp helper functions which then use pattern matching in the clauses.

At the end, I have to admit, this is mostly something of personal style and both ways are totally fine, as long as you use them consistently throughout your project.

aseigo · June 12, 2017, 9:41am

Adding to what others have said … and with the caveat that this is very much just mho …

I find that I use conditionals more often when they are part of the function body and not responsible for the return value of the function. Take this contrived example:

def do_something(x) do
    y = 
        case x do
             value when is_number(value) -> value
             value when is_bitstring(value) -> String.to_integer value
         end

    # perhaps some other code
    y*2
end

If that is the only place the code uses what that case is doing, then I will often enough leave it as-is. The case is there to produce an intermediate result which is then processed into the final return value, and as such it makes sense to keep it within the function. I find that such conditional structures often require / want access to various bits of data / variables in the function, so it is easier to keep the code in that function to access that data directly rather than try to create an over-generalized set of pattern-matching function headers:

def do_something(x, precision) do
    y = 
        case x do
             value when is_number(value) -> x
             value when is_bitstring(value) -> 
                 {float, _ } = Float.parse value
                 Float.round float, precision
         end

    # perhaps some other code
    y*2.0
end

Since only one branch cares about the precision, creating a helper function would mean that most of the functions would have an unused / meaningless parameter being passed in. As the number of such variables grows, I find the readability of pattern matching functions decreases due to the noise-to-signal ratio in the function headers.

However, I default to using pattern matching function headers when:

a) it creates the final result of the function:

def do_something(x), do: mult(x, 2)

def mult(x, multiplier) when is_number(x), do: x * multiplier
def mult(x, multiplier) when is_bitstring(x), do: String.to_integer(x) * multiplier

I find this is often easier to read / reason about and it is “self-documenting” in that it is clear that the intention of do_something is to double its input, while creating re-usable code in the form of mult/2

b) the code should be / is used in more than one place; functions are the obvious and simple way to share common functionality, so if an action-on-conditional is not unique to the function in question then I immediately opt for pattern matching function headers

c) the “host” function is a series of conditionals; breaking those into named functions helps document the code clearly and allows each step to be independently tested, something that is far harder to do when a function contains a waterfall of conditionals

or: readability, reusability, testability.
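
To make point (c) a little more concrete, here is a small sketch of a “waterfall” split into named steps. The `Signup` module and all of its function names are hypothetical, invented purely for illustration. Each step uses pattern-matched clauses and can be called and tested on its own.

```elixir
defmodule Signup do
  # Hypothetical example: every step is a small function with its own clauses.

  # Step 1: clean up the input, treating non-strings as empty.
  def normalize(name) when is_binary(name), do: String.trim(name)
  def normalize(_other), do: ""

  # Step 2: reject empty names.
  def validate(""), do: {:error, :empty_name}
  def validate(name), do: {:ok, name}

  # Step 3: build the final message from the validation result.
  def greet({:ok, name}), do: "Hello, " <> name <> "!"
  def greet({:error, reason}), do: "Cannot greet: " <> Atom.to_string(reason)

  # The host function is now just the pipeline of named steps.
  def welcome(input), do: input |> normalize() |> validate() |> greet()
end
```

Here `Signup.validate("")` can be checked in isolation (it returns `{:error, :empty_name}`) without going through `welcome/1`, which would be harder if all three steps were nested conditionals in one body.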

There are other details that influence my decisions between conditionals and functions, but this is the core of it for me. It’s obviously a bunch of judgement calls, and my use of functions has certainly evolved with experience.

This probably would have made a better blog entry than a comment. Sorry for the length.

NobbZ · June 12, 2017, 10:00am

aseigo:

def do_something(x, precision) do
    y = 
        case x do
             value when is_number(value) -> x
             value when is_bitstring(value) -> 
                 {float, _ } = Float.parse value
                 Float.round float, precision
         end

    # perhaps some other code
    y*2.0
end

Exactly this is something where I’d extract a helper ensure_numeric/2! Your code will fail with some argument error on the line y*2.0 because y is nil.

def ensure_numeric(x, _) when is_number(x), do: x
def ensure_numeric(x, precision) when is_bitstring(x) do
  case Float.parse(x) do
    {f, ""} -> Float.round(f, precision)
    _ -> raise WhatEverYouFeelComfortableWith # (or throw, I always confuse those)
  end
end

def do_something(x, precision), do: ensure_numeric(x, precision) * 2.0

This gives a clear stacktrace, telling you exactly what went wrong, where it went wrong and why it went wrong. Also, the definition of do_something/2 now fits on a single line and is thrice as clear and understandable as your initial version, at least for me.

I wouldn’t factor that case in my ensure_numeric out though. It’s not matching on a raw value but a processed one. Those cases are the ones that I keep in nearly all the time (and to be honest, I do even pipe into case)
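
A sketch of what piping into `case` can look like for this helper. The `Numeric` module name and the choice of `ArgumentError` are placeholders picked for the example, not something prescribed above, and `is_binary/1` is used in the guard because `Float.parse/1` expects a binary.

```elixir
defmodule Numeric do
  # Numbers pass through untouched; precision is irrelevant for them.
  def ensure_numeric(x, _precision) when is_number(x), do: x

  # Strings are parsed, and the *processed* value is matched in a piped case.
  def ensure_numeric(x, precision) when is_binary(x) do
    x
    |> Float.parse()
    |> case do
      # Only accept input that was consumed completely.
      {f, ""} -> Float.round(f, precision)
      # Placeholder error; substitute whatever exception fits the project.
      _ -> raise ArgumentError, "not a number: " <> inspect(x)
    end
  end

  def do_something(x, precision), do: ensure_numeric(x, precision) * 2.0
end
```

The `case` stays inline because it matches on the result of `Float.parse/1`, not on a raw argument, so there is nothing a function head could match on directly.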

aseigo · June 12, 2017, 10:32am

There are lots of ways this could fail (e.g. the string not containing a parseable float, which will cause a match failure on {f, ""}) … it’s just a contrived and quick example, attempting (poorly) to show how different branches can rely on / use different sets of parameters.

If that was precisely the problem the code was tackling, I would indeed write it differently. It just seemed an easier example than the real-world ones I have run into.

So while I agree an ensure_numeric is nicer (and I think I actually covered that in my original comment?), the idea I was trying (again, poorly) to present was conditionals where the branches rely heavily on different sets of variables/parameters that exist in the host function.

OvermindDL1 · June 12, 2017, 7:22pm

You should look at Bucklescript then (in-browser demo here). ^.^

On topic, do note that a function with multiple heads and a function whose top-level expression is a case compile to essentially the same Core Erlang code, so it is mostly a stylistic thing.

I agree and do the same thing.