Why is it bad to check for values you don't want (eg nil values)?

I was reading an earlier post: Pipe obsession anti-pattern which recommended an article entitled Good and Bad Elixir.

I’m confused by one of the recommendations in the Good and Bad Elixir article. It says: “State what you want, not what you don’t.”

EXAMPLE GIVEN:

# Don't do...
def call_service(%{req: req}) when not is_nil(req) do
  # ...
end

# Do...
def call_service(%{req: req}) when is_binary(req) do
  # ...
end

I use guards quite a bit to ensure that the function is not being passed a nil value in error. I gracefully manage the error, sometimes sending a message to the user to try again. I don’t understand why that would be bad. The article says, “You’d prefer to raise or crash if you receive arguments that would violate your expectations.”

Could someone explain why that is the case?

1 Like

If you are expecting a string, then there is no advantage to specifying not nil instead of binary. If the function actually expects any value except nil then the first variation is not incorrect.

6 Likes

Because this is considered Defensive Programming (I didn’t fully vet that article but it has a good TL;DR). Unless nil is a meaningful value in your design, you should rewrite your code so that the function would never receive nil anyway (Parse, Don’t Validate) and if through some circumstance it did, it would be considered an “exceptional” circumstance in which case we just fail (throw an exception). You could also look at it like: Why stop at defending against nil? If you’re going to defend against one type you should theoretically be defending against every known type in your system.

6 Likes

While it is common in most languages to have a null value that leads to an exception when accidentally dereferenced, in Elixir, nil is just another atom. Checking for something not being nil would be similar to checking something not being :foo for instance. There’s just no need to.

Checking for the actual type you expect it to be is clearer to people reading your code. It also probably saves a few lines. And in general it’s often better to let it crash and recover, then trying to account for all the possible failures I believe.

2 Likes

Hmmm. I did a quick read of Parse Don’t Validate. It makes sense to try to prevent the false data from being passed in the first place. And I get what you’re all saying about it being difficult to check for every single corner case. But there is really no harm done in checking for nil or other possible failure states … right? It’s not like it impacts performance.

nil is just an atom, yes, but it is still special in that many things return it implicitly so we do end up with stray nils.

One example is that I worked on a system that took in JSON that was scraped from the internet. It was passed raw through a very long code path that at different points looked for different keys. Of course sometimes these keys would be missing so we’d get nil errors. This resulted in a bunch of “Guard against nil” commits. What we should have done is transform the unkown data into a known shape before it’s passed into the main program to ensure that all keys we expect are there.

This just becomes death by a thousand cuts. It’s the very antithesis of Erlang’s Let it Crash philosophy. Coding against “possible future states” is a general anti-pattern in any language! It leads to a bunch of unused code that can pile up quickly and cause great confusion to people reading it later on. I know I’ve pulled my hair out before wondering why what some code was doing only to figure out hours later the answer was “nothing.”

2 Likes

Relatively simply: If I do call_service(%{req: %UnexpectedThing{}}), %UnexpectedThing{} isn’t nil, but it also probably isn’t what you want. So if what you actually want is a binary, then ask for it. If literally everything is valid except nil, then do the nil check.

EDIT: And to be clear the emphasis is on avoid not checks, it isn’t so much on nil.

6 Likes

Ahhh. Thanks for sharing that example @sodapopcan . That makes the Parse Don’t Validate approach make more sense. Better to look at the data being passed in (and fortify it) rather than setting up a bunch of “not” checks.

When I started off with Elixir, I added “nil” guards because my app kept crashing on nil maps with the error of: You can’t call nil.field_name. I guess I should have been using an is_map() guard which would have been more explicit about what was expected for the code to work.

Lesson Learned on why you don’t want to excessive not-checks:

  • design so that the data coming in is clean and doesn’t need excessive guards
  • too many “not” checks creates confusing code for others to support
  • use guards to explicitly identify the type needed for function to work

THANK YOU to everyone who replied. I better understand this now.

4 Likes

I feel like it needs to be emphasized that this is really not primarily an issue with guards or even “excessive” negations/not, it’s just a logic error. If your function accepts the category “binaries” and you define that category as “not nil” this is simply incorrect. You could use all negative guards and solve the logic error, and then you might have a style issue that makes your code “confusing” etc. Conversely you can make category errors like this without using not at all (if say you guard on is_list but actually only keyword lists are expected).

8 Likes

Just to point it out, this is exactly what Ecto.Changeset.cast does! At least it can be used that way if you Ecto.Changeset.apply_changes. Other libs like Drops are more generalized.

2 Likes

You already had excellent answers and did a good summary yourself.

I’ll add one more thing: make full use of pattern-matching to your own advantage. Example with the code from your OP:

def call_service(%{req: req}) when is_binary(req) do
  # ...
end

def call_service(other) do
  # Simplified, we did it much more fine-grained with all the `is_*` guards.
  YourMonitoringService.warning(
    "Unexpected value for AwesomeModule.call_service/1: #{inspect(other)}"
  )
end

Note the second variant of the same function. In Elixir you can have catch-all clauses for fallback actions. In this case: report an unexpected value. This is not an error condition but it’s undesirable and the devs want to be informed so they can monitor closely who and when is sending those invalid values.

I had such requirements in a few projects, one of them was a legacy API provider where the company announced an API upgrade shortly after I joined but many users of the API still didn’t understand or read the docs and upgrade guides and kept calling the new API with the old parameters.

Thanks to the catch-all / fallback mechanism Elixir gives us we managed to track down the customers who used the API wrongly and helped them off-board their previous code and write a new one.

Just one production example for you. Obviously you can do that with any programming language but Elixir made it very easy to achieve.

7 Likes

This reminds of me of other confusion I’ve seen on this site around “let it crash” where people have taken that to mean “never handle errors.” It certainly doesn’t mean that and this is a good example! If you keep getting the same error over and over again then of course deal with it!! Just don’t “guess” what you might get as is maybe sometimes considered good practice in other languages(?). But if you get an error once and it never happens again, it’s probably pointless to code around it. This is keeps codebases manageable and is a huge boon for readability.

Sorry, I know you didn’t ask about let it crash @annad but I just brushed on it and I’m admittedly half-responding to a past convo :sweat_smile:

2 Likes

Oh yeah, I’ve met a good amount of people who misunderstand this. The whole point of “let it crash” when you deploy stuff to production is more like “no need for defensive coding because your app will not crash, you’ll just get a report in your AppSignal / Sentry monitoring system”.

That’s the value of “let it crash” in production really. Less code, quicker iteration. Needless to say, every Elixir team I was in operated like this: get a scary error in your APM system, somebody makes a quick PR to deal with it, merge, deploy, all good. Rinse and repeat until only the occasional intermittent failures stemming from DB timeouts (that happen twice a month) remain. And for those I pushed to increase various Ecto.Repo settings like :queue_target et. al. until even those errors disappeared. Big deal that one out of 10_000 people will have to reload their page once, we can’t deal with quantum uncertainty. :person_shrugging:

2 Likes

This further makes me think about mystery errors that happen just a few times per year that just aren’t worth fixing :money_with_wings: Sometimes these can take down the whole system and a restart gets us back on track or we need to pour resources into investigating and fixing them. BEAM gives us a lot of leeway here :slight_smile: And I would love to see the pristinely empty error log of anyone who scoffs at this notion :upside_down_face:

1 Like

Thank you for these additions! On a good day, I feel like I’m at the intermediate level with Elixir but I am soooooo far behind all of you! Today I feel like a beginner. Ha ha ha.

@tfwright … I’m trying to make sure I understand your comment. Except for function components, there is no type enforcement. But if a function is expecting a binary, you’re saying that logically the guard should enforce that type (not check for every wrong type). So in a way … the guard becomes a way of enforcing types. Obviously, it doesn’t catch it at compile time but enforces at run time.

Thank you @sodapopcan and @dimitarvp for the discussion on “let it crash” because that has confused me for a long time. My thought is that it will be incredibly frustrating for the user if I let it crash all the time. But I see now that it’s more like … “Don’t expect to find every bug. Run it, let it crash, capture source and then fix it later.” That makes a lot more sense.

@dimitarvp Thank you for that piece of code! I had not thought of using pattern matching in that way.

1 Like

Yes, exactly. And it’s very important to complement this with catch-all clauses if unexpected inputs are bound to come in. Both go hand in hand, as I have shown in the code snippet above.

Yes, that’s like 90% of it. The other 10% are more about: “if your DB falls down there’s exactly nothing that your application code can do about it – so yes let it crash”. Even if that means the app itself falls down eventually (which happens after too many failures – it’s the default OTP config anyway, and that can be overriden).

Similar as above, pattern matching + guards and catch-all clauses are a brother and a sister. Catch-all clauses become important when you have to deal with user input or 3rd party data. Even if you just discard the data and do not act on it in any way you still might not want to have a runtime exception.

2 Likes

Could you elaborate a bit more on how you use Ecto.Changeset.cast and apply_changes to validate function parameters? I found this example from the hexdocs on Ecto.Changeset.cast:

data = %{title: "hello"}
types = %{title: :string}
changeset = cast({data, types}, %{title: "world"}, [:title])
apply_changes(changeset)
%{title: "world"}

Would you use that approach to validate the parameters before using them in a function like this?

def some_function(maybe_a_string) do
data= %{title: ""}
types = %{title: :string}
changeset = cast({data, types}, %{title: maybe_a_string}, [:title])
params = apply_changes(changeset)

# now do something with validated params ...
end

But I’m confused about apply_changes because the hexdocs say “This operation will return the underlying data with changes regardless if the changeset is valid or not.” So couldn’t it potentially return a bad changeset?

Ha, ya I was gonna say “fix the errors… if you even can/want to” but you beat me to it! This stuff was designed with many physical points of failure in mind that you can’t always deal with. There’s a funny Joe Armstrong thing where, er, I forget who he was talking to but they were talking about the beautiful type system in their language and Joe responded, “What if a missile were to destroy the machine your code is running on? Can the type system help with that?” This is not a prompt to start a static vs dynamic discussion, it’s just a funny quote in regards to Erlang’s design goals :smiley:

D’oh, I now realize this wasn’t the clearest parallel to draw. Changesets are designed to validate and cast data that and end-user is in control of so if there are any errors we would pass it back to them to fix. In the scenario I’m talking about there is no end-user so using Changesets is maybe a little convoluted but not totally useless—it’s a nice way to cast a map of raw data into a struct! But none of the validation stuff would apply since it’s up to our code to ensure it’s valid.

1 Like

My point was that the main problem with the example is that it does not correctly express the exact expectation, regardless of whether the goal is to control logic flow (which I believe is the primary intended use case for guards), function as a type checking mechanism, documentation, etc.

The way I would formulate a “rule” around guards would not be “avoid excessive negation,” since in my opinion that doesn’t express the core concern, but more like the following: “if you add guards to a function definition, they should be logically identical to the expected types.” Otherwise, whether too loose or too narrow, negatively expressed or positively, the system won’t behave as expected. This is what I meant by “logic error,” but maybe that was too strong…

2 Likes

Can’t speak for the broader community but I use guards for type enforcement and keep pattern-matching for the shape of the data inside. It’s obviously not an universal rule because this:

def do_stuff(User{id: id, username: username}), do: # ...

is just more convenient and intuitive compared to this:

def do_stuff(user) when is_struct(user, User), do: # ...

(and then you also have to fetch field values here), or even this:

def do_stuff(%module{id: id, username: username}) when module == User

(yes, I’ve seen such code)…
…but overall I’ve been mostly operating under this “rule” of mine.

2 Likes