I’ve always wondered this and while reading the docs again I was curious about why this is left to the user. The docs say:
This is done as a security measure to avoid attacks that attempt to traverse entries with nil columns.
To check that value is nil, use is_nil/1 instead:
I’m curious about 2 things.
- What is the attack it refers to, and how does
is_nil/1 thwart the attack in a way that
- Since nil comparisons raise an error if done with
==, why is the implementation of checking for nil, and using
== based on the results left up to the user rather than handled internally in the lib? Couldn’t the Ecto.Query.Builder.not_nil!/1 check be amended to instead resolve the final query to either
not_nil/1 based on the runtime value?
Genuinely curious, not a complaint.
Imagine you write this query:
from User, where: [api_token: ^params["token"]], limit: 1
Now if someone passes no token, you will accidentally login as any of the users without a token.
is_nil/1 doesn’t really need to handle things differently, it’s more of a guardrail so users have to explicitly opt in if they want nil, rather than accidentally exposing these security holes? It’s still a rather verbose way to handle it. Something like
where: p.column == nilable(^var) would lead to cleaner code, rather than having a case statement with a nil case and a non-nil case.
That doesn’t work at the SQL level though as most people expect, you can’t compare with NULL, it is always false.
The thing is that
NULL is a little bit unfortunate name in the DB.
NULL in the DB mean that we do not know value of this field, not that it is “empty”. One unknown value is not equal to another unknown value, similar to
NaN != NaN in IEEE754. We cannot tell that 2
NULLs in the DB are equal, because we do not know what they value should be.
Imagine that you have table like:
∅ here mean null value)
Can we tell that Bob and Dan have the same occupation? No, because we do not know what their occupation is, it is different from being unemployed.
Your approach would make it confusing in situations like:
v = nil
from a in A,
where: a.foo == ^v
And while we could secure user a little bit against the above, the one below is not possible without explicit user intent:
from a in A,
join: b in B, on: a.foo == b.foo
While there is no safeguard in Ecto against the latter, the whole nature of the fact that you need to use
is_nil/1 in favour of
a.foo == nil is something that make people think a little about that case.