I’ve always wondered this and while reading the docs again I was curious about why this is left to the user. The docs say:
This is done as a security measure to avoid attacks that attempt to traverse entries with nil columns.
To check that value is nil, use is_nil/1 instead:
I’m curious about 2 things.
- What is the attack it refers to, and how does
is_nil/1
thwart the attack in a way that ==
doesn’t?
- Since nil comparisons raise an error if done with
==
, why is the implementation of checking for nil, and using is_nil
or ==
based on the results left up to the user rather than handled internally in the lib? Couldn’t the Ecto.Query.Builder.not_nil!/1 check be amended to instead resolve the final query to either ==
or not_nil/1
based on the runtime value?
Genuinely curious, not a complaint.
Imagine you write this query:
from User, where: [api_token: ^params["token"]], limit: 1
Now if someone passes no token, you will accidentally login as any of the users without a token.
3 Likes
Ahhh, so is_nil/1
doesn’t really need to handle things differently, it’s more of a guardrail so users have to explicitly opt in if they want nil, rather than accidentally exposing these security holes? It’s still a rather verbose way to handle it. Something like where: p.column == nilable(^var)
would lead to cleaner code, rather than having a case statement with a nil case and a non-nil case.
That doesn’t work at the SQL level though as most people expect, you can’t compare with NULL, it is always false.
The thing is that NULL
is a little bit unfortunate name in the DB. NULL
in the DB mean that we do not know value of this field, not that it is “empty”. One unknown value is not equal to another unknown value, similar to NaN != NaN
in IEEE754. We cannot tell that 2 NULL
s in the DB are equal, because we do not know what they value should be.
Imagine that you have table like:
Name |
Occupation |
Alice |
Programmer |
Bob |
∅ |
Charlie |
Unemployed |
Dan |
∅ |
(∅
here mean null value)
Can we tell that Bob and Dan have the same occupation? No, because we do not know what their occupation is, it is different from being unemployed.
Your approach would make it confusing in situations like:
v = nil
from a in A,
where: a.foo == ^v
And while we could secure user a little bit against the above, the one below is not possible without explicit user intent:
from a in A,
join: b in B, on: a.foo == b.foo
While there is no safeguard in Ecto against the latter, the whole nature of the fact that you need to use is_nil/1
in favour of a.foo == nil
is something that make people think a little about that case.
6 Likes