Feedback for how `CompareChain` handles `nil`

nil is annoying for CompareChain and I would like it to be better. I’m looking for some feedback on approaches!

The problem

compare?/2 works quite well for reducing boilerplate when all the arguments are non-nil. But when nil is a possible value it leads to headaches.

At CargoSense, we currently have 4 separate comparison modules to deal with this problem:

  • DateTimeNilInfinity - nil > datetime and nil == nil
  • DateTimeNilNegInfinity - nil < datetime and nil == nil
  • DateTimeNilInfinityRaise - nil > datetime and nil raises if compared to nil
  • DateTimeNilNegInfinityRaise - nil < datetime and nil raises if compared to nil

It’s… ok. It deals with the problem in a succinct-ish way:

# Before
old && new && DateTime.compare(old, new) != :eq || old != new
# After
compare?(old != new, DateTimeNilInfinity)

The problem is that no one wants to read DateTimeNilNegInfinityRaise, let alone think though all the casework to understand what the code means.

I have considered a few options over the years. @benwilson512 recently advocated for another one but I remain undecided.

Options

1. SQL-like optional argument

We make CompareChain aware of nils and handle them specially.

compare?(old != new, DateTime, nils: :infinity)

nils can be one of: :infinity, :neg_infinity, :infinity_raise, :neg_infinity_raise. This is reminiscent of how SQL allows you to specify NULLS FIRST | LAST.

Downsides:

  • All nils would be necessarily be treated the same in a single expression.
  • All desired ways of handling nil would need to be represented as one of the options. (I.e. the 4 I listed may not be exhaustive.)

2. inf() and -inf()

(Courtesy of @benwilson512)

We provide expressions which CompareChain understands how to handle natively:

compare?(old || inf() != new || inf(), DateTime)
# or possibly
compare?(inf(old) != inf(new), DateTime)

Here we have -inf() < datetime < inf() (or whatever the struct is), inf() == inf(), and -inf() == -inf(). Then nil could be handled in the “normal” way of raiseing if passed to a function like DateTime.compare/2. And if you didn’t want that, you provide inline fallbacks.

Downsides:

  • Still a little verbose (and possibly hard to understand if we go with inf(old) != inf(new))

Other considerations

  • Current approach of defining bespoke modules automatically interopts with Enum.sort/2 and friends. The new approaches would not. (Though one could still define a bespoke module if one wanted.)

  • Certain 3-valued logics are simply impossible to encode using compare/2 #=> :lt | :eq | :gt. E.g. select 1 = null; in SQL returns null, meaning none of :lt | :eq | :gt are appropriate. We’ve sort of been handling this by raiseing internally. But just note that the design space for CompareChain is fundamentally limited in this regard since the intent is to interop with the existing compare/2 paradigm.


Feedback

Please LMK your thoughts! Given that CompareChain is so conceptually simple, I’m hesitant to add special casing. But nil really is annoying, so I’d love to hear any opinions.

Also please advocate for another approach you like if you think of one!

To further advocate for this a bit I think it’s worth looking at how it handles more “normal” CompareChain scenarios since this != one is actually less common, and using the normal DateTime.compare? might actually be fine in this situation eg if new && old && DateTime.compare?(new, old) != :eq do

Consider:

if compare?(data.sync_cursor >= shadow.sync_cursor, DateTimeNilNegInfinity) do
  # stuff
end

What we are trying to deal with is that shadow.sync_cursor might be nil if the shadow is freshly initialized. In these cases we want nil to evaluate as less than any possible value of new_data.sync_cursor which is coming in. So in my proposal you’d have:

if compare?(data.sync_cursor >= shadow.sync_cursor || -inf(), DateTime) do
  # stuff
end

or

if compare?(data.sync_cursor >= -inf(shadow.sync_cursor), DateTime) do
  # stuff
end

to indicate that negative infinity is being used as a fallback for the value of shadow.sync_cursor.

This really improves the look I think of chained comparisons too:

compare?(si.activated_at <= geo_elem.latest_recorded_at <= si.deactivated_at || inf(), DateTime)

Because in this particular case we are being very explicit about which of the possible values we want to be OK with being nil.

2 Likes

@benwilson512 Yeah great callout. An upside of an inline approach is that individual nils can be special cased. I didn’t do a good job highlighting that in the OP.

I’m a fan of the inf() approach for its flexibility.

Nit (naming): I think I’d use infinity(), personally. I think the keystrokes are worth the clarity, and I’ve seen :infinity used for case matching before. (:hot_pepper: : inf feels like an unwarranted homage to Python’s math.inf.)

3 Likes