CompareChain - Semantic, chained comparisons for Elixir

CompareChain

Announcing CompareChain - a small library to aid with comparisons.

Examples

iex> import CompareChain

# Chained comparisons
iex> compare?(1 < 2 < 3)
true

# Semantic comparisons
iex> compare?(~D[2017-03-31] < ~D[2017-04-01], Date)
true

# Semantic comparisons + logical operators
iex> compare?(~T[16:00:00] <= ~T[16:00:00] and not (~T[17:00:00] <= ~T[17:00:00]), Time)
false

# More complex expressions
iex> compare?(%{a: ~T[16:00:00]}.a <= ~T[17:00:00], Time)
true

Sales pitch

Working with comparison operators in Elixir can lead to a fair bit of boilerplate. This is because the normal infix comparison operators like < do structural comparison:

iex> ~D[2017-03-31] < ~D[2017-04-01]
false

Trying that also produces a warning: invalid comparison with struct literal ~D[2017-03-31]. Comparison operators (>, <, >=, <=, min, and max) perform structural and not semantic comparison...
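
To see why the structural result is false, it helps to know that structs are maps, and maps of equal size are ordered by comparing keys in ascending order and then values in that key order. For a Date that means day is compared before month and year. A minimal sketch using only the standard library (the variable names are just for illustration):

```elixir
a = ~D[2017-03-31]
b = ~D[2017-04-01]

# Structs are maps underneath, so `<` compares them field by field in key
# order: calendar, day, month, year. The calendars match, so `day` decides.
structural = Map.from_struct(a) < Map.from_struct(b)

# The field that decides the structural result: 31 < 1 is false.
day_decides = a.day < b.day

# Semantic comparison goes through Date.compare/2 instead.
semantic = Date.compare(a, b) == :lt

IO.inspect({structural, day_decides, semantic})
```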

To do semantic comparison, you need to use the proper module’s compare/2 function:

iex> Date.compare(~D[2017-03-31], ~D[2017-04-01]) == :lt
true

This ends up reading like reverse Polish notation (RPN), where :lt acts somewhat like a postfix operator. The issue is compounded when you need to perform more complicated logic:

iex> Date.compare(~D[2017-03-31], ~D[2017-04-01]) == :lt and Date.compare(~D[2017-04-01], ~D[2017-04-02]) == :lt
true

You end up with a verbose mix of infix and pseudo-postfix operators.

Additionally, Elixir does not support chained comparisons like 1 < 2 < 3:

iex> 1 < 2 < 3
false

When you try that, you get a warning: Elixir does not support nested comparisons...
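
The false comes from how the expression is parsed: comparison operators are left-associative, so 1 < 2 < 3 is evaluated as (1 < 2) < 3, which compares true with 3. In term ordering, numbers sort before atoms, and true is an atom. Spelled out step by step (variable names are illustrative):

```elixir
# Step 1: the left comparison is evaluated first and yields a boolean.
inner = 1 < 2

# Step 2: the boolean is then compared with 3. Numbers sort before atoms
# in term ordering, so a boolean is never less than a number.
outer = inner < 3

IO.inspect({inner, outer})
```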

Enter CompareChain

CompareChain provides some helper macros that allow you to

  • chain infix operators
  • perform semantic comparison with infix operators
  • combine (chained) comparisons with and, or, and not

After calling import CompareChain, you get macros compare?/{1,2}. With compare?/1 you can do operations like:

iex> compare?(1 < 2 < 3)
true
iex> compare?(1 < 2 > 3)
false

With compare?/2 you can do comparisons like:

iex> compare?(~D[2017-03-31] < ~D[2017-04-01], Date)
true

The idea is that you provide a module with a suitable compare/2 function as the second argument, just as you would with Enum.sort/2. The macro then rewrites your expression using the module you provide.
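
For reference, this is the Enum.sort/2 convention being borrowed, along with a hand-written version of what a chained comparison has to mean once compare/2 is involved. The expansion below is only a sketch of the semantics, not the code the macro actually generates:

```elixir
dates = [~D[2022-11-06], ~D[2022-11-04], ~D[2022-11-05]]

# Enum.sort/2 accepts any module that exports compare/2.
sorted = Enum.sort(dates, Date)
[yesterday, today, tomorrow] = sorted

# What compare?(yesterday < today < tomorrow, Date) has to mean:
# every adjacent pair is checked with Date.compare/2 and joined with `and`.
chained =
  Date.compare(yesterday, today) == :lt and
    Date.compare(today, tomorrow) == :lt

IO.inspect(sorted)
IO.inspect(chained)
```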

You can write complicated expressions if you wish:

iex> yesterday = ~D[2022-11-04]
iex> today     = ~D[2022-11-05]
iex> tomorrow  = ~D[2022-11-06]
iex> compare?(yesterday < today < tomorrow and not (today >= tomorrow), Date)
true
iex> compare?(%{a: ~T[16:00:00]}.a <= ~T[17:00:00], Time)
true

You can also do fancier things by defining a custom module:

defmodule DateTimeWithInfinity do
  def compare(:infinity, _), do: :gt
  def compare(_, :infinity), do: :lt
  def compare(:neg_infinity, _), do: :lt
  def compare(_, :neg_infinity), do: :gt
  
  def compare(%DateTime{} = dt1, %DateTime{} = dt2) do
    DateTime.compare(dt1, dt2)
  end
end

This module supports :infinity as a value that is always greater than every datetime, and :neg_infinity as one that is always less than every datetime. This is super useful for defining ranges that are open on one side:

range1 = %{starts_at: ~U[2022-01-01T00:00:00Z], ends_at: ~U[2022-02-01T00:00:00Z]}
range2 = %{starts_at: ~U[2022-01-10T00:00:00Z], ends_at: :infinity}
compare?(
  range2.starts_at <= range1.starts_at <= range2.ends_at or
  range2.starts_at <= range1.ends_at <= range2.ends_at,
  DateTimeWithInfinity)

#=> true
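
For comparison, here is roughly the same overlap check written without the macro, calling compare/2 directly. The module is repeated so the snippet runs on its own; RangeCheck and its functions are hypothetical helpers, not part of CompareChain. Note that with the clause order above, compare(:infinity, :infinity) returns :gt rather than :eq, which does not matter for this check but is worth knowing:

```elixir
defmodule DateTimeWithInfinity do
  def compare(:infinity, _), do: :gt
  def compare(_, :infinity), do: :lt
  def compare(:neg_infinity, _), do: :lt
  def compare(_, :neg_infinity), do: :gt

  def compare(%DateTime{} = dt1, %DateTime{} = dt2) do
    DateTime.compare(dt1, dt2)
  end
end

defmodule RangeCheck do
  # Hypothetical helper: a <= b according to the given module's compare/2.
  def lte?(a, b, mod), do: mod.compare(a, b) != :gt

  # True when starts_at <= value <= ends_at under the given module.
  def within?(value, %{starts_at: s, ends_at: e}, mod) do
    lte?(s, value, mod) and lte?(value, e, mod)
  end
end

range1 = %{starts_at: ~U[2022-01-01T00:00:00Z], ends_at: ~U[2022-02-01T00:00:00Z]}
range2 = %{starts_at: ~U[2022-01-10T00:00:00Z], ends_at: :infinity}

# Same logic as the compare?/2 call: either endpoint of range1 lies in range2.
overlap =
  RangeCheck.within?(range1.starts_at, range2, DateTimeWithInfinity) or
    RangeCheck.within?(range1.ends_at, range2, DateTimeWithInfinity)

IO.inspect(overlap)
```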

Future work

If you try it out and like it and/or find any problems, let me know! Issues and PRs are welcome.

Acknowledgements

Shoutout to @benwilson512 and @mcrumm for the helpful discussions and guidance! 🙂

And thank you to all the folks who participated in the elixir-lang-core discussion. In particular, thanks to Cliff (sorry I don’t know your handle) whose idea I shamelessly built off of: https://groups.google.com/g/elixir-lang-core/c/W2TeQm5r1H4/m/ctVuN_woBgAJ


Woot, glad to see this got released! We do a ton of datetime range comparisons at CargoSense since we’re frequently having to compare whether the span of new data we’ve received is relevant to other spans of data, and to date it’s involved quite a lot of boilerplate helper functions. “Fence post” problems abound, since we want to try to make sure that every data point is always associated with a single window, not two if it’s on a boundary.

Writing helpers for this stuff has involved a ton of little boilerplate functions. Looking forward to the refactor PR now that this is out!


Is the and not here just an example, or does it add any value?
iex> compare?(yesterday < today < tomorrow and not (today >= tomorrow),

Today can’t be both strictly smaller and greater or equal, can it?

Today can’t be both strictly smaller and greater or equal, can it?

Correct. That’s just an example of a legal expression.

does it add any value?

Only in showing that compare?/2 isn’t returning nonsense 🙂

not basically lets you pivot between logically equivalent renderings of the same thing. Negating a comparison flips it:

a > b == not(a <= b)

and De Morgan’s law extends this to compound propositions like:

(a > b or a <= c) == not (a <= b and a > c)

Depending on what your function is doing, it might be easier to think of the logic in terms of or and having a choice between two things, or it might be easier to think of it in terms of and where several things all have to be true.

By supporting not you can turn an and into an or by pulling out a not to the front and vice versa.
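
A quick way to convince yourself of the compound equivalence is to check it exhaustively over a small range of integers. This is just a brute-force sanity check, not a proof:

```elixir
values = -2..2

# Evaluate both sides of the equivalence for every combination of a, b, c.
results =
  for a <- values, b <- values, c <- values do
    (a > b or a <= c) == (not (a <= b and a > c))
  end

# True only if the two renderings agreed for every combination.
all_equal = Enum.all?(results)

IO.inspect({length(results), all_equal})
```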


We do a ton of datetime range comparisons at CargoSense

You got that right!

Some additional context for this release: Like Ben said, at CargoSense we do stuff like this all. the. time. I have become an unwilling expert in the art of comparing datetime ranges.

Even for simple things, datetime comparisons can be cumbersome. For example, suppose you want to compare a datetime dt to some range {left, right}. What do you do?

The obvious answer is to check if dt is between left and right. But which “between” do you mean? There are actually 4 cases to cover:

  • left <= dt <= right
  • left <= dt < right
  • left < dt <= right
  • left < dt < right

We’ve had occasion to need all 4 in one circumstance or another. And given the difficulties in reading the native datetime comparisons, we found ourselves writing a bunch of defps all over the place. So we’re unreasonably excited by the prospect of in-lining a bunch of functions with names like between_inclusive?.
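
As a sketch of the kind of helper this replaces, here is one function covering all four cases with plain DateTime.compare/2. The module name, function name, and the bound atoms (:closed, :closed_open, :open_closed, :open) are made up for illustration:

```elixir
defmodule Between do
  # Hypothetical helper: is `dt` between `left` and `right`, where `kind`
  # says which of the two bounds are inclusive?
  def between?(dt, left, right, kind) do
    lower = DateTime.compare(left, dt)
    upper = DateTime.compare(dt, right)

    case kind do
      # left <= dt <= right
      :closed -> lower != :gt and upper != :gt
      # left <= dt < right
      :closed_open -> lower != :gt and upper == :lt
      # left < dt <= right
      :open_closed -> lower == :lt and upper != :gt
      # left < dt < right
      :open -> lower == :lt and upper == :lt
    end
  end
end

left = ~U[2022-01-01 00:00:00Z]
right = ~U[2022-02-01 00:00:00Z]

# A point sitting exactly on the left boundary: the four cases disagree.
on_left_edge =
  for kind <- [:closed, :closed_open, :open_closed, :open] do
    {kind, Between.between?(left, left, right, kind)}
  end

IO.inspect(on_left_edge)
```

The boundary case is where the "fence post" problems mentioned earlier show up: only the choice of bound decides whether the point belongs to the window.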

It gets even more fun when you compare ranges to ranges since you may have overlapping ranges (e.g. Less Than, Not Overlapping vs. Less Than, Overlapping). For my project these range types originate in the database so I put similar functionality to what you’ve done here in the library I use close to where the database custom types are defined.

Nice to see this kind of problem being looked at in a public library.


I know about De Morgan’s law. I just didn’t understand why the example shows duplicated logic and was wondering if it was a mistake.


It gets even more fun when you compare ranges to ranges since you may have overlapping ranges (e.g. Less Than, Not Overlapping vs. Less Than, Overlapping).

Exactly!

Speaking of “fun”, that DateTimeWithInfinity example is (almost) real. We often deal with ranges of time which have “started but not yet stopped”. E.g. a plane that took off at 11am but hasn’t landed yet. Even though the plane’s flight time doesn’t have a definite end, it should still overlap with any other range that starts after 11am.

The moral of the story is that you end up doing a bunch of bespoke logic all over the place because it never quite seems to generalize nicely. (Or at least, we couldn’t get it to.)

For my project these range types originate in the database so I put similar functionality to what you’ve done here in the library I use close to where the database custom types are defined.

I’m curious what you mean by that. Like, custom handling of %Postgrex.Range{}?

For the Range types, pretty much this. I’ve basically wrapped %Postgrex.Range{} into type specific versions (e.g. DateRange, DecimalRange, etc.) which can be used with Ecto. Then I’ve defined a Range protocol and a more general database type protocol that defines functions for comparison handling; I need two protocols since some comparisons aren’t range related and the range related types have things like the overlapping conditions to consider.

The Range protocol functions deal with things like comparing upper bounds and lower bounds. The more general comparison function is modelled on existing Elixir functions such as DateTime.compare/2, except that it has an expanded set of return values due to the complexity of range handling.

In deciding what the range type comparisons would return, the range types in the application are largely just a reflection of the PostgreSQL range type implementations. When looking at the various operators that PostgreSQL implements to compare ranges, I figured I could create an extended set of return values from my DbTypes.compare/2 function that more or less mirrored the PostgreSQL comparison functions. So, whereas DateTime.compare/2 can return :eq, :lt, or :gt, my DbTypes.compare/2 can return (from my docs):

  • :gt - left is greater than right.
  • :lt - left is less than right.
  • :eq - the values are equal.
  • :lcr - left contains right.
  • :rcl - right contains left.
  • :gto - greater than overlapping.
  • :lto - less than overlapping.

Naturally, which of those can be returned in practice depends on what is being compared. So, for example, comparing a simple DateTime value to a DateTimeRange value cannot result in the overlapping return values though the “contains” values are possible. Anyway, I figure this keeps me in sync with what I can expect to do with the database directly using the PostgreSQL operators and keeps me reasonably aligned with how more complex comparisons are implemented elsewhere in the Elixir ecosystem.
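
To make those return values concrete, here is a minimal sketch of a comparison over closed integer ranges written as {lower, upper} tuples. This is not the poster's DbTypes code, which isn't shown in the thread; the module name and the exact meaning given to each atom are assumptions based on the descriptions in the list:

```elixir
defmodule RangeCompareSketch do
  # Hypothetical: compares two closed ranges {lower, upper} and returns one
  # of :eq, :lt, :gt, :lcr, :rcl, :lto, :gto.
  def compare({l1, u1}, {l2, u2}) do
    cond do
      # Identical bounds.
      l1 == l2 and u1 == u2 -> :eq
      # Left ends before right starts: no overlap.
      u1 < l2 -> :lt
      # Left starts after right ends: no overlap.
      l1 > u2 -> :gt
      # Left contains right.
      l1 <= l2 and u1 >= u2 -> :lcr
      # Right contains left.
      l2 <= l1 and u2 >= u1 -> :rcl
      # Partial overlap, left starts first.
      l1 < l2 -> :lto
      # Partial overlap, right starts first.
      true -> :gto
    end
  end
end

examples = [
  {{1, 3}, {5, 7}},
  {{1, 5}, {3, 7}},
  {{1, 9}, {3, 7}},
  {{3, 7}, {1, 9}},
  {{5, 9}, {3, 7}}
]

outcomes = Enum.map(examples, fn {a, b} -> RangeCompareSketch.compare(a, b) end)
IO.inspect(outcomes)
```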


CompareChain Release: v0.3.0 (2023-01-28)

Hi (upwards of) tens of users! I’ve just released a new version of CompareChain.

There are two main changes:

  • You can now use == and != as well

    compare?(~T[00:00:00] == ~T[11:11:11], Time) #=> false
    compare?(~T[00:00:00] != ~T[11:11:11], Time) #=> true
    

    (No idea why I didn’t do this in the first place…)

  • You can now use Elixir >= 1.13.0 instead of being restricted to 1.14.
    I could probably go much lower if I stopped using Macro.prewalker/1.
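
For anyone unfamiliar with it, Macro.prewalker/1 returns a lazy enumerable over the nodes of a quoted expression in depth-first, pre-order. The snippet below only shows what the function does with a chained comparison; how CompareChain uses it internally is not covered here:

```elixir
# Quoting keeps the expression as AST instead of evaluating it.
ast = quote do: 1 < 2 < 3

# `<` is left-associative, so the AST is nested like (1 < 2) < 3.
nodes = ast |> Macro.prewalker() |> Enum.to_list()

# Pick out the comparison nodes from everything that was visited.
comparisons = Enum.filter(nodes, &match?({:<, _, [_, _]}, &1))

IO.inspect(length(comparisons))
```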

As always, issues and PRs are welcome! 🙂


CompareChain Release: v0.4.0 (2023-09-10)

Hi, there’s been another (tiny) release of CompareChain.

This is the only change:

  • Using a struct with compare?/1 results in a warning

    Running:

    iex> compare?(~D[2023-01-02] < ~D[2023-01-02]) #=> false
    

    Now yields:

    [warning] Performing structural comparison on matching structs.
    
    Did you mean to use `compare?/2`?
    
      compare?(~D[2022-01-02] ??? ~D[2022-02-01], Date)
    

We decided that’s preferable to silently giving you the wrong answer and making you chase down a weird bug. Not speaking from experience or anything…

As always, issues and PRs are welcome! 🙂
