Library API Design error returns: nil vs error tuple vs exception

PragTob · June 16, 2019, 10:48am

I’m currently extracting the statistics calculation part from benchee and stumbled upon how to present error conditions. In the concrete example statistics of an empty list [] don’t make much sense. There’s no average, no total, no nothing.

I see three possible ways to represent this error condition right now:

return nil - feels kind of right as in this specific case an empty list just doesn’t have an average, a total or what not. So it’s nothing. Feels kind of wrong because some of these also return nil values normally (there doesn’t have to be a mode). Also nil tends to leak and be a bit hard to debug.
return {:error, reason} - this seems to be the most conventional wisdom in elixir looking around. Mostly it’s paired with an {:ok, value} return value which for my concrete use case I’m not the biggest fan of. It’s easy to match against etc. and should one decide to want to introduce an exception raising API it’d be easy to implement !-bang methods that raise instead of the tuple. What I also like about this, is that it’s easy to represent in the type spec.
raise a (custom) exception - this is mostly discouraged in elixir, the main argument being that it should only be used if it would leave the application in an undefined state. Here’s [a good blog post] (https://michal.muskala.eu/2017/02/10/error-handling-in-elixir-libraries.html) by @michalmuskala going over error tuples vs. exceptions. So why am I even considering this? I find it curious that the default behavior of Enum.max/2 and Enum.min/2 is to raise an exception (my min/max functions basically just wrap these ) . However, Enum.sum/1 returns 0. So at least part of the elixir stdlib does this (to an extent, you can pass your own empty element function).

I think it’s important to consider what kind of usage each pattern promotes for the users of a library:

nil - get some error that they had some nil value somewhere where they were expecting something else having to debug it back to the library call and ultimately the bad value they provided. Not great.
{:error, reason} - pattern match on calls either in case or directly assertive through {:ok, result} = Lib.call(input) which I think is ok. However, if users didn’t read the docs (bad!) tracing back where that tuple originated instead of a real value came from might still be a bit of a bother.
exception - never ever pass me that value i.e. the value should be caught by users beforehand. If they don’t the stack trace points them right where they need to look and fix the input (hopefully with a very good error message). I kind of like this for my specific case as handing an empty list to a statistics function seems like something that mostly shouldn’t be done and easy to catch on the user side. However, it’s not an “application critical state”, definitely recoverable and might be considered a valid input.

Right now I use value return but an {:error, reason} error return type - which I realize is sub optimal as it’s hard to match without an {:ok, value}. I kind of like just raising exceptions for my specific case but it feels wrong to do so. After having written all this it feels like implementing both (with !-bang methods) seems like the safest choice without too much extra work.

What’s your take? What do you think about the different options in general? What do you think about the different options in the concrete statistics calculation use case?

sashman · June 16, 2019, 11:23am

As a library user my vote would be a tuple return in general.

sasajuric · June 16, 2019, 11:45am

In this case I’d raise an exception for two reasons:

according to your description, an empty list is a non-domain input (e.g. because avg([]) is undefined)
library users can easily handle empty list anywhere up in the call stack

In addition, I’d communicate this through typespec, by using [...] or [element_type, ...].

Qqwy · June 16, 2019, 12:01pm

How common is it that people pass in an empty list?

If it is common (and for instance the parameter is expected to often be based on dynamic user input, rather than programmer-written code), I’d use an error-tuple (or maybe a nil, see below), since it is a common error.
If it is very uncommon, (i.e. only occurs in ‘exceptional’ situations, for instance when a programmer mistypes input parameters), I would use an exception, which means that the average user is not burdened with having to handle it.

Using nil only makes sense if there is one and only one(!) way of your function to return a ‘useless value’ result, and if you do not consider passing inputs that produce this useless value an error. This might actually be true for passing in [] to your statistics function. As soon as there are multiple ways to get a ‘useless value’ (for instance if you have a container that might contain nils) then don’t use them, since their meaning becomes ambiguous.

So: Exceptions for programmer mistake, errors for user mistake or context problems that keep you from continuing, nil if there is exactly one, clear, way to get a ‘useless value’.

value | {:error, reason} is definitely a bad idea. Any of the three listed approaches (nil, ok/error, raising exceptions) is better, because they are consistent in what they return.

michalmuskala · June 16, 2019, 12:03pm

I agree with @sasajuric - passing in an empty list to a function that doesn’t make sense on empty lists is most probably a programmer error and should just fail. I’d say that the case here is almost the definition of the ArgumentError exception.

LostKobrakai · June 16, 2019, 2:49pm

Besides what @sasajuric and @michalmuskala concluded I can see an argument being made for nil as return value, as it’s similar to how e.g. sql does handle the situation of empty list as input to aggregation functions.

xlphs · June 16, 2019, 6:10pm

Doing statistics on empty list usually involves dividing by 0, so raising ArithmeticError exception makes the most sense. You can always provide another version that doesn’t raise exception but instead returns NaN or nil.

dimitarvp · June 16, 2019, 11:57pm

It doesn’t make sense to return anything in your case. I’d just do raise(ArgumentError, "cannot calculate average of an empty list").

PragTob · June 17, 2019, 9:01am

Thank you everyone for your input! Before the post my hunch was to go with an exception but it somehow also felt wrong. I’ll do that now and raise an ArgumentError as suggested by many of you!

Thanks for being such a helpful bunch!

@xlphs I’d rather avoid that the users of my lib see errors involved with what the library does but rather what they did “wrong” so I think ArgumentError is better

@Qqwy It’s hard to say how often they’ll do it as a library can be used in all sorts of ways imo I think they should never do it/if it can be empty catch it before ideally though. I like your thoughts on the ways to generate “useless values” - thank you!

@sasajuric yup thank you, type spec will definitely be [number, ...], I remember the dialyzer on benchee complaining when I did that though but that just means I need to tune something

dogweather · December 11, 2023, 8:30pm

I’m coming to this very late!

I’m also interested in returning a nil vs. a tuple in library design.

But to your question, I have another idea: Design the function so that it can only be called with the proper input—a nonempty list:

def stats([x | xs] = values)

If this is a frequent occurrence in the API, you could create a type for NonemptyList using e.g. Domo, and catch the error at compile time.

PragTob · December 12, 2023, 2:03pm

@dogweather designing it in a way that it can’t even be called with an empty list results in an exception as well (no function clause matching) which leaves the library user to scratch their head even more. From that perspective I think the ArgumentError is superior.

As for Domo, the way I understand it the only compile time checks it does are against the default values of a struct, which makes sense as a full compile time check against non empty list needs a rather well functioning type system which we don’t have (yet).

dogweather · December 12, 2023, 6:10pm

True. I can see that it wouldn’t fail until it does.

I think a NonEmptyList type would be useful: then it’s both explicitly documented and fails every time if called incorrectly.

PragTob · December 12, 2023, 7:12pm

The library for which I raised this sample for has a similar type

  @typedoc """
  The samples to compute statistics from.

  Importantly this list is not empty/includes at least one sample otherwise an `ArgumentError` will be raised.
  """
  @type samples :: [sample, ...]