This is exactly the difference between a naïve compiler/runtime and an optimizing compiler/runtime.
With an optimizing one, the known properties of a function (those that are part of its signature or otherwise declared in the code) are indeed used, for example to fuse it with other functions. This is what Haskell does when combining multiple list operations. This fusion system is, by the way, a set of rewrite rules curated by humans (rather than being fully automated). See the list of rewrite rules for Haskell’s Data.List.map here.
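To make that concrete, here is a minimal sketch of what such a rewrite rule looks like in GHC (the actual rules shipped with base are phrased in terms of foldr/build fusion, and the myMap function and rule name below are made up purely for illustration):

```haskell
module FusionSketch where

-- A user-defined map, so the rewrite rule below unambiguously targets it.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []
myMap f (x : xs) = f x : myMap f xs
{-# NOINLINE myMap #-} -- keep GHC from inlining it before the rule can fire

-- A human-written rewrite rule: whenever the optimizer (with -O) sees two
-- traversals back to back, it may replace them with one traversal that
-- applies the composed function.
{-# RULES
"myMap/myMap"  forall f g xs.  myMap f (myMap g xs) = myMap (f . g) xs
  #-}

-- With optimizations on, this may be rewritten into: myMap ((+ 1) . (* 2)) ns
doubleThenIncrement :: [Int] -> [Int]
doubleThenIncrement ns = myMap (+ 1) (myMap (* 2) ns)
```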
Other compilers, like those for C and C++, perform similar steps to do loop fusion.
All optimization is designed to speed up the common/average runtime case, but this inevitably means (because compilers/runtimes only have the incomplete information about a function’s properties that was specified in the code) that there will be some edge cases that run slower.
A good language-agnostic example of this would be the decisions involved in implementing Quicksort: on average, it takes O(n * log(n)) steps to sort an enumerable. And because its actual implementation is, in many languages, simpler than that of other sorting algorithms with the same time complexity, it is a common sort to be used in many environments. However, its worst-case time complexity is O(n²).
… unfortunately, when implemented in the most straightforward, deterministic way, this worst-case behaviour happens on an already-sorted enumerable.
So many implementations decide to pick a ‘random’ element as the pivot, making the implementation suddenly non-deterministic (with all the problems of non-determinism as a result). But even when deciding on another, deterministic way to pick the pivot, there will always be enumerables that trigger the O(n²) case.
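To make the pivot problem concrete, here is a deliberately naïve Haskell sketch (the function name is mine, not a library one): by always picking the first element as the pivot, an already-sorted input triggers exactly that worst case.

```haskell
-- Textbook Quicksort with a fixed, deterministic pivot: the first element.
-- Elegant and short, but on an already-sorted list every recursive call
-- splits the input into an empty part and a part of size (n - 1), so the
-- total work degrades to the O(n²) worst case.
quicksortFirstPivot :: Ord a => [a] -> [a]
quicksortFirstPivot []       = []
quicksortFirstPivot (p : xs) =
  quicksortFirstPivot [x | x <- xs, x < p]
    ++ [p]
    ++ quicksortFirstPivot [x | x <- xs, x >= p]

-- e.g. quicksortFirstPivot [1 .. 10000] walks straight into that worst case.
```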
To get back to your point:
Correct. In the example of Quicksort, we often like to think only about how it behaves in the average (‘behaves as if’) case. (The one exception being when we write real-time software.)
At least in Haskell, raising an error is viewed as a side-effect. In fact, Haskell’s Control.Exception module states that:
> In addition to exceptions thrown by IO operations, exceptions may be thrown by pure code (imprecise exceptions) or by external events (asynchronous exceptions), but may only be caught in the IO monad.
The kinds of exceptions that might be thrown in pure code are things like:
- failed pattern matches (which means that your code, while ‘pure’, was not total),
- division by zero (which means that your ‘pure’ math is not total),
- allocation exceptions (which means that, while your pure implementation would work perfectly on a computer with more memory, we have reached the boundaries of this physical one).
In each of these cases, Haskell says ‘our pure representation of the world is now breaking down, completely interrupt the normal pure dataflow and go to the nearest, effectful exception handler.’
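As a small illustration of that last sentence, using nothing beyond GHC’s standard Control.Exception API: the division by zero happens in pure code, but the only place it can be caught is back in IO.

```haskell
import Control.Exception (ArithException, evaluate, try)

-- A perfectly 'pure' function that is nevertheless not total:
-- dividing by zero throws an imprecise exception.
unsafeRatio :: Int -> Int -> Int
unsafeRatio x y = x `div` y

main :: IO ()
main = do
  -- 'evaluate' forces the pure computation inside IO, so the imprecise
  -- exception (if any) is raised here, where we are allowed to catch it.
  result <- try (evaluate (unsafeRatio 1 0)) :: IO (Either ArithException Int)
  case result of
    Left err -> putStrLn ("Pure representation broke down: " ++ show err)
    Right n  -> print n
```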
Also note that Haskell has some other kinds of exception-handling mechanisms as well, and actually has multiple escape hatches for when you require some crazy hack. People will scoff at you, and it is very good that accursedUnutterablePerformIO is well-hidden, but if we decide that the only way to solve our difficult problem in a reasonable way on the current hardware/architecture stack needs it, it is possible to reach for these tools. Doing so is very similar to using unsafe in Rust, or writing a NIF in Elixir/Erlang.
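For a feel of what the milder end of those escape hatches looks like (a generic sketch, not a recommendation): unsafePerformIO lets us smuggle IO behind a pure-looking type, and every guarantee then rests on us keeping the promise that nobody can observe the difference.

```haskell
import System.IO (hPutStrLn, stderr)
import System.IO.Unsafe (unsafePerformIO)

-- An escape hatch in action: a pure-looking function that secretly performs
-- IO (logging to stderr). The NOINLINE pragma is part of the ritual; without
-- it the optimizer may duplicate or eliminate the hidden side effect.
traceValue :: Show a => String -> a -> a
traceValue label x = unsafePerformIO $ do
  hPutStrLn stderr (label ++ ": " ++ show x)
  pure x
{-# NOINLINE traceValue #-}
```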
I think that, speaking for Erlang/Elixir, re-ordering statements in a way that affects when an error occurs (or, to be more exact: in which order errors might occur) should be forbidden (or be opt-in behind an ‘I understand that I might mess with the rest of the program’ flag). Because if you don’t forbid it, you alter the semantics of the code, which is something that compilers/interpreters are not allowed to do.
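A tiny sketch of that last point (in Haskell only to stay consistent with the snippets above; the same argument applies to an Elixir pipeline): the order of the two checks determines which error the caller observes, so a compiler that silently swapped them would have changed the program’s observable semantics.

```haskell
-- A hypothetical validation step that can fail.
checkPositive :: Int -> IO Int
checkPositive n
  | n > 0     = pure n
  | otherwise = ioError (userError ("not positive: " ++ show n))

-- As written, this program fails with "not positive: 0", because the check
-- on x runs first. Re-ordering the two statements would make it fail with
-- "not positive: -1" instead: the same work, but different semantics.
main :: IO ()
main = do
  x <- checkPositive 0
  y <- checkPositive (-1)
  print (x + y)
```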