Decimal bug - please be careful with negative coefficients - fix is in a pull request

Hi everybody,

I found a bug, or structural problem, in Decimal that could have a big impact:
wrong calculations (maybe with money!!!) or even an infinite loop ;-(

My suggestion for a fix is in a pull request:

Does anybody have another idea? There are several possible ways to handle validation for structs - which would you suggest as the best one?
Any suggestions are welcome! :wink:

Here is one example of a solution, for division:

defmodule Decimal do
  @type coefficient :: non_neg_integer | :NaN | :inf
  @type exponent :: integer
  @type sign :: 1 | -1

  @type t :: %__MODULE__{
          sign: sign,
          coef: coefficient,
          exp: exponent
        }
  defstruct sign: 1, coef: 0, exp: 0

...

  # reject negative integer coefficients, which violate the documented invariant
  def div(%Decimal{coef: coef1}, %Decimal{})
      when is_integer(coef1) and coef1 < 0,
      do: error(:invalid_operation, "dividend coefficient (#{coef1}) must be >= 0", %Decimal{coef: :NaN})

  def div(%Decimal{}, %Decimal{coef: coef2})
      when is_integer(coef2) and coef2 < 0,
      do: error(:invalid_operation, "divisor coefficient (#{coef2}) must be >= 0", %Decimal{coef: :NaN})

The original pull request text, if you don't want to follow the link:

Hi,

I found a bug, or structural problem, in Decimal that could have a big impact under special circumstances. Here is an example from the Money lib, where Decimal is used as the main backbone:


iex(15)> m = Money.new(:USD, Decimal.new(%Decimal{sign: -1, coef: -12000000000}))

#Money<:USD, --12000000000>

iex(16)> Money.mult(m, 3)

{:ok, #Money<:USD, --36000000000>}

iex(17)> Money.mult!(m, 3) |> Money.to_string

{:ok, "$--*,000,000,000.00"}

Of course it's a special edge case ... normally the best practice is:


iex(18)> m = Money.new(:USD, "-12000000000")

#Money<:USD, -12000000000>

but you never know ... maybe someone is using it ... it's possible! It's valid code! NO compiler warning! NO runtime error!

And the worst case: if you use any division with this kind of Decimal number, like this:


iex(19)> m = Money.new(:USD, Decimal.new(%Decimal{sign: -1, coef: -12000000000}))

#Money<:USD, --12000000000>

iex(20)> Money.mult!(m, 3) |> Money.to_decimal |> Decimal.to_float

-> you end up in an infinite loop!!!

The reason for this is clear: the %Decimal{} struct has no input validation, so you can put in a negative coefficient.

The type spec and documentation are perfect - but they do not prevent misuse:

Type spec: @type coefficient :: non_neg_integer | :NaN | :inf

Documentation: The coefficient of the power of 10. Non-negative because the sign is stored separately in sign.

My suggested solution can be found in this pull request: it mostly adds checks for this special edge case.

And this time I included the corresponding tests. :wink:

Kind regards, and I hope this helps anybody use Decimal in a safer way!

Ralph


I think the intended use here, and for most structs, is not to construct them yourself with raw internals but to use the library as the authority on validating/constructing... in this case, Decimal.new handles this by setting coef to an absolute value: https://github.com/ericmj/decimal/blob/master/lib/decimal.ex#L1086
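
Roughly, the idea is something like this (a simplified sketch of that normalization, not the actual source behind the link):

# the constructor takes ownership of the sign and stores only a
# non-negative coefficient
def new(%Decimal{sign: sign, coef: coef, exp: exp}) when is_integer(coef) and coef < 0,
  do: %Decimal{sign: -sign, coef: -coef, exp: exp}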

I'll let @ericmj or other maintainers decide whether they're also interested in this additional safeguard - and I've been burned by constructing a struct incorrectly like this myself before - but it's worth knowing as a general rule in Elixir.


The coefficient is clearly typed as a non-negative integer. Even though I'd not necessarily expect the library to crash on malformed data, I'd at least not expect it to behave correctly.

So any PR "fixing" this should do so by crashing because of function clause errors or badarg at least.


Yes - the fix does exactly that. Anyway, I was interested in your opinions and ideas. Thanks.

Maybe the Norm lib could help with the validation:
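
For example, something along these lines (just a hedged sketch assuming Norm's spec/schema/conform! API; the :NaN / :inf coefficient cases are left out for brevity):

import Norm

# the struct's invariants expressed as a Norm schema
decimal_schema =
  schema(%Decimal{
    sign: spec(is_integer() and &(&1 in [1, -1])),
    coef: spec(is_integer() and &(&1 >= 0)),
    exp: spec(is_integer())
  })

# raises a Norm mismatch error because of the negative coefficient
conform!(%Decimal{sign: -1, coef: -12_000_000_000, exp: 0}, decimal_schema)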

Nitpick: that specific code

defmodule Derp do
  @spec bad_decimal() :: Decimal.t()
  def bad_decimal() do
    Decimal.new(%Decimal{sign: -1, coef: -12000000000})
  end
end

will be detected by Dialyzer:

lib/derp.ex:3:no_return
Function bad_decimal/0 has no local return.
________________________________________________________________________________
lib/derp.ex:4:call
The function call will not succeed.

Decimal.new(%Decimal{:coef => -12_000_000_000, :exp => 0, :sign => -1})

breaks the contract
(decimal()) :: t()

because Dialyzer can be 100% certain that -12_000_000_000 is not a non_neg_integer().

However, using a variable gets it to pass:

defmodule Derp do
  @spec bad_decimal(pos_integer()) :: Decimal.t()
  def bad_decimal(n) do
    Decimal.new(%Decimal{sign: -1, coef: -n})
  end
end

BUT I'm not sure how far library code should go to defend its invariants against intentional manipulation.

For instance, it's possible to produce a MapSet with strange behavior at runtime:

iex(28)> x = MapSet.new([:a, :b])                        
#MapSet<[:a, :b]>

# NOTE: writing this will trigger a Dialyzer error because `MapSet` is opaque, but works at runtime
iex(29)> y = %MapSet{map: %{c: "wat"}}
#MapSet<[:c]>
iex(30)> z = MapSet.union(x, y)       
#MapSet<[:a, :b, :c]>
iex(31)> z2 = MapSet.new([:a, :b, :c])
#MapSet<[:a, :b, :c]>
iex(32)> MapSet.equal?(z, z2)         
false

but that code will always cause a Dialyzer error because of the @opaque setting on MapSet.

There are some gotchas with @opaque, notably in pattern matching and module attributes, which may explain why it isn't used in Decimal.
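
For reference, the opaque variant would look roughly like this (a sketch of the general pattern, not a proposed change to Decimal):

defmodule OpaqueDecimal do
  defstruct sign: 1, coef: 0, exp: 0

  # with @opaque, Dialyzer flags any code outside this module that builds or
  # pattern matches on the struct's internals, instead of only catching
  # literal contract violations
  @opaque t :: %__MODULE__{sign: 1 | -1, coef: non_neg_integer | :NaN | :inf, exp: integer}
end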


Yes, exactly. Elixir is dynamically typed. If you abuse that to create malformed types, they won't work properly. This is true of literally every struct.


Thanks all for commenting.

@benwilson512: Of course! I know. But does that mean we should not fix error edge cases? I mean NO. And I think you agree with me that, if we can, we should make it easier to use and safer for everybody.

As NobbZ wrote: So any PR "fixing" this should do so by crashing because of function clause errors or badarg at least.

And that's exactly what it does. So, back to the focus:

I also think the best behavior for the existing Decimal lib, and all the libs depending on it, is: let it crash on wrong inputs:

  # wrong sign
  assert_raise Decimal.Error, fn ->
    Decimal.new(%Decimal{sign: 2, coef: 1, exp: 2})
  end

  # wrong coef
  assert_raise Decimal.Error, fn ->
    Decimal.new(%Decimal{sign: 1, coef: -1, exp: 2})
  end

iex(1)> Decimal.new(%Decimal{sign: 1, coef: 1, exp: 2})
#Decimal<1E+2>
iex(2)> Decimal.new(%Decimal{sign: 2, coef: 1, exp: 2})
** (Decimal.Error) : wrong decimal number: (%Inspect.Error{message: "got ArgumentError with message \"argument error\" while inspecting %{__struct__: Decimal, coef: 1, exp: 2, sign: 2}"}) 
    (decimal 2.0.0-rc.0) lib/decimal.ex:1151: Decimal.new/1
iex(3)> Decimal.new(%Decimal{sign: 1, coef: -1, exp: 2})
** (Decimal.Error) : wrong decimal number: (%Inspect.Error{message: "got ArgumentError with message \"argument error\" while inspecting %{__struct__: Decimal, coef: -1, exp: 2, sign: 1}"})
    (decimal 2.0.0-rc.0) lib/decimal.ex:1151: Decimal.new/1

All other wrong cases are handled by function clauses.
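
For illustration, the kind of guard clause that produces those crashes looks roughly like this (a sketch of the approach only, not the exact PR code; it raises a plain ArgumentError here to keep the sketch self-contained):

# accept only structurally valid structs ...
def new(%Decimal{sign: sign, coef: coef, exp: exp} = decimal)
    when sign in [1, -1] and
           ((is_integer(coef) and coef >= 0) or coef in [:NaN, :inf]) and
           is_integer(exp),
    do: decimal

# ... and crash loudly on anything else instead of propagating the bad value
def new(%Decimal{sign: sign, coef: coef}),
  do: raise(ArgumentError, "invalid Decimal: sign #{inspect(sign)}, coef #{inspect(coef)}")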

So I think this is better than the current behavior, which propagates the error with all the consequences shown above:

iex(1)> Decimal.new(%Decimal{sign: 1, coef: 1, exp: 2})  
#Decimal<1E+2>
iex(2)> Decimal.new(%Decimal{sign: 2, coef: 1, exp: 2})
#Decimal<1E+2>
iex(3)> Decimal.new(%Decimal{sign: 1, coef: -1, exp: 2})
#Decimal<-.1E+3>

I think in a dynamically typed language it is possible for a library to detect many misuses and edge cases at runtime, but there is a runtime cost to doing that. Documenting proper usage and data types makes it possible to lower that runtime cost (assuming the documentation and type specifications are being followed).


Yes, something like this...

I think we should do as much as possible without breaking the target. And it's possible in this case.

Thanks for your thoughts!

Yes, you are right! There is a cost to it. I will try to measure it ...
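
For context, the numbers below come from a Benchee run roughly along these lines (a sketch - the exact benchmark script and inputs are assumptions on my part):

Benchee.run(
  %{
    # add an integer to a Decimal of varying size
    "Decimal.add()    Int" => fn decimal -> Decimal.add(decimal, 42) end
  },
  inputs: %{
    "6 Digits Integer" => Decimal.new(123_456),
    "30 Digits Integer" => Decimal.new(String.to_integer(String.duplicate("9", 30))),
    "606 Digits Integer" => Decimal.new(String.to_integer(String.duplicate("9", 606)))
  },
  memory_time: 2
)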

before the fix:

##### With input 30 Digits Integer #####
Name                           ips        average  deviation         median         99th %
Decimal.add()    Int      318.01 K        3.14 μs   ±771.75%           3 μs           5 μs

Memory usage statistics:

Name                    Memory usage
Decimal.add()    Int         4.26 KB

**All measurements for memory usage were the same**

##### With input 6 Digits Integer #####
Name                           ips        average  deviation         median         99th %
Decimal.add()    Int      903.01 K        1.11 μs  ±3457.31%           1 μs           2 μs

Memory usage statistics:

Name                    Memory usage
Decimal.add()    Int           592 B

**All measurements for memory usage were the same**

##### With input 606 Digits Integer #####
Name                           ips        average  deviation         median         99th %
Decimal.add()    Int       43.65 K       22.91 μs    ±73.77%          22 μs          44 μs

Memory usage statistics:

Name                    Memory usage
Decimal.add()    Int         3.73 KB

with the fix:

##### With input 30 Digits Integer #####
Name                           ips        average  deviation         median         99th %
Decimal.add()    Int      316.97 K        3.15 μs   ±659.73%           3 μs           5 μs

Memory usage statistics:

Name                    Memory usage
Decimal.add()    Int         4.26 KB

**All measurements for memory usage were the same**

##### With input 6 Digits Integer #####
Name                           ips        average  deviation         median         99th %
Decimal.add()    Int      930.26 K        1.07 μs  ±3092.95%           1 μs           2 μs

Memory usage statistics:

Name                    Memory usage
Decimal.add()    Int           592 B

**All measurements for memory usage were the same**

##### With input 606 Digits Integer #####
Name                           ips        average  deviation         median         99th %
Decimal.add()    Int       43.93 K       22.77 μs    ±29.74%          22 μs          39 μs

Memory usage statistics:

Name                    Memory usage
Decimal.add()    Int         3.73 KB

As you can see, in this case the cost is negligible. :wink:


I think the largest cost here isn't in performance, or even in the code required for this specific PR, but in what this implies about coding strategy. The principle you're articulating is that every public function on a public API should validate the invariants of the data structure. I don't think this is a sound principle. It happens to be reasonably cheap in this case, but that is by no means guaranteed. It also puts a massive burden on library authors.

If you aren't trying to articulate a general principle, then the question becomes: why enforce these checks here and not elsewhere?

The right pattern is to enforce invariants at the point of documented constructors.


Yes, thank you very much. Theoretically perfectly formulated!

But that's the point.

Theoretically we could use Decimal.new as this kind of single entry point and close all the others - but in practice, with the current architecture of the Decimal lib, that is not possible: it is used a lot in other projects (Money, Ecto, ...). Of course, radically changing everything might be possible ... I wrote a small side project (for fun), the lib Apa - Arbitrary Precision Arithmetic - with a different architecture (tuple vs. struct, integers and strings vs. lists) ... but that's not the point here ... I want to help make the original Decimal lib safer! :wink:

By the way ... it already does this a lot:

  def add(%Decimal{coef: :inf}, %Decimal{coef: :inf}),
    do: error(:invalid_operation, "adding +Infinity and -Infinity", %Decimal{coef: :NaN})

:wink:

But really I appreciate this discussion - thanks a lot!

Can you clarify why this isn't possible? The Money and Ecto apps should always either 1) use the safe constructors or 2) take complete responsibility for using the structs incorrectly.

Those aren't protections against intentionally broken data structures; that's traditional app logic. You can generate infinite numbers via the safe public API, and in that case the add function handles them. You're proposing a more extensive model, where the safeguards need to handle not just the domain of valid decimals but the domain of all nonsense decimals too.


Again - theoretically you are absolutely right - we could discuss whether 0 or NaN or Inf are nonsense too ... but no hairsplitting ... :wink:

I like your rigorous thinking. And if we both find a way to a new solution - wonderful!

What would you suggest?

If I were to be radical - as you suggested - I would remove the struct

%Decimal{sign: sign, coef: coef, exp: exp}

and use tuples of arbitrary integers, {coefficient, exponent}:
{integer(), integer()}
{-12, -3} == -0.012

no limits, no restrictions - every number expressible as a coefficient times a power of ten is constructible

and an add function for example like this:
@spec add({integer(), integer()}, {integer(), integer()}) ::{integer(), integer()}
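
A minimal sketch of what such an add could look like (purely illustrative - the module name is made up): align the exponents, then add the coefficients:

defmodule TupleDecimal do
  @spec add({integer(), integer()}, {integer(), integer()}) :: {integer(), integer()}
  def add({coef1, exp1}, {coef2, exp2}) when exp1 >= exp2 do
    # scale the coefficient with the larger exponent down to the smaller exponent
    {coef1 * pow10(exp1 - exp2) + coef2, exp2}
  end

  # swap the arguments so the first clause always sees the larger exponent first
  def add(a, b), do: add(b, a)

  defp pow10(0), do: 1
  defp pow10(n) when n > 0, do: 10 * pow10(n - 1)
end

# TupleDecimal.add({-12, -3}, {5, -1}) => {488, -3}, i.e. -0.012 + 0.5 == 0.488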

composable, pipeable (of course), and with nice performance ... :wink:

with well-defined entry points ...

... but it sounds like a dream ... I would break a lot of old stuff :wink:

How does this make any difference? An end user can construct hostile tuples in the same way that they can construct hostile or nonsense decimals.

But they aren't nonsense; they're well-defined values, constructible via the public constructors - you just can't apply every operation to them.
