I’m happy to announce the plans for the next major version of Decimal: v2.0.0.
The primary reason for the next major release is a need for a breaking change. Elixir v1.10 ships with improvements to the sort-based APIs in Enum: they can now take a module that implements a compare/2 function returning :lt | :eq | :gt. However, there already exists a Decimal.compare/2 function with different return values, and so the function will be adjusted in Decimal v2.0.0.
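The compare/2 contract mentioned above can be illustrated with a minimal module. IntCompare below is a made-up name standing in for what Decimal's comparator will look like in v2; any module shaped like this can be passed to the Elixir v1.10 sort-based APIs:

```elixir
defmodule IntCompare do
  # Minimal example of the compare/2 contract used by Elixir v1.10's
  # sort-based APIs: return :lt, :eq, or :gt.
  def compare(a, b) when a < b, do: :lt
  def compare(a, b) when a > b, do: :gt
  def compare(_, _), do: :eq
end

Enum.sort([3, 1, 2], IntCompare)
#=> [1, 2, 3]
```

Decimal v2's compare/2 is expected to follow this same shape, which is why the existing function with different return values needs a breaking change.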
Creating a new major release is also an opportunity to further clean up the API. For example, Decimal.parse/1 will be changed to behave like the Integer.parse/1 and Float.parse/1 counterparts. Other functions have been slightly changed and/or renamed to have a more idiomatic Elixir interface. Finally, the next major release will drop deprecated functionality and will require more recent Elixir versions.
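As a sketch of the Integer.parse/1 / Float.parse/1 shape that Decimal.parse/1 is described as adopting (the Decimal line below is an assumption based on the announcement, not a confirmed signature):

```elixir
# Integer.parse/1 and Float.parse/1 return {value, rest} or :error:
{34, "abc"} = Integer.parse("34abc")
{1.5, "kg"} = Float.parse("1.5kg")
:error = Integer.parse("abc")

# Decimal.parse/1 in v2 is expected to follow the same shape,
# e.g. (hypothetical):
#   Decimal.parse("1.5kg") #=> {decimal, "kg"}
</imports>
```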
Before shipping v2.0.0, we plan to have v1.9.0, which will be a backwards-compatible release that works on Elixir v1.0 and has deprecation warnings that should ease the transition.
Today we’re also shipping release candidates: v1.9.0-rc.0 and v2.0.0-rc.0. See the changelogs for more information:
Have you thought about going to a Record-based backend instead of a Struct? I’ve written a Decimal clone (for the functions I needed for a very limited purpose) and it benchmarked over 5 times faster than Decimal (~5.68x on average for my use case). It was identical code to Decimal (copy/pasted) other than using records instead of structs.
We haven’t discussed this. That might be too big of a leap, though; for example, we’d no longer have protocols. But your results are definitely interesting. Can you share the library and benchmark code?
Before Elixir v1.0, we had protocols for Records and there were many issues. For example, is {:name, "hello"} a record or not? Checking all tuples for potentially being a record was too expensive and it had too many false positives, meaning we would call the implementation code for something that would not be a record and then it would fail. That’s one of the reasons why we introduced structs in the first place.
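To make that ambiguity concrete, here is a minimal sketch (the record name is made up): a defrecord value and a hand-written tuple are the very same term at runtime, so a protocol dispatching on tuples cannot tell them apart.

```elixir
defmodule Demo do
  require Record
  # A record is just a tagged tuple: {:name, value}.
  Record.defrecord(:name, value: nil)

  def make, do: name(value: "hello")
end

# The record and a plain tuple are indistinguishable:
Demo.make() == {:name, "hello"}
#=> true
```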
I am honestly skeptical this would be the case. Records are not 5x faster than structs even on regular operations, so once that gets diluted by the work Decimal itself does, I really don’t expect a 5x improvement.
Hi,
Sorry if I’m going off-topic, but what’s the purpose of this library? I saw the post on the forum homepage and was curious.
Is it just for decimal number formatting?
Or is it related to computation precision? If so, why? Is Elixir lacking a type for precise calculation (e.g. the problems you get using Float in Ruby for currency operations)?
Yes… And so, I just noticed that I completely missed the fact that the Elixir guides don’t even mention working with decimals for precision computation…
In the meantime I just discovered (while searching the forum for how people deal with currencies) ex_money, which does indeed seem to depend on the decimal library… So I bet that ex_money doesn’t simply treat values as integers in cents…
Anyway I’m interested to learn more on the subject… If you have any resource (like blogposts etc.) I’ll be happy to read them…
ex_money definitely uses Decimal for representing money amounts, both in Elixir and in the database (I’m the author).
José has always been clear in his intent to keep the Elixir language small and to consider libraries first-class citizens in the ecosystem. Decimal is definitely the ‘go-to’ package for this purpose.
My point is that records are not 5 times faster than structs even when you are only performing records and structs operations, so when you increase the baseline by adding decimal operations as well, it is even more unlikely to get such a number.
Also, comparing decimals is as bad as comparing dates
I don’t think structs vs records can do anything here, nor can Elixir/Erlang help with this.
P.S. I wonder if it’s possible to make a standard macro take advantage of that standard compare/2 contract, e.g.
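For reference, Elixir v1.10’s sort-based APIs already accept any module whose compare/2 returns :lt | :eq | :gt, which addresses the "comparing dates" problem mentioned above (the date values here are made up):

```elixir
dates = [~D[2020-03-02], ~D[2019-06-06], ~D[2020-03-01]]

# Structural sorting compares the underlying struct fields
# (day before month before year), giving a wrong chronological order:
Enum.sort(dates)
#=> [~D[2020-03-01], ~D[2020-03-02], ~D[2019-06-06]]

# Passing a module whose compare/2 returns :lt | :eq | :gt fixes it:
Enum.sort(dates, Date)
#=> [~D[2019-06-06], ~D[2020-03-01], ~D[2020-03-02]]
```

A Decimal v2 with the adjusted compare/2 would slot into the same `Enum.sort(list, Decimal)` pattern.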
Extending protocols to support records is as trivial as testing whether the first element of the tuple is a module that implements the protocol.
It’s a feature I’ve wanted and talked about for many years. ^.^
It was for a benchmark comparing the Coerce and Numbers protocols reimplemented in ProtocolEx, implemented for integers, floats, Decimal, and MyDecimal (copied from Decimal but using records instead of structs; I was curious how it would do, and just enough of it was implemented for the test). The code is available in ProtocolEx’s test directory, but the results, run right now on my local computer, are:
➜ mix test --include bench:true
Including tags: [bench: "true"]
....Operating System: Linux
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.9.1
Erlang 22.2.1
Benchmark suite executing with the following configuration:
warmup: 3 s
time: 3 s
memory time: 0 ns
parallel: 1
inputs: Decimal, Floats, Integers, MyDecimal
Estimated total run time: 2 min
Benchmarking MyNumbers with input Decimal...
Benchmarking MyNumbers with input Floats...
Benchmarking MyNumbers with input Integers...
Benchmarking MyNumbers with input MyDecimal...
Benchmarking MyNumbers - sans coerce with input Decimal...
Benchmarking MyNumbers - sans coerce with input Floats...
Benchmarking MyNumbers - sans coerce with input Integers...
Benchmarking MyNumbers - sans coerce with input MyDecimal...
Benchmarking Numbers with input Decimal...
Benchmarking Numbers with input Floats...
Benchmarking Numbers with input Integers...
Benchmarking Numbers with input MyDecimal...
Benchmarking Numbers - my coerce with input Decimal...
Benchmarking Numbers - my coerce with input Floats...
Benchmarking Numbers - my coerce with input Integers...
Benchmarking Numbers - my coerce with input MyDecimal...
Benchmarking Numbers - sans coerce with input Decimal...
Benchmarking Numbers - sans coerce with input Floats...
Benchmarking Numbers - sans coerce with input Integers...
Benchmarking Numbers - sans coerce with input MyDecimal...
##### With input Integers #####
Name ips average deviation median 99th %
MyNumbers - sans coerce 9.08 M 110.15 ns ±16673.74% 73 ns 279 ns
MyNumbers 4.68 M 213.63 ns ±7430.41% 151 ns 465 ns
Numbers - sans coerce 3.94 M 253.98 ns ±9300.43% 180 ns 474 ns
Numbers 2.87 M 348.04 ns ±5574.50% 247 ns 656 ns
Numbers - my coerce 2.77 M 361.31 ns ±6628.20% 243 ns 661 ns
Comparison:
MyNumbers - sans coerce 9.08 M
MyNumbers 4.68 M - 1.94x slower
Numbers - sans coerce 3.94 M - 2.31x slower
Numbers 2.87 M - 3.16x slower
Numbers - my coerce 2.77 M - 3.28x slower
##### With input Floats #####
Name ips average deviation median 99th %
MyNumbers - sans coerce 8.95 M 111.72 ns ±1351.18% 83 ns 313 ns
MyNumbers 4.60 M 217.38 ns ±759.57% 181 ns 518 ns
Numbers - sans coerce 2.12 M 471.92 ns ±11747.10% 217 ns 586 ns
Numbers - my coerce 1.72 M 582.45 ns ±9204.68% 296 ns 761 ns
Numbers 1.71 M 586.38 ns ±9112.67% 287 ns 794 ns
Comparison:
MyNumbers - sans coerce 8.95 M
MyNumbers 4.60 M - 1.95x slower
Numbers - sans coerce 2.12 M - 4.22x slower
Numbers - my coerce 1.72 M - 5.21x slower
Numbers 1.71 M - 5.25x slower
##### With input MyDecimal #####
Name ips average deviation median 99th %
MyNumbers - sans coerce 1.53 M 653.91 ns ±3127.51% 491 ns 1230 ns
MyNumbers 1.30 M 768.43 ns ±5945.41% 553 ns 1386 ns
Numbers - sans coerce 1.22 M 821.94 ns ±2839.92% 663 ns 2528 ns
Numbers 1.13 M 884.97 ns ±4559.10% 676 ns 1660 ns
Numbers - my coerce 1.09 M 914.85 ns ±4804.80% 678 ns 2588 ns
Comparison:
MyNumbers - sans coerce 1.53 M
MyNumbers 1.30 M - 1.18x slower
Numbers - sans coerce 1.22 M - 1.26x slower
Numbers 1.13 M - 1.35x slower
Numbers - my coerce 1.09 M - 1.40x slower
##### With input Decimal #####
Name ips average deviation median 99th %
MyNumbers - sans coerce 330.74 K 3.02 μs ±912.21% 2.56 μs 6.25 μs
MyNumbers 327.97 K 3.05 μs ±1079.04% 2.68 μs 6.06 μs
Numbers - my coerce 316.77 K 3.16 μs ±1057.15% 2.78 μs 6.36 μs
Numbers 314.40 K 3.18 μs ±785.11% 2.81 μs 6.38 μs
Numbers - sans coerce 305.82 K 3.27 μs ±999.41% 2.70 μs 6.86 μs
Comparison:
MyNumbers - sans coerce 330.74 K
MyNumbers 327.97 K - 1.01x slower
Numbers - my coerce 316.77 K - 1.04x slower
Numbers 314.40 K - 1.05x slower
Numbers - sans coerce 305.82 K - 1.08x slower
.
Finished in 157.5 seconds
1 property, 4 tests, 0 failures
Randomized with seed 501954
I entirely admit I may have screwed something up in the implementation (see GitHub: clone ProtocolEx and run mix test --include bench:true), but MyDecimal’s speed over Decimal was uniformly much higher in both the Numbers/Coerce protocol implementations and the ProtocolEx ones. The actual code for the MyDecimal module is in lib/protocol_ex.ex (only when the environment is test; I really need to move it out, it was just some testing code from before I got around to pulling out all the tests, but it’s at the bottom of the file). The difference was that I implemented it in the protocol itself so it didn’t have to call another module, but saving a single module call doesn’t seem like a 5x difference.
Not unless consolidation built the record tests like the struct tests, where it tests explicit names first before falling back on the tuple itself.
Yes! Never use floats for currency ever ever ever! Decimal and a wrapper like ex_money is what should be used!
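The classic float drift is easy to demonstrate; this is the reason floats are unsafe for money (the Decimal lines below are hedged, since Decimal is a hex package and its exact inspect output may differ):

```elixir
# Binary floats cannot represent 0.1 or 0.2 exactly, so sums drift:
0.1 + 0.2
#=> 0.30000000000000004

0.1 + 0.2 == 0.3
#=> false

# With the Decimal package, arithmetic on decimal strings stays exact,
# e.g. (illustrative usage):
#   Decimal.add(Decimal.new("0.1"), Decimal.new("0.2"))
#   # a Decimal equal to 0.3
```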
I just needed to fulfill the multiplication and addition APIs, so those functions (and the helpers Decimal uses, like pow10) are all I copied in. It could just be that this code is extra susceptible to such changes.
Yeah, those are my thoughts too; let’s write a bench comparing just base structs and records… Ooo, I just noticed I already have one in my playground, let’s run it.
Getting data out:
Benchmarking Classifier: get
============================
Operating System: Linux
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.9.1
Erlang 22.2.1
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 1.20 min
Benchmarking Record1-direct...
Benchmarking Record1-remote...
Benchmarking Record1-stock...
Benchmarking Record9-first-direct...
Benchmarking Record9-first-remote...
Benchmarking Record9-first-stock...
Benchmarking Record9-last-direct...
Benchmarking Record9-last-remote...
Benchmarking Record9-last-stock...
Benchmarking Struct1...
Benchmarking Struct9-first...
Benchmarking Struct9-last...
Name ips average deviation median 99th %
Record9-last-direct 26.21 M 0.0382 μs ±6.99% 0.0370 μs 0.0460 μs
Record1-direct 25.82 M 0.0387 μs ±622.14% 0.0400 μs 0.0900 μs
Record9-first-direct 25.74 M 0.0388 μs ±7.88% 0.0370 μs 0.0480 μs
Record9-last-stock 25.64 M 0.0390 μs ±7.90% 0.0370 μs 0.0480 μs
Record9-first-stock 25.56 M 0.0391 μs ±8.28% 0.0380 μs 0.0500 μs
Record1-stock 25.45 M 0.0393 μs ±8.32% 0.0370 μs 0.0490 μs
Struct1 22.00 M 0.0455 μs ±647.96% 0.0400 μs 0.100 μs
Struct9-first 21.04 M 0.0475 μs ±622.37% 0.0400 μs 0.110 μs
Struct9-last 19.80 M 0.0505 μs ±11.05% 0.0480 μs 0.0650 μs
Record1-remote 11.50 M 0.0870 μs ±9.04% 0.0820 μs 0.116 μs
Record9-last-remote 11.50 M 0.0870 μs ±5.79% 0.0840 μs 0.100 μs
Record9-first-remote 11.26 M 0.0888 μs ±7.10% 0.0860 μs 0.104 μs
Comparison:
Record9-last-direct 26.21 M
Record1-direct 25.82 M - 1.01x slower
Record9-first-direct 25.74 M - 1.02x slower
Record9-last-stock 25.64 M - 1.02x slower
Record9-first-stock 25.56 M - 1.03x slower
Record1-stock 25.45 M - 1.03x slower
Struct1 22.00 M - 1.19x slower
Struct9-first 21.04 M - 1.25x slower
Struct9-last 19.80 M - 1.32x slower
Record1-remote 11.50 M - 2.28x slower
Record9-last-remote 11.50 M - 2.28x slower
Record9-first-remote 11.26 M - 2.33x slower
Memory usage statistics:
Name Memory usage
Record9-last-direct 24 B
Record1-direct 24 B - 1.00x memory usage
Record9-first-direct 24 B - 1.00x memory usage
Record9-last-stock 24 B - 1.00x memory usage
Record9-first-stock 24 B - 1.00x memory usage
Record1-stock 24 B - 1.00x memory usage
Struct1 24 B - 1.00x memory usage
Struct9-first 24 B - 1.00x memory usage
Struct9-last 24 B - 1.00x memory usage
Record1-remote 24 B - 1.00x memory usage
Record9-last-remote 24 B - 1.00x memory usage
Record9-first-remote 24 B - 1.00x memory usage
**All measurements for memory usage were the same**
Putting data in:
Benchmarking Classifier: put
============================
Operating System: Linux
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.9.1
Erlang 22.2.1
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 1.50 min
Benchmarking Record1-direct...
Benchmarking Record1-remote...
Benchmarking Record1-stock...
Benchmarking Record9-first-direct...
Benchmarking Record9-first-remote...
Benchmarking Record9-first-stock...
Benchmarking Record9-last-direct...
Benchmarking Record9-last-remote...
Benchmarking Record9-last-stock...
Benchmarking Struct1...
Benchmarking Struct1-opt...
Benchmarking Struct9-first...
Benchmarking Struct9-first-opt...
Benchmarking Struct9-last...
Benchmarking Struct9-last-opt...
Name ips average deviation median 99th %
Record1-direct 18.88 M 0.0530 μs ±729.48% 0.0500 μs 0.110 μs
Record1-stock 17.61 M 0.0568 μs ±795.50% 0.0500 μs 0.110 μs
Struct1 16.85 M 0.0594 μs ±540.43% 0.0500 μs 0.120 μs
Record9-last-direct 15.99 M 0.0625 μs ±724.38% 0.0500 μs 0.140 μs
Record9-first-direct 15.93 M 0.0628 μs ±573.43% 0.0500 μs 0.130 μs
Record9-first-stock 15.90 M 0.0629 μs ±721.68% 0.0500 μs 0.130 μs
Record9-last-stock 15.30 M 0.0653 μs ±704.62% 0.0600 μs 0.24 μs
Struct1-opt 14.29 M 0.0700 μs ±386.88% 0.0600 μs 0.150 μs
Struct9-first 13.78 M 0.0726 μs ±448.21% 0.0600 μs 0.190 μs
Struct9-first-opt 12.62 M 0.0792 μs ±300.47% 0.0700 μs 0.160 μs
Struct9-last 12.49 M 0.0801 μs ±7517.80% 0 μs 0.40 μs
Struct9-last-opt 11.65 M 0.0858 μs ±156.14% 0.0800 μs 0.150 μs
Record1-remote 9.30 M 0.108 μs ±228.35% 0.100 μs 0.180 μs
Record9-last-remote 8.94 M 0.112 μs ±182.77% 0.100 μs 0.180 μs
Record9-first-remote 8.42 M 0.119 μs ±286.36% 0.110 μs 0.23 μs
Comparison:
Record1-direct 18.88 M
Record1-stock 17.61 M - 1.07x slower
Struct1 16.85 M - 1.12x slower
Record9-last-direct 15.99 M - 1.18x slower
Record9-first-direct 15.93 M - 1.19x slower
Record9-first-stock 15.90 M - 1.19x slower
Record9-last-stock 15.30 M - 1.23x slower
Struct1-opt 14.29 M - 1.32x slower
Struct9-first 13.78 M - 1.37x slower
Struct9-first-opt 12.62 M - 1.50x slower
Struct9-last 12.49 M - 1.51x slower
Struct9-last-opt 11.65 M - 1.62x slower
Record1-remote 9.30 M - 2.03x slower
Record9-last-remote 8.94 M - 2.11x slower
Record9-first-remote 8.42 M - 2.24x slower
Memory usage statistics:
Name Memory usage
Record1-direct 48 B
Record1-stock 48 B - 1.00x memory usage
Struct1 64 B - 1.33x memory usage
Record9-last-direct 112 B - 2.33x memory usage
Record9-first-direct 112 B - 2.33x memory usage
Record9-first-stock 112 B - 2.33x memory usage
Record9-last-stock 112 B - 2.33x memory usage
Struct1-opt 64 B - 1.33x memory usage
Struct9-first 128 B - 2.67x memory usage
Struct9-first-opt 128 B - 2.67x memory usage
Struct9-last 128 B - 2.67x memory usage
Struct9-last-opt 128 B - 2.67x memory usage
Record1-remote 48 B - 1.00x memory usage
Record9-last-remote 112 B - 2.33x memory usage
Record9-first-remote 112 B - 2.33x memory usage
**All measurements for memory usage were the same**
So yeah, likely something else is coming into play; maybe that one fewer module call is really significant combined with the above… Or I screwed up when I converted it. Would love someone to check where I went wrong. ^.^;
This is only “trivial” if you dismiss a bunch of possible edge cases. Today two separate modules can define a record named {:user, _} with the same number of fields. So which one wins the protocol implementation and consolidation?
What you say would only be possible if we:
tie records to modules
each module can only have a single “tied record”
Which is how structs behave, and out of the gate it rules out all Erlang records. Unless we use another implementation of protocols, one that is not open for extension as Elixir’s is. So at the end of the day, you won’t have records as they are today OR you won’t have protocols as they are today.
Honestly, such a setup should error in my opinion, as it would in a static language with similar consolidation. There should only be a single implementation for, say, a given struct or record; implementing it multiple times always indicates a possible fault and should error at the earliest opportunity (don’t structs already error if implemented multiple times?).
I’d say that is what should be done, and yes, that rules out Erlang records unless someone does defmodule :some_erlang_record do ... end of course, which is quite feasible as there can only be one such module in the system (otherwise wrap it up in another record; a single indirection there is still quite cheap).
EDIT: Just for some somewhat unrelated history, back in my Erlang days I tended to use a very generic name like :state for non-public records, or the name of the module itself for public records (or ‘namespaced’ on the module name when a module had many records). I don’t think that’s an uncommon pattern, but there are still easy workarounds: either defmodule the name or wrap it.
Exactly. If you want protocol dispatch for records, then you need to add these restrictions. But this is not how records have been used and adding these restrictions would leave many records unsupported. Especially considering that we inherited many of them from Erlang.
Defining modules around records wouldn’t always work either. For example, Erlang has records that are named after the modules that define them. So you obviously can’t redefine those modules unless we create some sort of special convention for wrapper modules, which ends up adding even more complexity and slowing down dispatch.
And again, this still does not solve the problem with false positives. It is absolutely ok for me to define a tuple {:user, :ok} that accidentally matches a record. The reason why structs solve this is exactly because we use the __struct__ key, which is unlikely to conflict.
To be clear, I am not saying it is not possible, I am just saying that the claim it can be “trivially addressed” is unfair because the trivial solution comes with many requirements and pitfalls. And if the requirements end-up making records work pretty much like structs, except they use tuples underneath (which aren’t that much faster anyway and have their own downsides), then it is worth asking what is the point anyway.
That’s why I think the restrictions are good. The fact that there can be multiple types of records with the same signature (same first-element atom and same number, and potentially even type, of fields) is specifically why they need to be restricted. It would rule out overlapping records, but that’s just part of the ‘interface definition’ of the protocols. If people don’t want that restriction, they should use something else that can distinguish between signature-identical records (like protocol_ex). For the most general use these restrictions would be far more than sufficient, as they would handle the cases where people actually want to use them.
As you stated, the __struct__ key in structs is just an ‘unlikely to conflict’ convention; it’s not impossible for, say, a deserialized JSON map to have such a key, suddenly be passed through a protocol, and fail when it never failed with other deserialized data before. It’s possible, just not likely; the same goes for these records. In general I would whole-heartedly design it such that only records defined for such protocols are allowed, not just any record, without some extra support like a wrapper (and wrapper structs are not a rare thing to have).
Personally I prefer using records over structs because I prefer nominally typed structures (Elixir records) over row-typed structures (Elixir structs): not just because they are ~19% faster in the simple case (and can be a lot faster for larger, mostly-read-only records), but because they make it impossible to accidentally pass one to row-destructured function heads (which I have accidentally done in Elixir on more than a few occasions). If Elixir were a statically typed language this would be significantly less of an issue, but it’s specifically because Elixir is so dynamically typed that matching becomes so ‘fluffy’ when we don’t necessarily want it to.
Plus sometimes working with tuples is just outright easier than maps. ^.^
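A tiny sketch of that nominal-vs-row distinction (the module and field names here are made up):

```elixir
defmodule Shapes do
  require Record
  # A record is a tagged tuple: {:point, x, y}.
  Record.defrecord(:point, x: 0, y: 0)

  def make(x), do: point(x: x)

  # Nominal: only a {:point, _, _} tuple can reach this head.
  def record_x(point(x: x)), do: x

  # Row-typed: *any* map with an :x key matches, struct or not.
  def loose_x(%{x: x}), do: x
end

Shapes.record_x(Shapes.make(1))        #=> 1
Shapes.loose_x(%{x: 2, extra: :data})  #=> 2
```

The record head rejects anything that isn’t the right tagged tuple, while the map head happily accepts unrelated maps, which is the accidental-match hazard described above.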
Right. I am not disagreeing with this. My only point is if you add restrictions, then they are not records as they are known today.
If this happens, it is a bug in the JSON decoding code, because it has obvious security implications. AFAIK, jason and poison do not allow it.
The problem is not even about JSON or external formats. Just to provide a concrete example, when we supported protocols for records in Elixir, we would have bugs when inspecting Macro.Env. That’s because Macro.Env has fields like {SomeModule, [some_function: 1]} and if SomeModule defines a record of one element, Inspect would be called with the wrong value, and everything would fail. So the claim that the chance for conflicts in records is the same as structs is not true, it is much more common. If Elixir was statically typed then we could rule out based on typing, but Elixir isn’t - and it isn’t the point of this discussion either.
In order to use records, you would always have to type the record name. And if you always add the struct name, this issue doesn’t exist either. The row-destructure feature from structs can be useful in certain occasions and fully removing it is actually one extra downside in records.