Is there an integer size limit?

jyf · August 23, 2024, 2:57pm

I was playing with some very large numbers, for fun, and ran across some type of limit that I can’t find the source of in the docs.

10**1_262_593 <— blows up
10**1_262_592 <— is fine

If I understand the docs correctly, the bigint limit should be limited by system ram, but even though there is plenty of ram available Elixir throws an error:

** (SystemLimitError) a system limit has been reached
    :erlang.*(10000....

I have tried with a pure Erlang integer exponentiation function and it has the same limit, so I guess it is some underlying limit somewhere but I can’t seem to find where.

Any pointers?

lud · August 23, 2024, 7:24pm

Math is implemented by the C library of the system AFAIK so I bet the limitation is inherited from there.

(10**1_262_592 + 10**1_262_592) should work though it takes time ; the BEAM indeed supports infinitely large integers (as long as there is memory of course).

Try (10**1_262_592 + 10**1_262_592) |> rem(10) to bypass the print from the shell. But it’s still long (did not wait).

Nicd · August 23, 2024, 7:46pm

A user on the Discord has dug up some interesting things:

I’ve found some hints while looking for causes of the system limit error. There’s a BIG_ARITY_MAX , which prevents larger bignums to be created (on ARCH_64 defined as (1 << 16)-1 ). For example, in this place otp/erts/emulator/beam/big.c at 494647adab20a19afcaa8eee98d427b8af5886de · erlang/otp · GitHub. This is the commit with a hint that bignums are artificially limited in size: don't create oversize bignums in binary matching · erlang/otp@7e147a0 · GitHub.

Nicd · August 23, 2024, 7:49pm

Result on my system (macOS on ARM):

iex(1)> (10**1_262_592 + 10**1_262_592)
** (SystemLimitError) a system limit has been reached
    :erlang.+(10000000000000000000000000000000000000000000000...

Nicd · August 23, 2024, 9:24pm

Just for fun, I tested this with my Gleam package bigi, which runs on Erlang and JS. This code crashes on the BEAM, but passes on Node.JS 20.10.0. So whatever the limit is, it’s lower than on Node. On both targets, the math operation is really quick on my machine, much quicker than in Elixir, which I suppose is due to a different implementation.

pub fn hello_world_test() {
  should.equal(
    bigi.power(bigi.from_int(10), bigi.from_int(1_262_593)),
    bigi.power(bigi.from_int(10), bigi.from_int(1_262_593)),
  )
}

lud · August 23, 2024, 10:34pm

Ah sorry I should have tested!

jyf · August 24, 2024, 5:53am

I added this reply to the Discord post a few mins ago, so I’m adding it here too for context:

Great find. It’s a bit over my head though… what does “arity” mean in this context? (1 << 16) -1 is a pretty small number, and it is not obvious how the limit is applicable.
Also note this comment: https://github.com/erlang/otp/blob/494647adab20a19afcaa8eee98d427b8af5886de/erts/emulator/beam/erl_iolist.c#L936
And the limit as a power of 2 in case that offers any insight:
2**4_194_239 <— is fine
2**4_194_240 <— blows up

bjorng · August 24, 2024, 7:54am

The limit for big integers used to be limited by RAM. There has always existed a limit, but in practice one would run out of RAM before reaching the limit for bigints. That is no longer the case, especially when running on a 64-bit computer.

The current limit is about 4 MBits or 4,194,240 bits. The value of BIG_ARITY_MAX should be multiplied by the number of bits in a machine word (32 or 64) to obtain the maximum number of bits.

The limit used to be higher, but we lowered it in Optimize binary_to_integer/1 and friends by bjorng · Pull Request #7426 · erlang/otp · GitHub to ensure that operations on big integers would always finish in a reasonable time.

Just for fun, I did a micro-optimization of your power/2 function:

power(Base, Exp) when is_number(Base), is_integer(Exp),
                      0 =< Exp, Exp =< 1 bsl 24 ->
    {ok, do_power(Base, Exp)};
power(_Base, _Exp) ->
    {error, nil}.

Guard tests can help the JIT to emit better native code. For example, when the exponent is known to be a small integer, the div and rem operations can be simplified. Without the guard tests, the div operation would look like so:

# i_int_div_jIssd
    mov x2, 47
    and x13, x26, 15
    cmp x13, 15
    b.ne L70
# optimized div by replacing with right shift
    orr x0, x13, x26, 1
    b L71
L70:
    mov x1, x26
    mov x3, 4347964816
    bl L73
L71:
    mov x27, x0

The first operand might not be a small integer, so the JIT needs to emit code to handle that. With the guards tests, the code for the div operation can be simplified to:

# i_int_div_jIssd
# skipped test for smalls operands and overflow
    mov x13, 15
# optimized div by replacing with right shift
    orr x26, x13, x26, 1

The runtime of power/2 is dominated by the multiplication of big integers, so the actual reduction in runtime with the guard tests is minuscule.

jyf · August 24, 2024, 9:19am

Many thanks for the very detailed response!

I had just recompiled Erlang with an increased BIG_ARITY_MAX and was testing it right as I saw your reply.

It might be beneficial to update the documentation:
https://www.erlang.org/doc/system/memory.html
to clarify the current limit.