Hi, I have a ~200 MB binary file that I am trying to load into a tuple of ints. My previous code used the binary directly to get an int, like so:
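def get_table() do
  # File.read!/1 returns the binary itself (File.read/1 returns {:ok, binary})
  File.read!("data/HandRanks.dat")
end
def get_int(table, index) do
  # Each int is 4 bytes, little-endian, unsigned
  :binary.part(table, {index * 4, 4})
  |> :binary.decode_unsigned(:little)
end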
This code worked fine and was reasonably performant, but I wanted to test other methods of accessing the ints. There are around 32 million ints total in the file. First I built a list containing all of the ints, starting from the end of the binary, and when that was done I called List.to_tuple to convert it to a tuple, like so:
def gen_tuple(table) do
  length = div(byte_size(table), @int_size)
  gen_tuple([], length - 1, table)
end
def gen_tuple(list, -1, _) do
  List.to_tuple(list)
end
def gen_tuple(list, current, table) do
  new_list = [get_int(table, current) | list]
  gen_tuple(new_list, current - 1, table)
end
with @int_size being 4. When I run my gen_tuple I get this error:
** (ArgumentError) argument error
:erlang.list_to_tuple([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ā¦])
I've tried testing a few different things. The list is the correct size, and running things like length(list), Enum.slice, List.first, and List.last on the list doesn't throw errors and gives the answers I would expect. is_list returns true, both for the Elixir version and for :erlang.is_list.
If I use Enum.take to keep only the first elements of the list, List.to_tuple sometimes runs without error. I figured out that if the list is cut down to length 16,777,215 it runs without error; anything with length 16,777,216 or above gives me the error. I sliced the list at that spot and didn't notice anything improper. Element 16,777,215 is the number 24788 and the next element is the number 24792. I can even slice the list to take 50 or so elements in the middle that include this "problem spot" and List.to_tuple runs fine, so I'm really confused about what is going on.
The length of the list shouldn't cause any problems, the list isn't improper as far as I can tell, and the "problem element" doesn't seem to be a problem when it is part of a smaller list. Can anyone help me figure out what I'm doing wrong here? I get the feeling I'm missing something obvious!
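In case it helps anyone reproduce this without stepping through the recursion, the list-building step described above can also be sketched as a binary comprehension. This is only a sketch: the module name HandRanksSketch and the function to_list/1 are hypothetical names, and it assumes every entry is a 4-byte little-endian unsigned integer, the same layout get_int reads.

```elixir
defmodule HandRanksSketch do
  # Hypothetical helper, not part of the original code.
  # Decodes a binary of 4-byte little-endian unsigned integers into a list,
  # preserving the order in which the integers appear in the binary.
  def to_list(table) when is_binary(table) do
    for <<value::little-unsigned-integer-size(32) <- table>>, do: value
  end
end

# Small in-memory example instead of the real data file
HandRanksSketch.to_list(<<1, 0, 0, 0, 2, 0, 0, 0>>)
# => [1, 2]
```

This only replaces the list construction; the List.to_tuple call on the resulting list is unchanged.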
In case anyone is feeling unreasonably helpful the binary file can be downloaded here: https://github.com/chenosaurus/poker-evaluator/raw/master/data/HandRanks.dat
If anyone is familiar with poker math, it's the really awesome lookup table used in the 2+2 forum poker evaluator algorithm, used for lightning-fast poker hand evaluations.