Hi guys,
I am having an interesting issue with Elixir compile time. The case is really simple: I needed a database which maps an IP address to a country. I know there are other solutions for this, but let’s focus on what I’ve observed.
The database is a flat CSV file containing “start IP”, “end IP” and “country code”. Here is an example:
"2.16.108.0","2.16.108.255","ES"
"2.16.109.0","2.16.109.255","PL"
I took it from https://db-ip.com
I decided to create a bunch of do_whereis/1 function definitions with guards. It looks to be an easy job:
dbip_stream =
  File.stream!(Path.join([__DIR__, @db]), [], :line)
  |> CSV.decode
  |> Stream.map(fn [_ip1, ip2, country] -> {ip2integer(ip2), String.to_atom(country)} end)

for {ip2, country} <- dbip_stream do
  defp do_whereis(ip) when ip <= unquote(ip2), do: unquote(country)
end
(ip2integer translates an IP string to the corresponding integer, e.g. 192.168.1.1 == 256*256*256*192 + 256*256*168 + 256*1 + 1; the rest should be obvious. do_whereis/1 is supposed to translate this integer to the corresponding country code.)
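For readers who want to try this without the CSV file, here is a minimal sketch of the idea. The module names IpSketch and GeoSketch, the public whereis/1 wrapper and the nil fallback clause are assumptions made for illustration; the body of ip2integer/1 is one possible implementation, not necessarily the one used above. The two sample rows are inlined in place of the real database.

```elixir
defmodule IpSketch do
  # Hypothetical stand-in for ip2integer/1:
  # "a.b.c.d" -> a*256*256*256 + b*256*256 + c*256 + d
  def ip2integer(ip) when is_binary(ip) do
    ip
    |> String.split(".")
    |> Enum.map(&String.to_integer/1)
    |> Enum.reduce(0, fn octet, acc -> acc * 256 + octet end)
  end
end

defmodule GeoSketch do
  # Sample rows from the post, inlined instead of streamed from the CSV file.
  @rows [
    {"2.16.108.0", "2.16.108.255", :ES},
    {"2.16.109.0", "2.16.109.255", :PL}
  ]

  # One guarded clause per row, ordered by ascending "end IP".
  # Only the upper bound is checked, so anything below the first
  # range also matches the first clause.
  for {_ip1, ip2, country} <- @rows do
    upper = IpSketch.ip2integer(ip2)
    defp do_whereis(ip) when ip <= unquote(upper), do: unquote(country)
  end

  # Assumed fallback for addresses above the last range.
  defp do_whereis(_ip), do: nil

  def whereis(ip), do: ip |> IpSketch.ip2integer() |> do_whereis()
end
```

With these two rows, GeoSketch.whereis("2.16.109.7") matches the second clause, and an address above 2.16.109.255 falls through to the fallback.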
Everything works, but the compilation takes forever. The database contains about 400,000 lines (I’ve filtered it to IPv4 only). Interestingly, the compilation time is not linear. I ran some tests on the first N lines, and these are the results:
1000 lines -> 0,96s user 0,47s system 157% cpu 0,905 total
10000 lines -> 9,06s user 1,38s system 128% cpu 8,121 total
20000 lines -> 30,55s user 2,45s system 114% cpu 28,734 total
50000 lines -> 225,72s user 7,43s system 105% cpu 3:41,74 total
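A rough way to reproduce this kind of measurement without the database is sketched below. It is only an approximation of the setup above: the module name CompileTiming, the synthetic ranges and the generated module names are assumptions, and the clauses are built as source text and compiled with Code.compile_string/1 rather than generated with a for comprehension inside a module body.

```elixir
defmodule CompileTiming do
  # Builds the source of a module with `n` guarded clauses (synthetic
  # upper bounds, 200 made-up country atoms) and returns the time
  # spent compiling it, in milliseconds.
  def measure(n) do
    clauses =
      Enum.map(1..n, fn i ->
        "  def whereis(ip) when ip <= #{i * 256}, do: :c#{rem(i, 200)}\n"
      end)

    source = IO.iodata_to_binary(["defmodule Generated#{n} do\n", clauses, "end\n"])

    # :timer.tc/1 returns {microseconds, result_of_fun}.
    {micros, _modules} = :timer.tc(fn -> Code.compile_string(source) end)
    div(micros, 1000)
  end
end

# Doubling the clause count each step makes non-linear growth easy to spot.
for n <- [500, 1_000, 2_000] do
  IO.puts("#{n} clauses -> #{CompileTiming.measure(n)} ms")
end
```

The absolute numbers will differ from the ones above, since this measures only the compiler and not CSV parsing; what matters is how the time grows when the clause count doubles.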
Just for fun (I don’t want to create 16 million function declarations) I’ve tried to generate a function clause for every single IP with:
for {ip1, ip2, country} <- dbip_stream, ip <- ip1..ip2 do
  defp do_whereis(unquote(ip)), do: unquote(country)
end
but obviously it is even worse in terms of compilation time.
Of course I could do the same in many other ways, but that is not the point; I would just like to understand what is going on.
Is this method a bad practice? Elixir does it itself, as far as I remember, with Unicode. Did I hit the practical limits?