Module with too many functions exausts erlang atoms

Because of the fact that there are 3.4*10³⁸ possible addresses in IPv6, I am wondering about the rest of your constraints that make you believe that, unless your ranges are very convenient, you could fit this in a 10⁶ lookup table?

Instead, maybe it is worth looking into bloom filters (combined with a somewhat slower but more scalable access method like for instance ETS).

2 Likes

We work primarily with IPv4 (for now), which only contains 4,294,967,296 IPs. Still the suggestion of bloom filters looks interesting!

You really don’t need an interval tree for this, you just need a straight-up sorted tree, with the possibility of “null” entries, then look up the largest entry <= the provided IP. Now, a b-tree would give you better lookup performance than any kind of binary tree, but I’d try some kind of simple binary tree first–a quick google search finds some Erlang implementations.

(You need an interval tree when you want to do things like find overlapping ranges…)

2 Likes

Another data structure that might be worth looking at is a Patricia tree, which, amongst other places, is used in multiple Blockchain-implementations exactly because of the large keyspace.

2 Likes

True, any kind of prefix trie, with either 4- or 8-bit symbols, could work. My suspicion is that this would provide little to no advantage over simpler trees for IPv4, but might be an advantage for IPv6…

2 Likes

I would consider something like OPA https://www.openpolicyagent.org/ running as a sidecar for each elixir deployment. Or potentially an ETS table with read optimization.

Even something like memsql/redis as a sidecar could work well. You will have some initial initialization time, but it should be quite fast after that.

I use the ETS table with a whitelisting api edge solution and it works well, but have never tried it at the scale you are referring to.