Best way to generate random unsigned 64-bit integers?

Hey all, wanted to run a quick sanity check by you.

I’m building an analytics database and I need to generate random IDs for sessions in the app server. To achieve better compression in the database, I don’t want to use UUID strings there but I’d rather use UInt64 data types.

So I’m left with the question of what’s the best way to generate random UInt64 values? Here’s my current approach:

def random_uint64() do
  :crypto.strong_rand_bytes(8) |> :binary.decode_unsigned()
end

Any obvious issues with doing this?

Not sure what database you are using, but at least PostgreSQL has a real UUID type that is not stored as a string in the database. Of course, that’s 128 bits instead of 64.

If you intend to store 64-bit unsigned integers in the DB, note that PostgreSQL’s bigint type is signed, so it cannot hold every unsigned 64-bit value: anything above 2^63 - 1 will not fit.
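If the IDs ever do need to land in a signed 64-bit column, one option is to read the same 8 random bytes as a signed integer instead. This is only a rough sketch: the module and function names are made up for illustration, and it assumes the signed value is acceptable as an ID everywhere else in the pipeline.

```elixir
defmodule SignedId do
  # Hypothetical helper: reads 8 random bytes as a signed 64-bit integer,
  # so the result always fits a signed bigint column.
  def random_int64 do
    <<n::signed-integer-size(64)>> = :crypto.strong_rand_bytes(8)
    n
  end

  # Hypothetical helper: reinterprets an unsigned 64-bit value as signed,
  # keeping the same bit pattern.
  def to_signed(u) when u in 0..0xFFFF_FFFF_FFFF_FFFF do
    <<s::signed-integer-size(64)>> = <<u::unsigned-integer-size(64)>>
    s
  end
end
```

Both functions keep the full 64 bits of randomness; only the interpretation of the top bit changes.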

If you are not using PostgreSQL, just ignore this post. :slight_smile:

Thanks :slight_smile: I’m actually moving from Postgres to Clickhouse, here’s the repo: https://github.com/plausible-insights/plausible

I was using the UUID type in Postgres, but some Clickhouse experts I consulted told me that UInt64 is smaller and has enough cardinality for my use case.

(Twitter) snowflake IDs also come to mind, though I assume you don’t need the features…

Oh it’s you. :slight_smile: I use UUID in my analytics app, but I guess I have a lot less traffic than Plausible!

The only issue I can think of is low entropy. There’s an answer about it on SO: https://stackoverflow.com/a/30652871 – but I don’t know how often low entropy happens on modern machines / OSes / OpenSSL versions.

IMO your code is good enough. :crypto.strong_rand_bytes is much better than most out-of-the-box pseudo-random generators.

The problem is that it can deplete the entropy source and is slower than most PRNGs. So the question is what is actually needed. @ukutaht maybe you could use UUIDv6, which is not random at all. That fits especially well for an analytics tool, which will need to store the time anyway.
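For comparison, a non-cryptographic version could lean on Erlang's :rand module. This is just a sketch with a made-up module name, and it assumes the IDs only need to be unlikely to collide, not hard to guess:

```elixir
defmodule FastId do
  # Hypothetical helper: non-cryptographic random UInt64 from Erlang's :rand.
  # :rand.uniform(n) returns an integer in 1..n, so subtracting 1
  # shifts the result into 0..2^64 - 1.
  def random_uint64 do
    :rand.uniform(0x1_0000_0000_0000_0000) - 1
  end
end
```

:rand seeds itself per process on first use, so no setup is needed, but the output is predictable and should not be used for anything security-sensitive.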

:crypto.strong_rand_bytes calls into the OpenSSL library, which in turn reads from system APIs (/dev/urandom on POSIX, CryptGenRandom on Windows). I don’t really know about Windows, but running out of entropy in a properly initialized /dev/urandom is not a concern, see e.g. https://security.stackexchange.com/questions/186086/is-always-use-dev-urandom-still-good-advice-in-an-age-of-containers-and-isola

Anyway, cryptographically strong randomness is way more than what OP needs. It’ll work fine, but just 8 bytes from a UUID or a snowflake ID would suffice.
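To make the snowflake idea concrete, here is a rough sketch of packing a snowflake-style ID into 64 bits with Elixir's bit syntax. The layout (1 unused bit, 41-bit millisecond timestamp, 10-bit node ID, 12-bit sequence) is the commonly described Twitter one; the module name, the custom epoch, and the way node and seq are passed in are all assumptions for illustration.

```elixir
defmodule SnowflakeSketch do
  # Hypothetical custom epoch in milliseconds (2020-01-01T00:00:00Z).
  @epoch_ms 1_577_836_800_000

  # Packs: 1 unused bit | 41-bit timestamp | 10-bit node | 12-bit sequence.
  # The caller supplies node and seq; a real generator would also need a
  # per-node counter for seq, which is not shown here.
  def build(now_ms, node, seq)
      when now_ms >= @epoch_ms and node in 0..1023 and seq in 0..4095 do
    ts = now_ms - @epoch_ms

    <<id::unsigned-integer-size(64)>> =
      <<0::size(1), ts::size(41), node::size(10), seq::size(12)>>

    id
  end
end
```

Something like SnowflakeSketch.build(System.system_time(:millisecond), 1, 0) would then give an ID that sorts roughly by creation time and always fits in both UInt64 and a signed bigint, since the top bit stays 0.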