How to hash a random number?

Hi,
I’m trying to hash a random number but can’t make it work…

num = Enum.random(1..9999)
hashed_num = :crypto.hash(:sha256, num)

I’m getting “1st argument: not an iodata term” error. From my understanding, Enum.random creates a number and :crypto.hash wants an iodata. The same thing happens if I use :crypto.rand_uniform(1, 9999) instead of Enum.random… can anyone help?

You have to convert the integer to a bitstring then :crypto.hash will accept it:

num = 999
width_in_bits = 64 # unsure if it's possible to get this dynamically?
bin = <<num::integer-size(width_in_bits)>>
hashed_num = :crypto.hash(:sha256, bin)
{num, bin, hashed_num}

https://hexdocs.pm/elixir/Kernel.SpecialForms.html#<<>>/1

Someone clever can let us know if it’s possible to get the bit width of a number dynamically. I’ve only ever used this with fixed/known widths. You could just wing it with a huge width if you didn’t really care. AFAIK BEAM integers have no maximum value, so there is no maximum width, but you can generally scope it to your use case.

An integer() value is not a subtype of iodata(), and even the integers allowed as part of an iolist need to be byte values, so in the range 0..255. You likely want to encode your integer to a binary format first.
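
To make the iodata rule concrete, here is a minimal sketch (the variable names are only for illustration). It shows a bare integer being rejected while a list of byte values is accepted, and that the byte list hashes the same as the equivalent binary:

```elixir
# A bare integer is not iodata, so :crypto.hash/2 raises.
rejected =
  try do
    :crypto.hash(:sha256, 999)
  rescue
    _error -> :not_iodata
  end

# A list of byte values (each in 0..255) is valid iodata.
# 57 is the ASCII code for "9", so this list holds the same bytes as "999".
from_byte_list = :crypto.hash(:sha256, [57, 57, 57])
from_binary = :crypto.hash(:sha256, "999")

IO.inspect(rejected)
IO.inspect(from_byte_list == from_binary)
```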

num =
  Enum.random(1..999)
  |> Integer.to_string()

hashed_num = :crypto.hash(:sha256, num)

This is the final code for future reference, converting integer to string as @LostKobrakai suggested was all that was needed.

You can transform any term to a binary (and then hash it) with :erlang.term_to_binary/1:

iex> :erlang.term_to_binary(42)
<<131, 97, 42>>
iex> :erlang.term_to_binary(42.0)
<<131, 70, 64, 69, 0, 0, 0, 0, 0, 0>>
iex> :erlang.term_to_binary({4, 2})
<<131, 104, 2, 97, 4, 97, 2>>

Note that’s hashing the string “999”, not the integer 999, which might make a difference for some purposes or across application contexts. (You could argue that settling on “we always hash everything as a UTF-8 string” is more portable and less complex.)
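
A quick sketch of that difference, assuming a 64-bit unsigned big-endian encoding for the integer (the width is an arbitrary choice for illustration): the two inputs are different byte sequences, so the digests differ.

```elixir
num = 999

# Hash of the three ASCII bytes "9", "9", "9".
as_text = :crypto.hash(:sha256, Integer.to_string(num))

# Hash of the integer encoded as 8 bytes (assumed width, big-endian by default).
as_integer = :crypto.hash(:sha256, <<num::unsigned-integer-size(64)>>)

IO.inspect(as_text == as_integer)
```

Whichever encoding you pick, every party that needs to reproduce the hash has to use the same one.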

Your hashing here is fixed width but for some things like encoding Base58 it makes a difference in payload size if nothing else.

t = DateTime.utc_now() |> DateTime.to_unix(:microsecond)

to_string_encode =
  t
  |> Integer.to_string()
  |> Base.encode64()
  |> dbg()

to_bin_encode =
  t
  |> then(fn x ->
    <<x::unsigned-integer-size(64)>>
  end)
  |> Base.encode64()
  |> dbg()

to_string_base =
  t
  # you can also pass a base from 2 to 36 (not 64!) as the second argument
  # this is *not* functionally the same thing as Base.encode64 though!
  |> Integer.to_string(32)
  |> dbg()

[
  to_string_encode: to_string_encode,
  to_bin_encode: to_bin_encode,
]

# the encoded string is a larger payload than the encoded integer
# => [to_string_encode: "MTY2NjQzNTI4MTAzNzU3NA==",
# =>  to_bin_encode:    "AAXrnTL3qQY="]

term_to_binary might be problematic as its leading byte is a version number (which I assume can change …) and the docs warn “There is no guarantee that this function will return the same encoded representation for the same term.”

https://www.erlang.org/doc/apps/erts/erl_ext_dist.html

https://www.erlang.org/doc/man/erlang.html#term_to_binary-1

There is a deterministic option, but it’s not stable across OTP versions:

Option deterministic (introduced in OTP 24.1) can be used to ensure that within the same major release of Erlang/OTP, the same encoded representation is returned for the same term. There is still no guarantee that the encoded representation remains the same between major releases of Erlang/OTP.
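
For reference, a minimal sketch of passing that option, assuming OTP 24.1 or later (older releases reject the option). As the quoted docs say, the resulting digest should only be compared within the same major OTP release:

```elixir
term = %{b: 2, a: 1}

# Requires OTP 24.1+; the encoding is only guaranteed stable within one major release.
encoded = :erlang.term_to_binary(term, [:deterministic])
digest = :crypto.hash(:sha256, encoded)

# The encoded binary still decodes back to the original term.
round_trip_ok = :erlang.binary_to_term(encoded) == term

IO.inspect({byte_size(digest), round_trip_ok})
```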

That does make me wonder though: how easy is it to safely and stably hash an actual composed data type? You could convert to some other format like JSON or MessagePack, but those aren’t guaranteed to be stably ordered. I guess you’re stuck converting each value to a bitstring/hash and hashing the combination?
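
One way that idea could look, as a rough sketch only: StableHashSketch and hash_map/1 are hypothetical names, not a library API, and the sketch assumes a flat map with string keys and non-negative integer values. It sorts the entries by key, hashes each key and value separately, and then hashes the combination:

```elixir
defmodule StableHashSketch do
  # Hypothetical helper: only handles flat maps of string keys to
  # non-negative integer values.
  def hash_map(map) when is_map(map) do
    map
    # Sort explicitly so the result does not depend on map iteration order.
    |> Enum.sort_by(fn {key, _value} -> key end)
    |> Enum.map(fn {key, value} ->
      # Hash key and value separately so field boundaries stay unambiguous.
      [
        :crypto.hash(:sha256, key),
        :crypto.hash(:sha256, :binary.encode_unsigned(value))
      ]
    end)
    # The nested list of binaries is valid iodata, so it can be hashed directly.
    |> then(&:crypto.hash(:sha256, &1))
  end
end

h1 = StableHashSketch.hash_map(%{"a" => 1, "b" => 2})
h2 = StableHashSketch.hash_map(Map.new([{"b", 2}, {"a", 1}]))
h3 = StableHashSketch.hash_map(%{"a" => 2, "b" => 1})

IO.inspect({h1 == h2, h1 == h3})
```

Nested or mixed-type data would need a type tag per value and a recursive walk, which this sketch deliberately leaves out.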

Not a text string. People are telling you to convert your integer to a bitstring.

Is this the right way?

Enum.random(1..999)
|> Integer.to_string()
|> Base.encode64()

No, you’re still producing a text string. You were pointed at several other ways above.

1..9999 |> Enum.map(&(&1 |> :math.log2() |> ceil()))

Smarter than max_n |> Integer.to_string(2) |> String.length() :smile:

Actually not, because I got it wrong: it should be log2 |> floor to find the index of the highest bit set (add 1 to get the number of bits).
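
A small sketch comparing the two approaches for positive integers (bit_width is just an illustrative name). Counting binary digits avoids floating point entirely, which may matter for very large integers where :math.log2/1 loses precision:

```elixir
# Number of bits needed for a positive integer, counted from its binary digits.
bit_width = fn n when is_integer(n) and n > 0 ->
  n |> Integer.digits(2) |> length()
end

# Same result via log2: floor gives the index of the highest set bit, so add 1.
via_log2 = floor(:math.log2(999)) + 1

IO.inspect({bit_width.(999), via_log2})
```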

Or, instead of counting bits, you can just let the Erlang runtime produce the smallest binary representation of any given integer: :binary.encode_unsigned/1.
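
A minimal sketch of that approach. :binary.encode_unsigned/1 is big-endian by default and only accepts non-negative integers:

```elixir
num = 999

# Smallest big-endian binary for the integer: 999 = 3 * 256 + 231.
bin = :binary.encode_unsigned(num)
hashed_num = :crypto.hash(:sha256, bin)

# The encoding is reversible with :binary.decode_unsigned/1.
decoded = :binary.decode_unsigned(bin)

IO.inspect({bin, decoded})
```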

Still, I wonder what use case the OP had in mind: it seems to me this is building a PRNG with questionable randomness properties. If the idea is to return 32 bytes of truly random data, use :crypto.strong_rand_bytes(32). Might be faster too…
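
If random bytes are the actual goal, a sketch could be as short as this (the hex encoding step is only an assumption about how the value might be displayed or stored):

```elixir
# 32 bytes from the cryptographically strong generator, no hashing needed.
token = :crypto.strong_rand_bytes(32)

# Optional: hex-encode for display or storage.
hex = Base.encode16(token, case: :lower)

IO.inspect({byte_size(token), String.length(hex)})
```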
