Programming Phoenix book: why limit the password length?

Hello!

I'm reading Programming Phoenix >= 1.4 to get a full picture of the framework.

Chapter 5 deals with authentication. In it, we implement validation as such:

  # The password rules are intentionally weak, as this implementation's focus is being a learning material
  def registration_changeset(user, params) do
    user
    |> changeset(params)
    |> cast(params, [:password])
    |> validate_required([:password])
    |> validate_length(:password, min: 6, max: 100)
    |> put_pass_hash()
  end

I’m puzzled by the max length requirement. Why cap the password length at 100? I’m also reading another book (on a different technology), and its author applies the same kind of cap.

I understand, of course, that having a longer password is probably impractical but still, if a user wants to store a password of length 101, 150 or even 200 (generated with a password manager), I’d like to know why I should prevent that?

By capping the length and returning an error about it, we could wrongly hint that we are storing the password in clear text (as if the max column length were 100), since the input length has no effect on the output length of a hash function.
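To make that last point concrete, here is a minimal Python sketch. It uses plain SHA-256 from the standard library purely for illustration (not as a password-storage scheme), and the helper name `sha256_hex` is made up for this example:

```python
import hashlib

def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of the UTF-8 encoded password as hex."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

short_digest = sha256_hex("a" * 6)
long_digest = sha256_hex("a" * 200)

# Both digests are 64 hex characters (256 bits), whatever the input length.
print(len(short_digest), len(long_digest))
```

So a column sized for the digest never needs to grow with the password, which is why a cap of 100 cannot be explained by storage alone.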

My understanding has always been that a hash function’s input length does not matter (that much) in terms of compute time:

$ time ruby -e "print 'a'*200" | sha256sum 
c2a908d98f5df987ade41b5fce213067efbcc21ef2240212a41e54b5e7c28ae5  -

real    0m0.161s
user    0m0.107s
sys     0m0.060s


$ time ruby -e "print 'a'*200000" | sha256sum 
2287d207f24a941ff3b56c04c8a25ad56b63e3023207b3bb5b4ac0c9869d74be  -

real    0m0.133s
user    0m0.082s
sys     0m0.040s

It looks like more recent algorithms may care about it. I can’t deduce much from this limited experiment though:

$ ruby -e "print 'a'*100" | argon2 salt123456
Type:           Argon2i
Iterations:     3
Memory:         4096 KiB
Parallelism:    1
Hash:           400680f3e4793f230c859946e19c9c49b0311bf3cbfd1b564c5cfaecd86d0992
Encoded:        $argon2i$v=19$m=4096,t=3,p=1$c2FsdDEyMzQ1Ng$QAaA8+R5PyMMhZlG4ZycSbAxG/PL/RtWTFz67NhtCZI
0.019 seconds
Verification ok

Type:           Argon2i
Iterations:     3
Memory:         4096 KiB
Parallelism:    1
Hash:           639df3698b92be93a2dc0b6dc6709eba334d0c074451bb8f22b0a30cfcb4531e
Encoded:        $argon2i$v=19$m=4096,t=3,p=1$c2FsdDEyMzQ1Ng$Y53zaYuSvpOi3AttxnCeujNNDAdEUbuPIrCjDPy0Ux4
0.019 seconds
Verification ok

$ ruby -e "print 'a'*200" | argon2 salt123456
Error: Provided password longer than supported in command line utility

I used the argon2 command-line utility, but the book uses the PBKDF2 algorithm. It’s unclear why the author capped the input length here; there’s an open GitHub issue about it.

I saw some conflicting viewpoints online, so I thought I’d ask here.

Maybe someone knowledgeable can enlighten me 🙂


Okay, so the book later recommends having a look at the OWASP Cheat Sheet Series to learn more.

The section on PBKDF2 is of value, especially this paragraph:

When PBKDF2 is used with an HMAC, and the password is longer than the hash function’s block size (64 bytes for SHA-256), the password will be automatically pre-hashed. For example, the password “This is a password longer than 512 bits which is the block size of SHA-256” is converted to the hash value (in hex) fa91498c139805af73f7ba275cca071e78d78675027000c99a9925e2ec92eedd. A good implementation of PBKDF2 will perform this step before the expensive iterated hashing phase, but some implementations perform the conversion on each iteration. This can make hashing long passwords significantly more expensive than hashing short passwords. If a user can supply very long passwords, there is a potential denial of service vulnerability, such as the one published in Django in 2013. Manual pre-hashing can reduce this risk but requires adding a salt to the pre-hash step.
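The automatic pre-hashing described in that paragraph can be observed directly. A minimal sketch, assuming Python's standard `hashlib`; the salt and the low iteration count are illustrative values chosen so the example runs quickly, not recommendations:

```python
import hashlib

password = b"x" * 200          # longer than SHA-256's 64-byte block size
salt = b"hypothetical-salt"    # illustrative value only
iterations = 1000              # kept low so the example runs quickly

# PBKDF2 fed with the full-length password.
direct = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

# PBKDF2 fed with the SHA-256 digest of the password instead.
prehashed_key = hashlib.sha256(password).digest()
via_prehash = hashlib.pbkdf2_hmac("sha256", prehashed_key, salt, iterations)

# HMAC replaces an over-long key with its hash, so both results match.
print(direct == via_prehash)
```

Since the two derived keys are identical, the only cost difference between a 200-byte and a 200,000-byte password should be that one pre-hash, provided the implementation does it once rather than on every iteration.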

So it seems that limiting the password length may be a useful precaution to prevent denial of service attacks.

Quoting Django's 2013 advisory about the denial-of-service issue:

Unfortunately, this complexity can also be used as an attack vector. Django does not impose any maximum on the length of the plaintext password, meaning that an attacker can simply submit arbitrarily large – and guaranteed-to-fail – passwords, forcing a server running Django to perform the resulting expensive hash computation in an attempt to check the password. A password one megabyte in size, for example, will require roughly one minute of computation to check when using the PBKDF2 hasher.

This allows for denial-of-service attacks through repeated submission of large passwords, tying up server resources in the expensive computation of the corresponding hashes.

Although this is most effective against algorithms which are designed to be relatively “slow” to compute, such as PBKDF2 (which, again, is the default hasher in Django’s authentication framework), it also is effective against other hashers, as the time to compute the hash generally grows with the size of the password.

To remedy this, Django’s authentication framework will now automatically fail authentication for any password exceeding 4096 bytes.

So that’s somewhere between 1k and 4k characters in UTF-8 (1 to 4 bytes per character), if my math is right.
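As a rough illustration of that remedy, here is a sketch in Python of a check that rejects over-long input before any expensive hashing happens. The function name `verify_password`, the salt, and the iteration count are hypothetical; only the 4096-byte cap comes from the advisory quoted above:

```python
import hashlib
import hmac

MAX_PASSWORD_BYTES = 4096  # the cap quoted in the Django advisory above

def verify_password(candidate: str, salt: bytes, expected: bytes,
                    iterations: int = 1000) -> bool:
    """Reject over-long input before doing any expensive hashing."""
    raw = candidate.encode("utf-8")
    if len(raw) > MAX_PASSWORD_BYTES:
        return False  # fail fast: no PBKDF2 work is spent on this input
    derived = hashlib.pbkdf2_hmac("sha256", raw, salt, iterations)
    return hmac.compare_digest(derived, expected)

salt = b"hypothetical-salt"  # illustrative value only
stored = hashlib.pbkdf2_hmac("sha256", b"correct horse", salt, 1000)

print(verify_password("correct horse", salt, stored))  # matching password
print(verify_password("a" * 5000, salt, stored))       # over the byte cap

# UTF-8 uses 1 to 4 bytes per character, so the cap allows 1024 to 4096 of them.
print(MAX_PASSWORD_BYTES // 4, MAX_PASSWORD_BYTES // 1)
```

The cap is measured in bytes rather than characters because bytes are what the hash function actually consumes.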

Before posting, I searched around and found 1024 characters cited as the max length of an HTML input field. So maybe that’d be a good default length to cap a password at.

The issue I found was specifically related to PBKDF2, but if you have more general info to share, please do 🙂


The documentation in the mix phx.gen.auth files states that the limitation is there to prevent silent truncation of the password by db-level limits, which would be bad.


Thanks. Could you point me at the doc you’re referring to? I don’t understand your answer since the output’s length has no relation to the input’s length (per my understanding).


Bcrypt has a maximum input of 72 bytes; anything beyond that is silently truncated. Argon2 does not have such a limit, AFAIK.
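Because that limit is in bytes, not characters, a length check has to count the encoded bytes. A small Python sketch of the idea; the helper name `fits_bcrypt` is made up, and it only validates length, it does not call bcrypt itself:

```python
BCRYPT_MAX_BYTES = 72  # the bcrypt input limit mentioned above

def fits_bcrypt(password: str) -> bool:
    """True if the UTF-8 encoding of the password is at most 72 bytes."""
    return len(password.encode("utf-8")) <= BCRYPT_MAX_BYTES

ascii_pw = "a" * 72          # 72 characters, 72 bytes
accented_pw = "\u00e9" * 40  # 40 characters ("é"), but 80 bytes in UTF-8

print(fits_bcrypt(ascii_pw), fits_bcrypt(accented_pw))
```

In Ecto, I believe the equivalent is passing `count: :bytes` to `validate_length`, but check the docs for your version.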


Good to know, thanks 🙂