Securely checking if a password is compromised in Python
Have I Been Pwned is a service that lets users check whether their account or password has been compromised. It’s useful and important to know whether your password is compromised so you can change it. However, one problem with using the service via its web portal is that you need to type in your full password. Although Have I Been Pwned is trustworthy, it still doesn’t feel comfortable to share your valuable and hopefully secure password with a third-party service*.
K-anonymity + Have I Been Pwned
Fortunately, Have I Been Pwned provides an API that uses k-anonymity to check whether your password is compromised. The basic idea behind the protocol is that you don’t share your whole password, or more accurately the whole SHA-1 hash of your password, with the service provider; you only send a short prefix of that hash. This means your actual password is hidden in a much larger pool of passwords whose hashes share the same prefix.
For example, imagine the hash of “123” is 001ABCxxxxxxxx, and the hash of another password (“bestpassword”) is 001ADExxxxxxxx. Now, if I only ask the service which compromised password hashes start with 001A, the service can’t tell whether my password is “123” or “bestpassword”. In other words, under k-anonymity every entry is indistinguishable from at least k-1 other entries**.
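As a minimal sketch of the split described above, the snippet below hashes a password with SHA-1 and separates the prefix that would be sent from the suffix that stays on your machine. The prefix length of 5 hex characters is the one the Have I Been Pwned API uses; the function name `split_hash` and the `prefix_len` parameter are illustrative and not taken from passcheck.

```python
import hashlib
from typing import Tuple


def split_hash(password: str, prefix_len: int = 5) -> Tuple[str, str]:
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest of password."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]


prefix, suffix = split_hash("password")
print(prefix)  # the only part that would leave your machine
print(suffix)  # kept locally and compared against the server's reply
```

Because only `prefix` is transmitted, the server sees the same request for every password whose hash begins with those characters.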
How easy is it to find different values that share the same hash prefix?
The following code generates random values (byte arrays of length 10) until it finds 5 whose SHA-1 hex digests start with three zeros. You may notice the code below follows the idea behind proof-of-work mechanisms such as Hashcash and Bitcoin.
import os
import hashlib

n_matches = 0
while True:
    # Hash 10 fresh random bytes
    sha1 = hashlib.sha1()
    sha1.update(os.urandom(10))
    digest = sha1.hexdigest()
    # Keep only digests whose first three hex characters are zeros
    if digest[:3] == '000':
        n_matches += 1
        print(digest, n_matches)
    if n_matches >= 5:
        break
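A back-of-the-envelope calculation shows why the loop above finishes quickly. Each hex character takes one of 16 values, so a fixed prefix of n characters matches a uniformly distributed hash with probability 1/16^n. The sketch below turns that into an expected number of attempts; the helper name `expected_attempts` is illustrative.

```python
def expected_attempts(prefix_len: int, matches: int = 1) -> int:
    """Expected number of random hashes needed to see `matches` hits
    on one fixed hex prefix of length `prefix_len`."""
    return matches * 16 ** prefix_len


print(expected_attempts(3))     # 4096 attempts per match on average
print(expected_attempts(3, 5))  # 20480 attempts for the 5 matches above
print(16 ** 5)                  # 1048576 possible 5-character prefixes
```

The last line is the number of distinct buckets for a 5-character prefix, so once a collection of hashes is much larger than about a million entries, every prefix is shared by many of them.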
The Have I Been Pwned API is easy to use: you only need to send the first 5 characters of the SHA-1 hash of your password (the prefix) to the server. The server then replies with the remaining characters (the suffixes) of every matching hash, along with the number of hits for each.
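import getpass
import hashlib
from collections import defaultdict

import requests

PWNEDURL = "https://api.pwnedpasswords.com/range/{}"

# Assumption: the password is read interactively without echoing it
passwd = getpass.getpass("Enter password: ")

sha1 = hashlib.sha1()
sha1.update(passwd.encode())
hex_digest = sha1.hexdigest().upper()

# Only the first 5 hex characters are sent; the rest stays local
hex_digest_f5 = hex_digest[:5]
hex_digest_remaining = hex_digest[5:]

r = requests.get(PWNEDURL.format(hex_digest_f5))

# Map each returned hash suffix to its number of hits
leaked_passwd_freq = defaultdict(int)
for passwd_freq in r.content.splitlines():
    pass_parts = passwd_freq.split(b":")
    suffix = pass_parts[0].decode()
    freq = pass_parts[1]
    leaked_passwd_freq[suffix] = int(freq)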
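The listing above stops once the suffix counts are collected. One way to finish the check, sketched here under the assumption that the response body consists of `SUFFIX:COUNT` lines as parsed above, is to look up the locally held suffix in that table. The function names and the sample body are made up for illustration and are not taken from passcheck.

```python
from typing import Dict


def parse_range_body(body: bytes) -> Dict[str, int]:
    """Parse lines of b'SUFFIX:COUNT' into a {suffix: count} mapping."""
    counts = {}
    for line in body.splitlines():
        if not line.strip():
            continue  # skip blank lines
        suffix, _, count = line.partition(b":")
        counts[suffix.decode().upper()] = int(count)
    return counts


def hits_for(suffix: str, body: bytes) -> int:
    """Return how often the hash suffix appears; 0 means not found."""
    return parse_range_body(body).get(suffix.upper(), 0)


# Made-up response body for illustration, not real API data.
sample = b"A" * 35 + b":3\r\n" + b"B" * 35 + b":12\r\n"
print(hits_for("b" * 35, sample))  # 12
print(hits_for("C" * 35, sample))  # 0
```

The comparison happens entirely on the client, so the server never learns which of the returned suffixes, if any, belongs to your password.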
Demo
$ python3 passcheck.py
Enter password:
WARNING: Your password is compromised with 1078184 hits in the compromised passwords database
The full source code for passcheck is available at https://github.com/amiralis/passcheck
Footnotes
* Under the hood, the Have I Been Pwned web portal uses the same k-anonymity protocol and API to check for compromised passwords.
** A malicious service might still be able to deduce some information about the queried passwords. For example, if a password is compromised it’s more likely to be “123” than “j2#$so4”, even though both might have the same SHA-1 hash prefix.