Hashing & Password Storage
Securely hash and store passwords with modern algorithms
TL;DR
Hashing: A one-way function. Hash("password") produces a digest that cannot be reversed to recover "password".
Password Storage: Never store plaintext. When a user signs up, hash the password together with a random, per-user salt. When the user logs in, hash the provided password with the stored salt and compare digests. Use a modern password-hashing algorithm: Argon2 (best), bcrypt (good), or scrypt (good). Their libraries generate the salt automatically and embed it in the stored hash string.
Learning Objectives
- Understand hashing vs encryption
- Hash passwords securely (Argon2, bcrypt)
- Implement salt to prevent rainbow tables
- Design password storage for compliance
- Detect compromised password databases
Motivating Scenario
Problem: Password database exposed. Contains plaintext passwords (or MD5 hashed). Attacker cracks 10,000 passwords in seconds. Hashes reversed via rainbow table or brute force.
Solution: Store Argon2(password, salt). Attackers spend hours cracking each password. Salt prevents rainbow tables (same password, different hash). Breach contained; passwords remain safe.
Core Concepts
Hashing vs Encryption
Hashing (one-way):
- Hash(data) → digest
- Can't reverse (not supposed to)
- Same input = same digest
- Tiny change in input = huge digest change
- Example: SHA-256, bcrypt, Argon2
- Use: passwords, integrity verification
Encryption (two-way):
- Encrypt(data, key) → ciphertext
- Decrypt(ciphertext, key) → data
- Can be reversed with key
- Example: AES-256, RSA
- Use: sensitive data storage, communication
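The hashing properties listed above can be observed with Python's standard hashlib module. This is a minimal sketch that uses SHA-256 only to illustrate determinism and the avalanche effect; it is not a password-storage recipe, and the helper names `sha256_hex` and `differing_hex_chars` are introduced here for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hex string (illustrative helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_hex_chars(a: str, b: str) -> int:
    """Count positions where two equal-length hex digests differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


d1 = sha256_hex("admin")
d2 = sha256_hex("admin")  # same input -> same digest
d3 = sha256_hex("Admin")  # one character changed -> unrelated digest

print(d1 == d2)                     # True: hashing is deterministic
print(differing_hex_chars(d1, d3))  # most of the 64 hex characters differ
```

There is no function that takes `d1` and returns "admin"; an attacker can only guess inputs and compare digests, which is why the speed of the hash matters for passwords.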
Password Hashing Algorithms
MD5/SHA-1: DEPRECATED. Fast to compute = fast to crack.
SHA-256: Not broken as a general-purpose hash, but designed to be fast, so password hashes remain vulnerable to GPU brute force.
bcrypt: Slow by design. Configurable work factor. Salt included. Good choice.
scrypt: Memory-hard (resists GPU attacks). Good choice.
Argon2: Time-hard and memory-hard. Winner of Password Hashing Competition. Best choice.
Input: password="MyPass123", salt=random 16 bytes
bcrypt:
cost=12 (2^12 iterations, configurable)
hash=$2b$12$R9h/cIPz0gi.URNNW3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW
(includes algorithm, cost, salt, digest)
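Argon2:
variant=id (hybrid: combines Argon2i's side-channel resistance with Argon2d's GPU resistance), time=2 (iterations), memory=65536 KiB (64 MiB), parallelism=1
hash=$argon2id$v=19$m=65536,t=2,p=1$encoded_salt$digest
(more tunable than bcrypt: memory, time, and parallelism are all configurable)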
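Because the stored string carries the algorithm, cost, salt, and digest, a verifier needs nothing else from the database. The sketch below splits the bcrypt example above into its parts; `BcryptParts` and `parse_bcrypt_hash` are hypothetical names for illustration, and real applications should let the bcrypt library do this parsing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BcryptParts:
    version: str  # e.g. "2b"
    cost: int     # work factor; bcrypt performs 2**cost key-expansion rounds
    salt: str     # 22 characters in bcrypt's base64 alphabet
    digest: str   # 31 characters in bcrypt's base64 alphabet


def parse_bcrypt_hash(stored: str) -> BcryptParts:
    """Split a bcrypt string of the form $<version>$<cost>$<salt><digest>."""
    empty, version, cost, payload = stored.split("$")
    if empty != "" or len(payload) != 53:
        raise ValueError("not a bcrypt hash string")
    return BcryptParts(version, int(cost), payload[:22], payload[22:])


parts = parse_bcrypt_hash("$2b$12$R9h/cIPz0gi.URNNW3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW")
print(parts.version, parts.cost, 2 ** parts.cost)  # 2b 12 4096
print(parts.salt)
```

Storing the cost inside each hash is what allows the work factor to be raised later without invalidating older records.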
Salt & Rainbow Tables
Without salt:
password: "admin"
hash: "8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918" (SHA-256)
Attacker pre-computes hashes of common passwords:
"admin" → same hash for every user (cracked instantly via a precomputed lookup table, such as a rainbow table)
With salt:
password: "admin", salt: random bytes unique to this user (e.g., "a3f2e1")
hash: bcrypt("admin", salt="a3f2e1") = "$2b$12$a3f2e1...digest..."
Attacker must run bcrypt for every guess against every user separately (slow; precomputation is useless)
Salt prevents rainbow tables (same password, different salt = different hash)
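The contrast above can be reproduced with the standard library. This sketch uses hashlib.scrypt as a stand-in for bcrypt (which is not in the standard library); the scrypt parameters and the helper names `hash_with_salt` and `verify` are assumptions for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import secrets

# Unsalted: one precomputed table maps digest -> password for every user at once.
lookup_table = {
    hashlib.sha256(guess.encode()).hexdigest(): guess
    for guess in ("123456", "password", "admin")
}
leaked_unsalted = hashlib.sha256(b"admin").hexdigest()
print(lookup_table.get(leaked_unsalted))  # admin


def hash_with_salt(password: str, salt: bytes) -> bytes:
    """Salted, memory-hard hash via stdlib scrypt (parameters are illustrative)."""
    return hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_with_salt(password, salt), expected)


# Salted: two users with the same password get unrelated digests.
salt_a, salt_b = secrets.token_bytes(16), secrets.token_bytes(16)
digest_a = hash_with_salt("admin", salt_a)
digest_b = hash_with_salt("admin", salt_b)
print(digest_a != digest_b)               # True
print(verify("admin", salt_a, digest_a))  # True
```

The salt is not secret; it is stored next to the digest. Its job is only to make each user's hash a separate cracking problem.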
Practical Examples
- Bcrypt (Node.js)
- Argon2 (Node.js)
- Algorithm Comparison
- Breach Detection
const bcrypt = require('bcrypt');

// Sign up: hash password
async function registerUser(email, password) {
  // 10 = cost factor; each increment doubles the hashing time
  const salt = await bcrypt.genSalt(10);
  const hashedPassword = await bcrypt.hash(password, salt);

  // Store in database (db stands in for your data layer)
  await db.users.insert({
    email: email,
    password_hash: hashedPassword // "$2b$10$..."
  });
}

// Login: verify password
async function authenticateUser(email, providedPassword) {
  const user = await db.users.findOne({ email });
  if (!user) return false; // unknown email: fail the same way as a wrong password
  return bcrypt.compare(providedPassword, user.password_hash);
}

// Bcrypt benefits:
// - Salt included in hash (no separate storage)
// - Cost factor makes brute force slow (each increment doubles the work)
// - Time-tested (been around since 1999)
const argon2 = require('argon2');

// Sign up: hash password
async function registerUser(email, password) {
  const hash = await argon2.hash(password, {
    type: argon2.argon2id, // Hybrid variant, recommended for password hashing
    memoryCost: 65536,     // 64 MiB (value is in KiB)
    timeCost: 2,           // 2 iterations
    parallelism: 1
  });

  // Store in database (db stands in for your data layer)
  await db.users.insert({
    email: email,
    password_hash: hash // "$argon2id$v=19$m=65536,t=2,p=1$..."
  });
}

// Login: verify password
async function authenticateUser(email, providedPassword) {
  const user = await db.users.findOne({ email });
  if (!user) return false; // unknown email: fail the same way as a wrong password
  return argon2.verify(user.password_hash, providedPassword);
}

// Argon2 benefits:
// - Memory-hard (GPU/ASIC resistant)
// - Time-hard (configurable iterations)
// - Winner of Password Hashing Competition (2015)
// - Stronger than bcrypt against GPU/ASIC attacks because memory cost is tunable
// Speed comparison (approximate, illustrative figures on modern CPU/GPU; password hashes are intentionally slow)
Algorithm        | Hash Time | GPU Advantage | Memory | Status
-----------------+-----------+---------------+--------+-------
MD5              | 0.0001 ms | 1000x+        | Low    | BROKEN
SHA-256          | 0.001 ms  | 100x+         | Low    | Weak
PBKDF2 (100k)    | 100 ms    | 10-50x        | Low    | OK
bcrypt (cost=12) | 250 ms    | 10x           | Low    | Good
scrypt           | 500 ms    | 1-2x          | High   | Good
Argon2id         | 500 ms    | ~1x           | High   | Best
// Recommendation: Use Argon2id with the parameters shown above (m=65536, t=2, p=1), then tune for your hardware
// Fallback: bcrypt (simpler, well-tested)
// Avoid: MD5, SHA-1, and plain SHA-256; use PBKDF2 only where a compliance requirement rules out the options above, and then with a high iteration count
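The figures in the table depend heavily on hardware, so it is worth measuring on the machine that will do the hashing. The sketch below times a fast hash against stdlib scrypt at two cost settings; scrypt stands in for bcrypt and Argon2 because it ships with Python, and the helper name `best_time` and the chosen parameters are assumptions for illustration.

```python
import hashlib
import time


def best_time(fn, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over a few runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


password = b"MyPass123"
salt = b"0123456789abcdef"  # fixed salt only so the runs are comparable

timings = {
    "sha256": best_time(lambda: hashlib.sha256(salt + password).digest()),
    "scrypt n=2**12": best_time(
        lambda: hashlib.scrypt(password, salt=salt, n=2**12, r=8, p=1)),
    "scrypt n=2**14": best_time(
        lambda: hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)),
}

for name, seconds in timings.items():
    print(f"{name:16s} {seconds * 1000:10.4f} ms")
```

A common tuning approach is to raise the cost parameters until a single hash takes as long as the login path can tolerate, then re-measure whenever the hardware changes.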
// After a password database breach, check passwords against known compromises
const crypto = require('crypto');

async function checkCompromisedPassword(password) {
  // Use HaveIBeenPwned API (k-anonymity: only the first 5 hash characters are sent)
  // SHA-1 is used here only because the API is keyed by it, not for storage
  const hash = crypto.createHash('sha1').update(password).digest('hex').toUpperCase();
  const prefix = hash.slice(0, 5);
  const suffix = hash.slice(5);

  // Query API (returns the suffixes of all hashes starting with prefix, one "SUFFIX:COUNT" per line)
  const response = await fetch(`https://api.pwnedpasswords.com/range/${prefix}`);
  if (!response.ok) throw new Error(`Range request failed: ${response.status}`);
  const body = await response.text();

  // Check whether our suffix is in the returned list
  const found = body.split('\n').some((line) => line.split(':')[0].trim() === suffix);
  return found ? 'compromised' : 'safe';
}

// On login, check if password is compromised
// If yes, force password reset
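The matching step can be tested without any network access. The sketch below separates the hash split from the response parsing, assuming the range endpoint returns one `SUFFIX:COUNT` entry per line; the function names and the hand-built sample body are hypothetical.

```python
import hashlib


def split_sha1(password: str) -> tuple[str, str]:
    """Return (5-char prefix sent to the API, 35-char suffix kept locally)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(range_body: str, suffix: str) -> int:
    """Return how often the suffix was seen in breaches, or 0 if it is absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")
# Hand-built body for illustration; a real response contains many more lines.
sample_body = f"0018A45C4D1DEF81644B54AB7F969B88D65:1\r\n{suffix}:42\r\n"
print(prefix)                             # 5BAA6
print(breach_count(sample_body, suffix))  # 42
```

Comparing whole suffixes line by line avoids accidental substring matches and also exposes the breach count, which can drive a softer policy such as warning instead of blocking.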
When to Use / When Not to Use
Use Argon2 when:
- Storing user passwords
- Maximum security required
- Resources available (memory, CPU)
- Can tolerate 500ms hash time
Use bcrypt when:
- Storing user passwords
- Argon2 not available
- Simpler implementation preferred
- Well-tested solution needed
Patterns and Pitfalls
Pitfall: Storing plaintext passwords.
Pitfall: Using MD5 or SHA-256 for passwords.
Pattern: Always use bcrypt or Argon2. Salt automatically included.
Pitfall: Same salt for all passwords (weakens salt purpose).
Pattern: Unique salt per password (libraries generate automatically).
Pitfall: Allowing weak passwords ("123456").
Pattern: Enforce password complexity rules. Prevent compromised passwords (check HaveIBeenPwned).
Design Review Checklist
- Password hashing algorithm: Argon2 or bcrypt
- Salt automatically generated (not user-provided)
- Hash parameters configured securely (cost/memory/time)
- No plaintext passwords stored
- Legacy MD5/SHA-256 password hashes migrated
- Password complexity requirements enforced
- Compromised password detection implemented
- Password reset tokens hashed (not reusable)
- Breach notification plan documented
- Regular audit of password storage practices
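For the checklist item on migrating legacy hashes, one common approach is to upgrade each record at the user's next successful login, since that is the only moment the plaintext is available. The sketch below assumes a legacy store of unsalted MD5 hex digests and uses stdlib scrypt as a stand-in for Argon2 or bcrypt; the record format and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_hash(password: str) -> str:
    """Build a salted record 'scrypt$<salt hex>$<digest hex>' (illustrative format)."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return f"scrypt${salt.hex()}${digest.hex()}"


def verify(password: str, stored: str) -> bool:
    """Verify against the new format, or against a legacy unsalted MD5 digest."""
    if stored.startswith("scrypt$"):
        _, salt_hex, digest_hex = stored.split("$")
        digest = hashlib.scrypt(password.encode(), salt=bytes.fromhex(salt_hex),
                                n=2**14, r=8, p=1, dklen=32)
        return hmac.compare_digest(digest.hex(), digest_hex)
    legacy = hashlib.md5(password.encode()).hexdigest()
    return hmac.compare_digest(legacy, stored)


def login_and_upgrade(users: dict, email: str, password: str) -> bool:
    """On a successful login with a legacy hash, replace it with the new format."""
    stored = users.get(email)
    if stored is None or not verify(password, stored):
        return False
    if not stored.startswith("scrypt$"):
        users[email] = new_hash(password)
    return True


users = {"a@example.com": hashlib.md5(b"MyPass123").hexdigest()}  # legacy record
print(login_and_upgrade(users, "a@example.com", "MyPass123"))  # True
print(users["a@example.com"].startswith("scrypt$"))            # True
```

Accounts that never log in again keep their weak hash, so a migration plan usually also sets a deadline after which remaining legacy records are invalidated and those users must reset their password.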
Self-Check
- Why can't you reverse a hash?
- What does salt do?
- Why is bcrypt slower than SHA-256?
With Argon2, a unique salt per password, and properly tuned cost parameters, cracking a password takes years rather than minutes.
Password Security Best Practices
Password Complexity and Validation
import hashlib
import re

import requests  # third-party: pip install requests


class PasswordValidator:
    def validate(self, password: str) -> tuple[bool, str]:
        """Validate password meets security requirements"""
        if len(password) < 12:
            return False, "Password must be at least 12 characters"
        if not re.search(r'[A-Z]', password):
            return False, "Must contain uppercase letter"
        if not re.search(r'[a-z]', password):
            return False, "Must contain lowercase letter"
        if not re.search(r'[0-9]', password):
            return False, "Must contain digit"
        if not re.search(r'[!@#$%^&*(),.?":{}|<>]', password):
            return False, "Must contain special character"

        # Check against known weak passwords
        if self.is_compromised(password):
            return False, "This password has been exposed in data breaches"

        # Check for common patterns
        if self.has_common_patterns(password):
            return False, "Password is too predictable"
        return True, "Password is strong"

    def is_compromised(self, password: str) -> bool:
        """Check if password is in known breach database"""
        # Use Have I Been Pwned API with k-anonymity (only the 5-character prefix is sent)
        sha1_hash = hashlib.sha1(password.encode()).hexdigest().upper()
        prefix = sha1_hash[:5]
        suffix = sha1_hash[5:]
        response = requests.get(f'https://api.pwnedpasswords.com/range/{prefix}', timeout=5)
        response.raise_for_status()
        return suffix in response.text

    def has_common_patterns(self, password: str) -> bool:
        """Detect common patterns like keyboard walks or repeats"""
        patterns = [
            r'(.)\1{2,}',                  # Repeating characters (aaa, 111)
            r'1234|qwert|asdf',            # Keyboard patterns
            r'^(password|admin|letmein)',  # Common words
        ]
        for pattern in patterns:
            if re.search(pattern, password, re.IGNORECASE):
                return True
        return False


# Usage
validator = PasswordValidator()
is_valid, message = validator.validate("Tr0ub4dor&3")
if not is_valid:
    print(message)  # 11 characters, so the length check fails first
Multi-Factor Authentication
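# Password alone is insufficient; require a second factor (MFA)
class AuthenticationService:
    def login(self, username: str, password: str) -> bool:
        """Two-factor authentication flow"""
        # Step 1: Verify password
        user = self.db.get_user(username)
        if not user:
            return False
        if not self.verify_password(password, user.password_hash):
            self._log_failed_attempt(username)
            return False

        # Step 2: MFA check; the submitted code must be verified, not just collected
        mfa_method = user.mfa_method  # 'totp' or 'sms'
        if mfa_method == 'totp':
            # Time-based one-time password
            code = prompt_user_for_totp_code()
            mfa_ok = self._verify_totp_code(user, code)  # hypothetical helper
        elif mfa_method == 'sms':
            # SMS code
            self._send_sms_code(user.phone)
            code = prompt_user_for_sms_code()
            mfa_ok = self._verify_sms_code(user, code)  # hypothetical helper
        else:
            mfa_ok = False  # no recognized second factor: deny
        if not mfa_ok:
            self._log_failed_attempt(username)
            return False

        # Step 3: Issue session token only after both factors pass
        self.create_session(user)
        return True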
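The TOTP branch above relies on the server being able to compute the expected code. The sketch below shows how that check might look with only the standard library, following RFC 4226 (HOTP) and RFC 6238 (TOTP) with HMAC-SHA-1 and a 30-second step; the function names and the one-step drift window are assumptions for illustration, and a maintained library is preferable in production.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA-1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: float, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time-step counter."""
    return hotp(secret, int(at // step), digits)


def verify_totp(secret: bytes, code: str, at: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept the code for the current step or up to `window` steps of clock drift."""
    now = time.time() if at is None else at
    for drift in range(-window, window + 1):
        moment = now + drift * step
        if moment < 0:
            continue  # no time step exists before the Unix epoch
        if hmac.compare_digest(totp(secret, moment, step), code):
            return True
    return False


secret = b"12345678901234567890"  # shared secret from the RFC test vectors
print(totp(secret, at=59, digits=8))          # 94287082 (RFC 6238 test vector)
print(verify_totp(secret, "287082", at=59))   # True
```

A real deployment would also rate-limit attempts and reject a code that has already been accepted within its time step.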
Password Reset Security
# Secure password reset (avoid token reuse attacks)
import hashlib
import secrets
from datetime import datetime, timedelta, timezone


def hash_reset_token(token: str) -> str:
    """Deterministic digest so the token can be looked up by its hash.
    A salted password hash would never match on lookup; a fast hash is
    acceptable here because the token is long and random, not user-chosen."""
    return hashlib.sha256(token.encode()).hexdigest()


class PasswordResetService:
    def initiate_reset(self, email: str) -> str:
        """Generate secure reset token"""
        generic_reply = "If email exists, reset link sent"
        user = self.db.get_user_by_email(email)
        if not user:
            # Security: Don't reveal if email exists
            return generic_reply

        # Generate single-use, short-lived token
        reset_token = secrets.token_urlsafe(32)
        self.db.store_reset_token({
            'user_id': user.id,
            'token_hash': hash_reset_token(reset_token),
            'expires_at': datetime.now(timezone.utc) + timedelta(hours=1),
            'used': False
        })

        # Send reset link; the token goes in the URL fragment (not a query param),
        # which browsers do not send in requests or Referer headers
        reset_link = f"https://app.com/reset#token={reset_token}"
        self.email.send_reset_link(email, reset_link)
        return generic_reply

    def reset_password(self, reset_token: str, new_password: str):
        """Apply password reset (one-time use)"""
        reset = self.db.get_reset_token(hash_reset_token(reset_token))

        # Validations (exception classes are application-defined)
        if not reset:
            raise InvalidTokenError()
        if reset['expires_at'] < datetime.now(timezone.utc):
            raise TokenExpiredError()
        if reset['used']:
            raise TokenAlreadyUsedError()  # Detects replay attacks

        # Reset password (hash_password_secure: application helper wrapping Argon2 or bcrypt)
        user = self.db.get_user(reset['user_id'])
        user.password_hash = hash_password_secure(new_password)
        user.password_changed_at = datetime.now(timezone.utc)
        self.db.save_user(user)

        # Mark token as used (prevent reuse)
        self.db.mark_token_used(reset['id'])

        # Invalidate all existing sessions (force re-login)
        self.db.invalidate_user_sessions(user.id)
Password Breach Response
# What to do if user database is compromised
class BreachResponseHandler:
    def handle_breach(self, breach_details: dict):
        """Respond to password database breach"""
        # 1. Determine what was exposed
        exposed_data = breach_details['fields']

        # 2. Notify users immediately
        if 'passwords' in exposed_data:
            # Even though hashed, assume compromise
            message = """
            Our security team discovered unauthorized access to your account.
            Please reset your password immediately. We recommend:
            1. Change password on THIS site
            2. Check other sites using same password
            3. Monitor credit for fraud
            """
            self._notify_all_users(message)

        # 3. Force password reset for severe breaches
        if 'plaintext_passwords' in exposed_data:
            # Force reset for all users
            self._require_password_reset_for_all()

        # 4. Mandatory 2FA for affected users
        if 'payment_methods' in exposed_data:
            self._require_mfa_for_all()

        # 5. Regulatory notification
        if self._qualifies_for_regulatory_notice(exposed_data):
            self._notify_regulators(breach_details)

        # 6. Public transparency report
        self._publish_transparency_report(breach_details)
Next Steps
- Read Authentication for complete password handling
- Study Secrets Management for API key hashing
- Explore Compliance for password storage requirements
- Implement Passwordless Authentication (WebAuthn, FIDO2)
- Design Zero Trust Architecture for additional security
- Learn Incident Response for breach scenarios
References
- OWASP Password Storage Cheat Sheet (cheatsheetseries.owasp.org)
- NIST SP 800-63: Digital Identity Guidelines (nist.gov)
- Argon2 specification (RFC 9106) and reference implementation (github.com/P-H-C/phc-winner-argon2)
- HaveIBeenPwned API (haveibeenpwned.com)
- "The Security Developer's Handbook" by Norton, Steinberg
- Google's "Attack-Resistant Password Storage" (security.googleblog.com)