SECURITY WRITEUP

Deep Dive into Shannon Entropy: Evaluating Password Strength Mathematically

Posted on June 4, 2026 · Krish Attri · 6 min read

When assessing user credentials during a security audit, traditional password checkers rely on arbitrary regex rules (e.g., "must contain 1 symbol, 1 uppercase letter"). While this enforces character diversity, it fails to measure actual information density or resistance to brute-force attacks. This is where Shannon Entropy becomes an invaluable metric.

What is Shannon Entropy?

Originating from Claude Shannon's 1948 paper "A Mathematical Theory of Communication", information entropy measures the uncertainty or unpredictability in a message. In the context of password analysis, it is the base-2 logarithm of the search space size: the number of bits needed to index every candidate password of a given length, assuming each character is selected independently and with equal probability from the defined pool.

H = L * log₂(R)

Where:

H = Entropy of the password (in bits).
L = Length of the password (number of characters).
R = Size of the character pool (charset range).
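
As a quick sanity check on the formula, the sketch below computes the general Shannon entropy of a distribution and confirms that, for a uniform choice over R symbols, it reduces to log₂(R) per character. The function names are illustrative and are not part of the implementation shown later in this post.

```python
import math
from typing import Sequence

def shannon_entropy_per_symbol(probs: Sequence[float]) -> float:
    """General Shannon entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def password_entropy(length: int, pool: int) -> float:
    """H = L * log2(R), the uniform special case used in this post."""
    return length * math.log2(pool)

R = 26
uniform = [1 / R] * R

# A uniform distribution over R symbols carries log2(R) bits per symbol.
assert math.isclose(shannon_entropy_per_symbol(uniform), math.log2(R))

# L independent symbols give R**L candidates, i.e. L * log2(R) bits.
assert math.isclose(math.log2(R ** 8), password_entropy(8, R))

print(round(password_entropy(8, R), 2))  # 37.6
```

Any non-uniform distribution scores lower than log₂(R) per symbol, which is why the formula is an upper bound for human-chosen passwords.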

Defining the Character Pool (R)

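To calculate entropy, we inspect the input string and identify which character sets it uses. R is the sum of the sizes of the sets that appear; the running totals below assume every preceding set is also present:

Lowercase letters [a-z]: adds 26
Uppercase letters [A-Z]: adds 26 (running total 52)
Numeric digits [0-9]: adds 10 (running total 62)
Special symbols (the 32 printable ASCII punctuation characters, e.g., !@#$%^&*()): adds 32 (running total 94)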
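
The pool sizes above can be checked against Python's standard `string` constants. This is a minimal sketch that assumes ASCII-only input; `CHARSETS` and `active_pool` are hypothetical names used for illustration, and the post's own implementation relies on `str.islower()` and related methods instead, which also match non-ASCII characters.

```python
import string
from typing import List, Tuple

# (name, characters) pairs; string.punctuation holds the 32 printable ASCII symbols.
CHARSETS = [
    ("lowercase", string.ascii_lowercase),
    ("uppercase", string.ascii_uppercase),
    ("digits", string.digits),
    ("symbols", string.punctuation),
]

def active_pool(password: str) -> Tuple[int, List[str]]:
    """Return (R, names of the character sets that appear in the password)."""
    active = [name for name, chars in CHARSETS
              if any(c in chars for c in password)]
    size = sum(len(chars) for name, chars in CHARSETS if name in active)
    return size, active

print(active_pool("hello"))   # (26, ['lowercase'])
print(active_pool("Abc1!"))   # (94, ['lowercase', 'uppercase', 'digits', 'symbols'])
```

Note that a single digit raises R by 10 even though the attacker's real uncertainty barely changes, which is one reason the pool-based estimate is optimistic.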

Python Implementation

Below is a Python implementation that evaluates password entropy from the detected character sets and estimates the average cracking time against an offline brute-force rig running at 10 billion hashes/second. That rate is an assumed figure: real throughput varies by orders of magnitude depending on the hash algorithm and hardware.

import math

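def calculate_entropy(password: str) -> float:
    """Estimate entropy in bits as len(password) * log2(pool size)."""
    if not password:
        return 0.0
    
    # Each character class present in the password adds its size to the pool.
    pool_size = 0
    if any(c.islower() for c in password):
        pool_size += 26
    if any(c.isupper() for c in password):
        pool_size += 26
    if any(c.isdigit() for c in password):
        pool_size += 10
    # Anything non-alphanumeric (including spaces) is counted as one of the
    # 32 ASCII symbols.
    if any(not c.isalnum() for c in password):
        pool_size += 32
        
    # Reached when every character is a caseless letter (e.g. CJK text), so
    # none of the four classes matched.
    if pool_size == 0:
        return 0.0
        
    entropy = len(password) * math.log2(pool_size)
    return round(entropy, 2)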

def estimate_crack_time(entropy: float) -> str:
    # Guess rate: 10 billion (10^10) guesses/sec
    guess_rate = 1e10
    # On average the attacker searches half the space: 2^(entropy - 1) guesses.
    # Dividing in the log2 domain avoids the OverflowError that 2 ** entropy
    # raises once entropy exceeds roughly 1024 bits.
    log2_seconds = (entropy - 1) - math.log2(guess_rate)
    if log2_seconds >= 1000:
        return "More than 1e293 years"
    seconds = 2 ** log2_seconds
    
    if seconds < 1:
        return "Instantaneous"
    elif seconds < 60:
        return f"{round(seconds, 2)} seconds"
    elif seconds < 3600:
        return f"{round(seconds / 60, 2)} minutes"
    elif seconds < 86400:
        return f"{round(seconds / 3600, 2)} hours"
    elif seconds < 31536000:
        return f"{round(seconds / 86400, 2)} days"
    else:
        return f"{round(seconds / 31536000, 2)} years"

# Example run
pwd = "Hardened_Pass!26"
bits = calculate_entropy(pwd)
print(f"Password: {pwd} -> Entropy: {bits} bits | Cracking Time: {estimate_crack_time(bits)}")
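
To see how the estimate scales with length, the following sketch works entirely in logarithms so the numbers stay readable. It assumes the same 10 billion guesses/second rate and the full 94-character pool; `log10_average_seconds` is a hypothetical helper, not part of the code above.

```python
import math

def log10_average_seconds(length: int, pool: int, guess_rate: float = 1e10) -> float:
    """log10 of the average cracking time in seconds for a uniformly random password."""
    bits = length * math.log2(pool)
    # Half the search space on average: 2^(bits - 1) guesses.
    return (bits - 1) * math.log10(2) - math.log10(guess_rate)

for length in (8, 12, 16):
    exponent = log10_average_seconds(length, 94)
    print(f"length {length}: about 10^{exponent:.1f} seconds")
```

Each extra character from the 94-symbol pool adds about 6.55 bits, which multiplies the cracking time by 94.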

Interpreting Entropy Scores

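We classify credentials by their estimated entropy. The time figures are average brute-force times at the 10 billion guesses/second rate assumed in the implementation:

< 28 bits: Very Weak. Exhausted in a fraction of a second.
28 to < 36 bits: Weak. Crackable within seconds.
36 to < 60 bits: Medium. From a few seconds at 36 bits to under two years just below 60 bits; reasonable only for low-value systems.
60 to < 128 bits: Strong. About two years on a single rig at 60 bits, and out of practical reach for offline GPU cracking well before the top of the band.
128 bits and above: Overkill / Cryptographically Secure.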
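
The bands can be turned into a small lookup. This sketch assumes the integer boundaries are meant as half-open intervals, so that fractional scores such as 35.5 bits fall into exactly one band; `classify` is an illustrative name.

```python
from bisect import bisect_right

# Lower bounds of every band except the first (assumed half-open intervals).
THRESHOLDS = [28, 36, 60, 128]
LABELS = ["Very Weak", "Weak", "Medium", "Strong", "Overkill"]

def classify(bits: float) -> str:
    """Map an entropy score in bits to its strength label."""
    return LABELS[bisect_right(THRESHOLDS, bits)]

for score in (20.0, 35.5, 59.99, 104.87, 130.0):
    print(score, "->", classify(score))
```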

Beyond Pure Math: Dictionary Attacks

The formula above assumes characters are chosen uniformly at random. In practice, users pick common phrases or dictionary words, so the pool-based score is only an upper bound. A production password analyzer must therefore combine the entropy estimate with a Trie-based dictionary matcher to catch predictable patterns (e.g., "P@ssword123", which scores about 72 bits by the formula but falls quickly to a dictionary attack with common character substitutions). This is the exact approach implemented in our flagship Password Strength Analyzer.
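
To illustrate the idea, here is a minimal sketch of a trie-based matcher with a tiny substitution table. It is not the Password Strength Analyzer's actual code: the `Trie` class, `LEET` table, `find_dictionary_word` function, and three-word dictionary are all assumptions made for this example, and a real tool would load a large wordlist and a much richer rule set.

```python
from typing import Dict, Iterable, Optional

# Hypothetical substitution table mapping common "leet" characters to letters.
LEET = str.maketrans({"@": "a", "0": "o", "1": "l", "3": "e", "$": "s", "5": "s"})

class Trie:
    def __init__(self, words: Iterable[str] = ()) -> None:
        self.root: Dict = {}
        for word in words:
            self.insert(word)

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word.lower():
            node = node.setdefault(ch, {})
        # Multi-character key, so it cannot collide with a single character.
        node["$end"] = True

    def longest_match(self, text: str, start: int) -> int:
        """Length of the longest dictionary word beginning at text[start], or 0."""
        node = self.root
        best = 0
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if "$end" in node:
                best = i - start + 1
        return best

def find_dictionary_word(password: str, trie: Trie) -> Optional[str]:
    """Return the longest dictionary word hidden in the password, if any."""
    normalized = password.lower().translate(LEET)
    best = ""
    for start in range(len(normalized)):
        n = trie.longest_match(normalized, start)
        if n > len(best):
            best = normalized[start:start + n]
    return best or None

trie = Trie(["password", "pass", "dragon"])
print(find_dictionary_word("P@ssword123", trie))  # password
print(find_dictionary_word("xK9#qT", trie))       # None
```

A password that contains a match could then be scored on the characters left over after removing the word, rather than on its full length.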