A friend asked me whether it was possible to recover a very important Excel file whose password he had forgotten. I remembered that hashcat could crack these files a long time ago, so I looked into it and recorded the usage here.
This is a brief description of how to use the powerful password cracking tool John the Ripper (JtR) to recover a forgotten password for an XLSX file, covering several of John's core cracking modes.
Prerequisites#
- John the Ripper: Make sure you have downloaded and extracted John the Ripper. It is highly recommended to use the community-enhanced version "Jumbo John", as it supports more hash types and GPU acceleration. https://github.com/openwall/john
- Target File: You need the .xlsx file whose password you want to recover.
Step 1: Extract Hash from XLSX File#
John the Ripper cannot directly handle .xlsx files; it requires a specially formatted "hash" string. We use the office2john.py script to extract it.
- Open your terminal (PowerShell or CMD on Windows).
- Use the cd command to navigate to the run directory of John the Ripper.
- Run the following command:
# Replace "C:\path\to\your\file.xlsx" with the full path to your Excel file
python .\office2john.py "C:\path\to\your\file.xlsx" > hash.txt
This command will generate a file named hash.txt, which contains the encrypted information needed by John the Ripper.
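As an alternative to shell redirection, the extraction step can be sketched in Python. This is a minimal sketch, not part of John itself: the function name extract_hash and its parameters are hypothetical, and it assumes office2john.py prints the hash line to standard output, as the command above implies. Writing the file from Python lets you choose UTF-8 explicitly.

```python
import subprocess
import sys


def extract_hash(office2john, document, out_file="hash.txt"):
    """Run office2john.py on `document` and save its output as UTF-8.

    `office2john` is the path to the script in John's run directory.
    (Hypothetical helper; the names are illustrative only.)
    """
    result = subprocess.run(
        [sys.executable, str(office2john), str(document)],
        capture_output=True,  # collect stdout instead of redirecting with ">"
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError if the script fails
    )
    # Write plain UTF-8 (no BOM) with "\n" line endings.
    with open(out_file, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(result.stdout)
    return result.stdout.strip()
```

A call such as extract_hash("office2john.py", r"C:\path\to\your\file.xlsx") would then produce hash.txt in the current directory.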
Step 2: Choose a Cracking Mode and Execute#
John the Ripper has multiple cracking modes, and choosing the right one for different scenarios is key to success.
Core Cracking Modes of John#
1. Dictionary Mode (Wordlist Mode)#
This is the most commonly used mode. You provide a dictionary file (wordlist) containing common passwords, and John will try them one by one. You can also use rules to mutate the dictionary words (e.g., pass -> P@ss123), greatly increasing the success rate.
# --wordlist= followed by the path to your dictionary file
john --wordlist=password.lst hash.txt
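To give a feel for what "mutating" a dictionary word means, here is a small Python illustration. It is only a sketch under my own assumptions: the mutate helper, the substitution table, and the suffix list are made up for this example and are not John's rule syntax, which is far richer.

```python
# Hypothetical substitution table, chosen only for this illustration.
LEET = str.maketrans({"a": "@", "o": "0", "e": "3"})


def mutate(word, suffixes=("", "1", "123")):
    """Yield simple variants of one dictionary word.

    Variants: the word itself, a capitalized form, a form with a few
    letter-to-symbol substitutions, and each of those with a suffix.
    """
    substituted = word.translate(LEET)
    bases = {word, word.capitalize(), substituted, substituted.capitalize()}
    for base in sorted(bases):
        for suffix in suffixes:
            yield base + suffix


candidates = list(mutate("pass"))
print(len(candidates))          # 12 candidates from a single word
print("P@ss123" in candidates)  # True
```

One word becomes twelve candidates here, which is why a modest wordlist combined with rules can cover many realistic passwords.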
2. Incremental Mode#
Pure brute-force cracking. It will try all possible combinations of characters; theoretically, it can crack any password given enough time. However, it will be very slow for longer passwords.
# Try all combinations of lowercase letters up to 8 characters
john --incremental=Lower --max-len=8 hash.txt
3. Mask Mode - The Mode Used This Time#
When you have some understanding of the password structure, this is the most efficient mode. You can define the format of the password, significantly narrowing the search range.
- ?d: Represents one digit (0-9)
- ?l: Represents one lowercase letter (a-z)
- ?u: Represents one uppercase letter (A-Z)
- ?s: Represents one special symbol (e.g., !@#$)
Example: Cracking a 6-digit numeric password.
You can refer to: https://in.security/2022/06/20/hashcat-pssw0rd-cracking-brute-force-mask-hybrid/
https://github.com/openwall/john/blob/bleeding-jumbo/doc/RULES
(It's very complex; it's recommended to ask AI directly)
john --mask=?d?d?d?d?d?d hash.txt
In simple terms:
The ? symbol itself is not a character to be tried, but a "special instruction" or "prefix" that tells John: "Please note, the letter following me is not a regular letter, but a placeholder representing a specific character set."
Think of it as a fill-in-the-blank question:
__ __ __ __ __ __
The mask ?d?d?d?d?d?d is equivalent to saying:
- In the first blank __, fill in a digit (?d)
- In the second blank __, fill in a digit (?d)
- ... and so on, filling all six blanks.
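The fill-in-the-blank idea can be sketched in a few lines of Python. This is an illustration only, not how John is implemented: expand_mask is a hypothetical helper, it handles only the four placeholders listed earlier, ?s is approximated by Python's string.punctuation, and the order in which John actually tries candidates may differ.

```python
from itertools import islice, product
import string

# Character sets for the four basic placeholders (ASCII only).
# Assumption: ?s is approximated by string.punctuation.
PLACEHOLDERS = {
    "d": string.digits,
    "l": string.ascii_lowercase,
    "u": string.ascii_uppercase,
    "s": string.punctuation,
}


def expand_mask(mask):
    """Yield every candidate for a mask made only of ?d/?l/?u/?s pairs."""
    if len(mask) % 2 or any(char != "?" for char in mask[::2]):
        raise ValueError("this sketch only handles masks like ?d?d?d")
    # One character set per blank, taken from the letter after each "?".
    blanks = [PLACEHOLDERS[letter] for letter in mask[1::2]]
    for combo in product(*blanks):
        yield "".join(combo)


print(list(islice(expand_mask("?d?d?d?d?d?d"), 3)))  # ['000000', '000001', '000002']
print(sum(1 for _ in expand_mask("?d?d?d?d?d?d")))   # 1000000
```

Six digit blanks give exactly one million candidates, which is why a 6-digit numeric password is a small search.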
Detailed Breakdown#
Let's take a closer look at ? and the combination of letters that follow it.
1. Built-in Standard Placeholders#
John the Ripper has predefined some letters, which represent specific character sets when they follow ? . The most commonly used ones are:
Placeholder | Character Set Represented | Explanation | Example Characters |
---|---|---|---|
?d | Digits | Numbers | 0, 1, 2, ... 9 |
?l | Lower | Lowercase Letters | a, b, c, ... z |
?u | Upper | Uppercase Letters | A, B, C, ... Z |
?s | Special | Special Symbols (ASCII) | !, @, #, $ ... |
?a | All | All printable characters (?l+?u+?d+?s) | a, A, 1, ! ... |
?h | Hex, lower | Lowercase hexadecimal characters | 0-9, a-f |
?H | Hex, upper | Uppercase hexadecimal characters | 0-9, A-F |
?b | All 8-bit | All 8-bit byte values (0x01-0xff; the NUL byte is excluded) | (All characters) |
2. How to Combine Them?#
You can freely combine these placeholders to construct what you think might be the password structure.
Example 1: A password that starts with an uppercase letter followed by 7 lowercase letters (e.g., Password)
--mask=?u?l?l?l?l?l?l?l
Example 2: A 4-digit ATM password followed by two uppercase letters (e.g., 1234AB)
--mask=?d?d?d?d?u?u
3. What if the password contains a regular letter?#
Any character without a ? prefix is treated as a regular (or "literal") character. John the Ripper will consider the character in that position to be fixed.
Example 3: You know the password starts with pass- followed by 4 digits (e.g., pass-1234)
--mask=pass-?d?d?d?d
In this example, p, a, s, s, - are all fixed, and only the last four ?d positions will be brute-forced by John. This greatly reduces the search space!
4. More Advanced Usage: Custom Character Sets#
You can even define your own placeholders ?1, ?2, ?3 etc.
Example 4: You know the password is 8 characters long and contains only the characters a, b, c, 1, 2, 3.
You can define a custom character set ?1 and then repeat it 8 times.
john --mask='?1?1?1?1?1?1?1?1' -1='abc123' hash.txt
- -1='abc123': This part defines that ?1 represents the character set 'abc123'.
- --mask='?1?1?1?1?1?1?1?1': This part tells John the Ripper that the password consists of 8 characters from the ?1 character set.
The ? symbol has no meaning on its own, but it gives special meaning to the character that follows it, allowing you to shift from "blindly brute-forcing all possibilities" to "strategically brute-forcing specific formats," which can reduce cracking time from years to seconds.
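To see how much each mask narrows the search, the number of candidates can be counted directly. The following Python sketch is my own illustration: keyspace is a hypothetical helper, it knows only ?d, ?l, ?u plus any custom sets passed in, and it treats everything else as a literal.

```python
import string

# Built-in sets handled by this sketch (ASCII only).
BUILTIN = {
    "d": string.digits,
    "l": string.ascii_lowercase,
    "u": string.ascii_uppercase,
}


def keyspace(mask, custom=None):
    """Count the candidates a mask describes.

    `custom` maps a placeholder digit such as "1" to its characters.
    Anything that is not a known placeholder counts as a literal.
    """
    sets = dict(BUILTIN)
    sets.update(custom or {})
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask) and mask[i + 1] in sets:
            total *= len(set(sets[mask[i + 1]]))  # one blank, N choices
            i += 2
        else:
            i += 1  # a literal is fixed, so it multiplies the total by 1
    return total


print(keyspace("?u?l?l?l?l?l?l?l"))         # 208827064576
print(keyspace("?d?d?d?d?u?u"))             # 6760000
print(keyspace("pass-?d?d?d?d"))            # 10000
print(keyspace("?1" * 8, {"1": "abc123"}))  # 1679616
```

The four results correspond to Examples 1 to 4: knowing the pass- prefix shrinks the search to ten thousand candidates, while an unconstrained capitalized 8-letter word needs over two hundred billion.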
4. Single Crack Mode#
This is the mode that John tries first by default, and it is extremely fast. It uses information such as usernames from the hash file to perform simple transformations and guesses.
# With no mode option, John starts with single crack mode and then moves on to its other default modes; add --single to run only this mode
john hash.txt
Practical Exercise: Cracking a 6-Digit Password for an XLSX File#
In our practical exercise, we know the password is a 6-digit number, so we choose mask mode.
The ideal command is:
.\john --mask=?d?d?d?d?d?d hash.txt
Common Issues and Solutions#
Error: "Error: UTF-16 BOM seen in input file."#
- Problem: John cannot read the file encoding of hash.txt.
- Reason: When a file is created with the > redirection operator in Windows PowerShell, the default encoding is UTF-16, while John requires UTF-8 or ASCII.
- Solution:
  - Open the hash.txt file with Notepad.
  - Select "File" -> "Save As".
  - In the pop-up window, change the "Encoding" from UTF-16 LE to UTF-8.
  - Save and overwrite the original file.
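If you would rather not go through Notepad, the same conversion can be sketched in Python. This is a minimal sketch: convert_to_utf8 is a hypothetical helper, and it assumes the file starts with a UTF-16 byte order mark, which is what the error message reports.

```python
import codecs
from pathlib import Path


def convert_to_utf8(path):
    """Rewrite a BOM-marked UTF-16 file as UTF-8 without a BOM.

    Returns True if the file was converted and False if no UTF-16 BOM
    was found (in which case the file is left untouched).
    """
    target = Path(path)
    raw = target.read_bytes()
    if not raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return False
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    text = raw.decode("utf-16")
    # Normalise Windows line endings while we are rewriting the file.
    target.write_bytes(text.replace("\r\n", "\n").encode("utf-8"))
    return True
```

Calling convert_to_utf8("hash.txt") in John's run directory overwrites the file in place, so keep a copy if you are unsure.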
Step 3: View the Cracking Results#
When the command executes successfully, you will see output similar to the following:
Warning: detected hash type "Office", but the string is also recognized as "office-opencl"
Use the "--format=office-opencl" option to force loading these as that type instead
Using default input encoding: UTF-8
Loaded 1 password hash (Office, 2007/2010/2013 [SHA1 256/256 AVX2 8x / SHA512 256/256 AVX2 4x AES])
Cost 1 (MS Office version) is 2007 for all loaded hashes
Cost 2 (iteration count) is 50000 for all loaded hashes
Will run 32 OpenMP threads
Press 'q' or Ctrl-C to abort, almost any other key for status
933728 (WeChat Registration 1 (1).xlsx)
1g 0:00:00:29 DONE (2025-09-05 19:45) 0.03344g/s 20067p/s 20067c/s 20067C/s 616778..351115
Result Interpretation:
- Cracked Password: 933728
- Time Taken: 0:00:00:29, which is 29 seconds.
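The speed in the status line also gives a rough upper bound on how long the whole mask would have taken. The Python sketch below is simple arithmetic under one assumption: the 20067 candidates per second from the output above stays constant for the whole run. The helper name is hypothetical.

```python
def worst_case_seconds(keyspace, candidates_per_second):
    """Time needed to try every candidate at a constant rate."""
    return keyspace / candidates_per_second


RATE = 20067          # p/s value taken from the status line above
SIX_DIGITS = 10 ** 6  # number of candidates for the mask ?d?d?d?d?d?d

print(round(worst_case_seconds(SIX_DIGITS, RATE), 1))  # 49.8
```

So on this machine the entire 6-digit space is exhausted in under a minute, and the 29 seconds observed falls within that bound.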
If you want to view the cracked password again later, you can run:
.\john --show hash.txt
That's it.
PS:
There are many third-party tools available. I found one called Passper for Excel, which also seems to call John under the hood and provides a GUI. If you prefer a GUI, you can use it directly; search for Passper for Excel.