Cryptopals Challenge 1.4: Detect Single-Character XOR
This challenge is basically Challenge 3 but scaled up. Instead of one ciphertext, we get a file full of hex strings and one of them has been encrypted with single-byte XOR. We need to find which one.
Challenge
This one builds directly on Challenge 3. If you haven’t read that one yet, I recommend doing so first, because we reuse the same scoring approach here.
Description
Here is what the challenge asks us to do:
One of the 60-character strings in the file “4.txt” has been encrypted by single-character XOR. Find it.
(Your code from #3 should help.)
So we have a text file full of hex-encoded strings, and exactly one of them has been XOR’d against a single character. We need to figure out which line it is and what the key is.
Solution
Thought Process
The first thing that came to mind was to copy most of the code from Challenge 3 and add a file-reading part that processes each line. The plan was simple:
- Read the file line by line
- Feed each line into HexDecoder to get raw bytes
- Run the same brute-force key finder from Challenge 3 on each line
- Keep track of the best result across all lines, not just the current one
The key difference from Challenge 3 is that we now need two levels of “best”: the best key for the current line and the best result globally across all lines.
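To make the decoding step of the plan concrete, here is a minimal standard-library sketch of what "hex to raw bytes" means: every two hex digits become one byte, so a 60-character line turns into 30 bytes. The names hex_digit_value and hex_to_bytes are hypothetical and only for illustration; the solution itself uses Crypto++'s HexDecoder, and throwing on bad input is a choice of this sketch, not a statement about how HexDecoder behaves.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: value (0..15) of a single hex digit.
int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical stand-in for the decoding step: two hex digits -> one byte.
std::string hex_to_bytes(const std::string &hex) {
    if (hex.size() % 2 != 0) {
        throw std::invalid_argument("odd-length hex string");
    }
    std::string bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        int hi = hex_digit_value(hex[i]);      // upper 4 bits
        int lo = hex_digit_value(hex[i + 1]);  // lower 4 bits
        bytes.push_back(static_cast<char>((hi << 4) | lo));
    }
    return bytes;
}
```

For example, hex_to_bytes("4e6f77") gives the three bytes of "Now".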
Setting Things Up
Same libraries as before, plus we need <fstream> for file reading:
#include <climits>
#include <cryptopp/hex.h>
#include <cryptopp/filters.h>
#include <cryptopp/base64.h>
#include <cstring>
#include <iostream>
#include <fstream>

using namespace CryptoPP;
I also decided to use a struct this time to keep the results organized. In Challenge 3 I had separate variables for the best key, best score, best plaintext, etc. and it was getting messy. A struct bundles everything together nicely and I even added a print function to it:
struct best_result {
    std::string best_cyphertext;
    std::string best_plaintext;
    int best_score;
    char best_key;

    void printInternalData() {
        std::cout << "Ciphertext (hex): " << best_cyphertext << "\n";
        std::cout << "Plaintext: " << best_plaintext << "\n";
        std::cout << "Score: " << best_score << "\n";
        std::cout << "Key: " << (int)best_key << " ('" << best_key << "')\n";
    }
};
Reading the File
First we open the file and make sure it actually exists:
std::ifstream file("4.txt");
if (!file) {
    std::cout << "Something wrong with the file";
    return -1;
}
Then we set up our global tracking variables and read line by line:
int global_best_score = 0;
char global_best_key = 0;
std::string global_best_plaintext;
std::string global_best_cyphertext;
std::string line;
int line_number = 0;

while (std::getline(file, line)) {
    line_number++;

    // Skip empty lines
    if (line.empty()) continue;

    // Decode hex to raw bytes
    std::string cyphertext;
    StringSource ss1(line, true, new HexDecoder(new StringSink(cyphertext)));
The Two-Level Brute Force
This is where it differs from Challenge 3. For each line, we try all 256 keys and find the best key for that line. Then we compare that line’s best score against the global best:
    int best_score_cur_line = 0;
    char best_key_cur_line = 0;
    std::string best_plaintext_cur_line;

    for (int key = 0; key < 256; key++) {
        std::string temp_plaintext;
        for (size_t i = 0; i < cyphertext.size(); i++) {
            temp_plaintext += cyphertext[i] ^ key;
        }
        int current_score = score_calc(temp_plaintext);

        // Best for current line?
        if (current_score > best_score_cur_line) {
            best_score_cur_line = current_score;
            best_plaintext_cur_line = temp_plaintext;
            best_key_cur_line = key;
        }
    }

    // Better than global best?
    if (best_score_cur_line > global_best_score) {
        global_best_score = best_score_cur_line;
        global_best_key = best_key_cur_line;
        global_best_plaintext = best_plaintext_cur_line;
        global_best_cyphertext = line;
        std::cout << "NEW best found at line " << line_number
                  << " (score: " << global_best_score << ")\n";
    }
}
I added that print statement inside the loop so I could watch the program find better and better candidates as it goes through the file. It’s satisfying to see it converge on the answer.
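One possible way to make the two levels explicit is to put the per-line search in its own function and let a second function pick the winner across lines. The sketch below is an alternative arrangement, not the code from this post: candidate, score_fn, best_for_line and best_overall are hypothetical names, the scoring function is passed in as a parameter so the block stays self-contained, and the starting score is the lowest possible int rather than 0.

```cpp
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical result type, similar in spirit to the best_result struct.
struct candidate {
    std::size_t line_index = 0;                    // 0-based index of the line
    int key = 0;                                   // single-byte key, 0..255
    int score = std::numeric_limits<int>::min();   // lowest int, so any real score wins
    std::string plaintext;
};

// Any function that maps a candidate plaintext to a score.
using score_fn = int (*)(const std::string &);

// Level 1: try all 256 keys on one already-decoded line.
candidate best_for_line(const std::string &bytes, score_fn score) {
    candidate best;
    for (int key = 0; key < 256; key++) {
        std::string plain;
        plain.reserve(bytes.size());
        for (char c : bytes) {
            plain.push_back(static_cast<char>(c ^ key));
        }
        int s = score(plain);
        if (s > best.score) {
            best.key = key;
            best.score = s;
            best.plaintext = plain;
        }
    }
    return best;
}

// Level 2: compare each line's winner against the best seen so far.
candidate best_overall(const std::vector<std::string> &decoded_lines,
                       score_fn score) {
    candidate best;
    for (std::size_t i = 0; i < decoded_lines.size(); i++) {
        candidate cur = best_for_line(decoded_lines[i], score);
        if (cur.score > best.score) {
            best = cur;
            best.line_index = i;
        }
    }
    return best;
}
```

With this split, calling best_overall(decoded_lines, score_calc) would do the same job as the nested loops, and line_index + 1 gives the 1-based line number.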
Printing the Result
After processing all lines, we package everything into our struct and print it:
file.close();

std::cout << "******* Final Result *******\n";
best_result result{global_best_cyphertext, global_best_plaintext,
                   global_best_score, global_best_key};
result.printInternalData();
Running this on the file, we can see it narrowing down as it goes through the lines, and the final result is: the encrypted line was on line 171, the key was 53 (the character '5'), and the decrypted message is “Now that the party is jumping\n”. Another fun hidden message from the Cryptopals authors!
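Because XOR is its own inverse, a result like this can be sanity-checked by going the other way: XOR the recovered plaintext with the recovered key, hex-encode it, and look for that string in 4.txt. The sketch below assumes two hypothetical helpers, xor_with_key and to_hex, written with the standard library only; it is a quick check, not part of the original solution.

```cpp
#include <string>

// Hypothetical helper: XOR every byte of the input with the same one-byte key.
std::string xor_with_key(const std::string &input, unsigned char key) {
    std::string out;
    out.reserve(input.size());
    for (char c : input) {
        out.push_back(static_cast<char>(static_cast<unsigned char>(c) ^ key));
    }
    return out;
}

// Hypothetical helper: lowercase hex encoding, two digits per byte.
std::string to_hex(const std::string &bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    hex.reserve(bytes.size() * 2);
    for (unsigned char b : bytes) {
        hex.push_back(digits[b >> 4]);    // upper 4 bits
        hex.push_back(digits[b & 0x0f]);  // lower 4 bits
    }
    return hex;
}
```

Calling to_hex(xor_with_key("Now that the party is jumping\n", 53)) produces a 60-character hex string, which matches the line length given in the challenge text. Applying xor_with_key twice with the same key returns the original input.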
The Scoring Function
This is the same score_calc function from Challenge 3, completely unchanged. It already works well enough to distinguish English from garbage across hundreds of lines. In the source file it has to be declared above main, since main calls it:
int score_calc(const std::string &text) {
    int score = 0;
    for (char c : text) {
        if (c == ' ')
            score += 3;
        else if (c == 'e' || c == 'E' || c == 't' || c == 'T' ||
                 c == 'a' || c == 'A' || c == 'o' || c == 'O' ||
                 c == 'i' || c == 'I' || c == 'n' || c == 'N')
            score += 2;
        else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            score += 1;
        else if ((c >= '0' && c <= '9') || c == '.' || c == ',' ||
                 c == '!' || c == '?' || c == '\'' || c == '"' ||
                 c == ';' || c == ':')
            score += 1;
        else if (c < 32 || c > 126)
            score -= 5;
        else
            score += 0;
    }
    return score;
}
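To see how these weights add up on a real input, here is a compact sketch that is meant to apply the same weights using lookup strings. The name score_sketch is hypothetical, and the cast to unsigned char for the non-printable check is a choice of this sketch so that bytes above 127 are handled the same way whether char is signed or unsigned.

```cpp
#include <string>

// Sketch using the same weights as score_calc, written with lookup strings.
int score_sketch(const std::string &text) {
    const std::string frequent = "etaoinETAOIN";   // +2 each
    const std::string punctuation = ".,!?'\";:";   // +1 each
    int score = 0;
    for (char c : text) {
        unsigned char u = static_cast<unsigned char>(c);
        if (c == ' ') {
            score += 3;
        } else if (frequent.find(c) != std::string::npos) {
            score += 2;
        } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
            score += 1;
        } else if ((c >= '0' && c <= '9') ||
                   punctuation.find(c) != std::string::npos) {
            score += 1;
        } else if (u < 32 || u > 126) {
            score -= 5;  // control characters and non-ASCII bytes
        }
        // Any other printable character leaves the score unchanged.
    }
    return score;
}
```

For "Now that the party is jumping\n" this gives 46: 15 for the five spaces, 24 for the twelve frequent letters, 12 for the other twelve letters, and minus 5 for the trailing newline, which counts as a control character.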
You can find the full code on my GitHub: https://github.com/AydoganArslantash/Cryptopals-Solutions/blob/main/set1/chal4.cpp