The 80 known and confirmed cartouches form 10 strings of 10 characters, as theorized by Log from Blammo
Letter frequency analysis of these 10 strings matches plaintext English quite well, so with my limited knowledge it seems that it likely involves a simple substitution cipher as opposed to some other cipher that disguises letter frequency. (http://practicalcryptography.com/cipher ... on-cipher/)
There hasn't been any luck trying to decipher the 10 strings directly, so the ciphertext probably doesn't contain the strings themselves
There are many possible ways to arrange the strings to try to generate the ciphertext. My code focuses on the following possibility:
-The 10 strings are arranged vertically in a 10x10 grid
-The ciphertext is read from the resulting grid, left to right, top to bottom
-Checking all 10! solutions and reading left to right will simultaneously account for all solutions for reading right to left as well.
Code: Select all
V Q B Q C T X U U T R V U Q V I B M V D H A G C V W Q C G D G Z V C O Q M K C D O H I Q Q K V V G O B B O K Q K G Q I U B G V U Z G D H T J V O B L G X X L V W K L G X Z T G B T C K O B U D X U Q Q Z
Setting up to run the code:
-Download the attached code, "break_cartouche.py", which was modified from "break_simplesub.py" from here: http://practicalcryptography.com/crypta ... on-cipher/
-I've been running the code in Python 2.7.10 which can be found here: https://www.python.org/downloads/
--Note that this code will not run in Python 3.X. I'm led to believe Python 3.X is the devil. Or something.
-This code imports from "pycipher" which can be found here: https://pypi.python.org/pypi/pycipher
--If you don't want to follow the instructions there to install "pycipher" using pip, I believe it's possible to install by unzipping the download and running "setup.py" from the command line.
-It also requires a quadgram dictionary. I have been using "english_quadgrams.txt.zip" from here: http://practicalcryptography.com/crypta ... ementation
--This should be unzipped and the txt file placed in the same folder where you're running "break_cartouche.py"
After completing that setup, you can run "break_cartouche.py" from the python shell or the command line.
-To run from the command line, I've been navigating to the folder with my code and running the following command. c:\python27\python.exe break_cartouche.py
-When run, it will create an output file, "cartouche_solutions.txt", and save any solutions that score better than a certain threshold.
-If you stop and start again, it will start from the initial order. However, you can set the starting order from within "break_cartouche.py"
-Running "break_cartouche.py" when "cartouche_solutions.txt" already exists will append new solutions to the existing file, not overwrite
Simple Substitution Solver Basics
The basic idea of the solver I based this code on is to guess a key, generate decoded text from the guessed key and ciphertext, and then analyze how closely the decoded text resemble English. They look at all 97 quadgrams (4 sequential letters) in the decoded text, score each quadgram based on the quadgram dictionary (a database of the frequency of any quadgram in plain English), and add up those scores to score the decoded text. In this way, deciphered text that more closely resembles English will score better, allowing the solver to find solutions, even if not all words are in the dictionary.
User parameters that can be changed within "break_cartouche.py":
Code: Select all
#User defined parameters orderTrack = 1023456789 #Integer who's digits represent the order of the columns to be checked. Update based on desired starting order. Should be 10 digits. Lazy coding means 0 can't be the first digit. iterationsPerOrder = 20 #Default = 20. Number of iterations to shuffle the key for the current order. Each iteration will shuffle the parent key to look for better solutions in case the previous best iteration was stuck at a local maxima iterationsPerKey = 1000 #Default = 1000. Number of iterations to optimize the current key by swapping 2 letters of the key randomly scoreThreshold = -480 #Default = -480. The threshold to output that iteration's solution to the txt file. Scores will be negative, with scores closer to 0 being a closer match to plaintext English. provideUpdate = 150 #Default = 150. If more than this many orders have been checked since the last output to the txt file, write an update to the file to inform the latest order checked
iterationsPerOrder: How many times should the program randomly reshuffle the substitution key. Higher numbers mean longer run time but more thorough search
iterationsPerKey: How many times should the program randomly swap 2 letters in the current key to try to improve the score. Higher numbers mean longer run time but more thorough search
scoreThreshold: What score should be the threshold to include a result in the output txt. Scores are negative, with numbers closer to 0 being better. I expect the solution to be much closer to 0 than the default -480, but I figure memory is relatively cheap and the periodic storage of gibberish solutions at least gives some peace of mind that it's running as expected.
provideUpdate: How many orders are allowed to be checked without updating the txt file if no solutions meet the score threshold.
I ran my rough pass with lower numbers for iterationsPerOrder and iterationsPerKey. At the default values in the attached code, it takes my computer (2.4 GHz processor laptop) about 12 seconds to check each order, or 504 days of continuous running to check 10! orders.
-The iterationsPerKey = 1000 is the default value from the base simple substitution solver
-The iterationsPerOrder = 20 default is something I made up. The base solver just used a while true infinite loop here since they only had to worry about a single ciphertext. However, this setting shouldn't matter a whole lot, since running through the 10! orders twice at 20 would yeild the same results as running through it once at 40.
Room for Improvement
These are beyond my coding and cyrptography knowledge, but here are some ideas that could theoretically shrink the search space
-Use knowledge of the letter frequency to skip checking unlikely keys, such as a key that would result in the letter "Z" appearing a large number of times in the paintext.
-Rather than checking through the column orders incrementally, try working forward from the current best solution and swapping only a small number of columns at a time.