Cartouche Puzzle Solving with Python

For script writing
Post Reply
User avatar
ungawa
AFH
Posts: 377
Joined: Sat Aug 25, 2012 3:16 pm

Cartouche Puzzle Solving with Python

Post by ungawa » Wed Sep 09, 2015 2:34 pm

As I mentioned in the G-D Cartouche thread, I modified a Phython based Simple Substitution Cryptogram solver I found online to try to start brute forcing the Cartouche puzzle. However, I didn't have much luck with a rough pass through the 10! orderings of the strings, and trying a more in depth pass greatly increases the necessary computing power. I figured I'd share what I have in case anyone else is interested, either in running it to help divide the load, or in looking at the code to see where there's room to improve efficiency. (I'm an engineer, not a computer scientist, so don't judge the code too harshly)

Assumptions:
The 80 known and confirmed cartouches form 10 strings of 10 characters, as theorized by Log from Blammo
Letter frequency analysis of these 10 strings matches plaintext English quite well, so with my limited knowledge it seems that it likely involves a simple substitution cipher as opposed to some other cipher that disguises letter frequency. (http://practicalcryptography.com/cipher ... on-cipher/)
There hasn't been any luck trying to decipher the 10 strings directly, so the ciphertext probably doesn't contain the strings themselves
There are many possible ways to arrange the strings to try to generate the ciphertext. My code focuses on the following possibility:
-The 10 strings are arranged vertically in a 10x10 grid
-The ciphertext is read from the resulting grid, left to right, top to bottom
-Checking all 10! solutions and reading left to right will simultaneously account for all solutions for reading right to left as well.
10 strings arranged as columns. 10! possible orders of the columns, to be read left to right, top to bottom to create cipher text

Code: Select all

V	Q	B	Q	C	T	X	U	U	T
R	V	U	Q	V	I	B	M	V	D
H	A	G	C	V	W	Q	C	G	D
G	Z	V	C	O	Q	M	K	C	D
O	H	I	Q	Q	K	V	V	G	O
B	B	O	K	Q	K	G	Q	I	U
B	G	V	U	Z	G	D	H	T	J
V	O	B	L	G	X	X	L	V	W
K	L	G	X	Z	T	G	B	T	C
K	O	B	U	D	X	U	Q	Q	Z
Based on these assumptions, I found a Python based Simple Substitution solver online, and modified it slightly and wrapped it in a loop to try the 10! orderings of the strings.


Setting up to run the code:
-Download the attached code, "break_cartouche.py", which was modified from "break_simplesub.py" from here: http://practicalcryptography.com/crypta ... on-cipher/
-I've been running the code in Python 2.7.10 which can be found here: https://www.python.org/downloads/
--Note that this code will not run in Python 3.X. I'm led to believe Python 3.X is the devil. Or something.
-This code imports from "pycipher" which can be found here: https://pypi.python.org/pypi/pycipher
--If you don't want to follow the instructions there to install "pycipher" using pip, I believe it's possible to install by unzipping the download and running "setup.py" from the command line.
-It also requires a quadgram dictionary. I have been using "english_quadgrams.txt.zip" from here: http://practicalcryptography.com/crypta ... ementation
--This should be unzipped and the txt file placed in the same folder where you're running "break_cartouche.py"

After completing that setup, you can run "break_cartouche.py" from the python shell or the command line.
-To run from the command line, I've been navigating to the folder with my code and running the following command. c:\python27\python.exe break_cartouche.py
-When run, it will create an output file, "cartouche_solutions.txt", and save any solutions that score better than a certain threshold.
-If you stop and start again, it will start from the initial order. However, you can set the starting order from within "break_cartouche.py"
-Running "break_cartouche.py" when "cartouche_solutions.txt" already exists will append new solutions to the existing file, not overwrite

Simple Substitution Solver Basics
The basic idea of the solver I based this code on is to guess a key, generate decoded text from the guessed key and ciphertext, and then analyze how closely the decoded text resemble English. They look at all 97 quadgrams (4 sequential letters) in the decoded text, score each quadgram based on the quadgram dictionary (a database of the frequency of any quadgram in plain English), and add up those scores to score the decoded text. In this way, deciphered text that more closely resembles English will score better, allowing the solver to find solutions, even if not all words are in the dictionary.

Parameters
User parameters that can be changed within "break_cartouche.py":

Code: Select all

#User defined parameters
orderTrack = 1023456789 #Integer who's digits represent the order of the columns to be checked. Update based on desired starting order. Should be 10 digits. Lazy coding means 0 can't be the first digit.
iterationsPerOrder = 20 #Default = 20. Number of iterations to shuffle the key for the current order. Each iteration will shuffle the parent key to look for better solutions in case the previous best iteration was stuck at a local maxima
iterationsPerKey = 1000 #Default = 1000. Number of iterations to optimize the current key by swapping 2 letters of the key randomly
scoreThreshold = -480   #Default = -480.  The threshold to output that iteration's solution to the txt file.  Scores will be negative, with scores closer to 0 being a closer match to plaintext English.
provideUpdate = 150       #Default = 150. If more than this many orders have been checked since the last output to the txt file, write an update to the file to inform the latest order checked
orderTrack: set to a 10 digit number to define the starting order of the columns. should be updated based on where the previous run left off
iterationsPerOrder: How many times should the program randomly reshuffle the substitution key. Higher numbers mean longer run time but more thorough search
iterationsPerKey: How many times should the program randomly swap 2 letters in the current key to try to improve the score. Higher numbers mean longer run time but more thorough search
scoreThreshold: What score should be the threshold to include a result in the output txt. Scores are negative, with numbers closer to 0 being better. I expect the solution to be much closer to 0 than the default -480, but I figure memory is relatively cheap and the periodic storage of gibberish solutions at least gives some peace of mind that it's running as expected.
provideUpdate: How many orders are allowed to be checked without updating the txt file if no solutions meet the score threshold.

Run Time
I ran my rough pass with lower numbers for iterationsPerOrder and iterationsPerKey. At the default values in the attached code, it takes my computer (2.4 GHz processor laptop) about 12 seconds to check each order, or 504 days of continuous running to check 10! orders.
-The iterationsPerKey = 1000 is the default value from the base simple substitution solver
-The iterationsPerOrder = 20 default is something I made up. The base solver just used a while true infinite loop here since they only had to worry about a single ciphertext. However, this setting shouldn't matter a whole lot, since running through the 10! orders twice at 20 would yeild the same results as running through it once at 40.


Room for Improvement
These are beyond my coding and cyrptography knowledge, but here are some ideas that could theoretically shrink the search space
-Use knowledge of the letter frequency to skip checking unlikely keys, such as a key that would result in the letter "Z" appearing a large number of times in the paintext.
-Rather than checking through the column orders incrementally, try working forward from the current best solution and swapping only a small number of columns at a time.
Attachments
break_cartouche.zip
Cartouche simple substitution solver
(3 KiB) Downloaded 243 times

User avatar
cheesecookie
Inscrutable Pi
Posts: 246
Joined: Mon Nov 24, 2014 4:53 pm

Re: Cartouche Puzzle Solving with Python

Post by cheesecookie » Wed Sep 09, 2015 5:43 pm

Python 3.0 is the devil.

I just needed to say that before I could read further :)

Post Reply