How to Decrypt “Gravity Falls” Ciphers Like a (Literal) Pro (Guest Post)
So, while I’m doing NaNoWriMo and hiatusing, Wervyn (my husband, for those not in the know) has taken on a new hobby! We’ve been watching the TV series Gravity Falls together, me inspired by friend recommendations, and he because his software engineer coworkers told him that each episode has a cryptogram to solve!
In case you weren’t aware, Wervyn is puzzle-mad. His schedule every day includes playing a few daily puzzle games on his computer, and he has a growing collection of Rubik’s Cubes and Hanayama metal puzzles on his desk at work. And likely because of his puzzle obsession, which turned into programming obsession, he has always loved cryptograms of all kinds. So I’m the goofy, loud, free-spirited artist, and he’s the goofy, awkward brainiac obsessed with learning all he can about how the universe works. Basically, I’m Mabel and he’s Dipper.
Needless to say, I have strictly right-brain-type motives for watching the show (Mabel is my favorite character, I aspire to buy more plaid shirts and wear a hunting hat like Wendy, and I recognize some of the guest voice actors without reading the credits) but the cryptography aspect is really fun! Plus, I love seeing how much Wervyn enjoys tinkering with it; he wrote a little program in Python to help him decipher them!
He’s been having so much fun that he essentially started writing up a tutorial blog post because he hasn’t found any articles online that discuss his solving method. And when he confessed that he wasn’t sure where to post it, in typical Mabel style I offered, “Post it as a guest post on my blog! Heck, my tagline has ‘Geek’ in the title: it fits!”
So this is a much nerdier left-brain post than most of you have come to expect from this blog, but I really think this is super cool, and I’m proud to host it here! Plus, it’s been a while since I’ve given Wervyn a full-fledged guest spot around here, so without further ado, enjoy!
How to Decrypt “Gravity Falls” Ciphers Like a (Literal) Pro
A guest post by Wervyn
Watching for keywords in Gravity Falls but finding it’s just too much work to analyze every frame of artwork for hidden messages? Well, let me teach you to crack a Vigenère cipher like a professional cryptanalyst would.
First, let’s get it out there that while the Vigenère cipher may have, at one time in the past, been called le chiffre indéchiffrable, it is anything but. Anyone who’s ever done a newspaper cryptogram knows that just because you write something in code, doesn’t mean that no one can read it. We MAKE ciphers like this to be broken for fun, and a Vigenère cipher is no exception. If you actually have some information that you want to encrypt securely, use a modern standard like AES.
Now you probably already know that Vigenère ciphers function with the use of a keyword, repeated over and over again and combined with your plaintext message to produce ciphertext, and vice-versa. Typically the method for handling this is presented as a giant table of letters, but it’s actually much easier to work with the standard Vigenère cipher using what’s called modulo arithmetic.
In this case, much like the A1Z26 cipher, we assign every letter a number from 0 to 25. (A = 0, Z = 25, so one less than what you may be used to.) Then, to encrypt, we ADD a letter from the plaintext to the appropriate letter of the key, then subtract 26 if the result is greater than or equal to 26. To decrypt, we SUBTRACT the key from the ciphertext, and add 26 if the result is less than zero. The number 26 here is called a “modulus”, hence modulo arithmetic, and in fact the cryptography that goes on in your web browser today is not all that far removed from math exactly like this.
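The add-and-subtract rule above fits in a few lines of Python 3. This is only a minimal sketch, not the program used for the transcripts in this post: the function name `vigenere` and its `decrypt` flag are made up for illustration, and it assumes the key advances only on letters, so spaces and punctuation pass through untouched.

```python
def vigenere(text, key, decrypt=False):
    """Shift each letter of text by the matching key letter, modulo 26."""
    sign = -1 if decrypt else 1
    out = []
    k = 0  # counts letters only, so spaces don't use up key letters
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = ord(key[k % len(key)].upper()) - ord("A")
            # % 26 does the "subtract 26" / "add 26" wrap-around in one step
            value = (ord(ch) - ord("A") + sign * shift) % 26
            out.append(chr(value + ord("A")))
            k += 1
        else:
            out.append(ch)  # spaces and punctuation are copied as-is
    return "".join(out)


print(vigenere("HELLO", "KEY"))                # RIJVS
print(vigenere("RIJVS", "KEY", decrypt=True))  # HELLO
```

Python’s `%` operator always returns a value from 0 to 25 here, even when the subtraction goes negative, which is why no explicit "add 26" branch is needed.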
One of the first major points of attack in a Vigenère cipher is that because the key is repeated over and over again, information about how long it is shows up like a signature in the encrypted text. Let’s use the end credits cipher from S2E15 as an example:
Before we do anything else, we can make a very good guess at the length of the key, just by noticing repeated strings of letters in the ciphertext. We’ll be using transcripts from a Python program I wrote (found here) to help us out with this and the rest of the decryption.
>>> vt = subcipher.VigenereSolver("S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA")
>>> print vt.countRepetition()
s upytyh DIp gavo qethi mcbk ohk xexjb vrw youwchia vrsv oq lrDIa (Occurrance: di, min distance 43)
s upytyh dip gavo qethi mcbk ohk xexjb VRw youwchia VRsv oq lrdia (Occurrance: vr, min distance 11)
s upytyh dip gavo qetHI mcbk ohk xexjb vrw youwcHIa vrsv oq lrdia (Occurrance: hi, min distance 22)
s upytyh dip gaVO Qethi mcbk ohk xexjb vrw youwchia vrsV OQ lrdia (Occurrance: voq, min distance 33)
s upytyh dip gAVo qethi mcbk ohk xexjb vrw youwchiA Vrsv oq lrdia (Occurrance: av, min distance 30)
s upytyh dip gavo qethi mcbk ohk xexjb vrw youwchIA vrsv oq lrdIA (Occurrance: ia, min distance 11)
Notice how the numbers 11, 22, and 33 immediately jump out? Generally, when you see repetition of several consecutive letters in the ciphertext, that means the same plaintext letters were encrypted with the same letters from the key, and thus the distance from one instance to the next would be a multiple of the length of the key. The longer the repeated string, the more likely it is that this is the case. So in the cases of “di” (distance 43) and “av” (distance 30), two letters isn’t much to go on and these are probably coincidences. But with “voq” (distance 33), combined with the other multiples of 11, we can be pretty sure the key is 11 letters long.
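The repetition search can be sketched roughly as follows in Python 3. This is a simplified stand-in written for illustration, not the `countRepetition` code from the transcript; the name `repeat_distances` is hypothetical, and it assumes distances are counted in letters only, ignoring spaces.

```python
from collections import defaultdict


def repeat_distances(ciphertext, size):
    """Map each repeated run of `size` letters to the gaps between its occurrences."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]  # drop spaces
    positions = defaultdict(list)
    for i in range(len(letters) - size + 1):
        positions["".join(letters[i:i + size])].append(i)
    # Keep only runs seen more than once, and turn positions into distances
    return {
        chunk: [b - a for a, b in zip(where, where[1:])]
        for chunk, where in positions.items()
        if len(where) > 1
    }


ctext = "S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA"
print(repeat_distances(ctext, 3))  # {'VOQ': [33]}
print(repeat_distances(ctext, 2))
```

The two-letter pass also reports "VO" and "OQ", which are just the two halves of "VOQ"; a fuller tool would fold those into the longer match.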
Another more complicated method of guessing the key length is by calculating the likelihood that two letters, picked at random, will be the same. I won’t go into the exact details of that formula here (you can find it in the code), but the basic premise is that English (and human language in general) has a lot of repetition and redundancy. Some letters (like E, T, A, etc.) are much more common than others (like J, Q, and Z), and thus the likelihood of picking the same letter twice is much higher than if all letters happened with about the same frequency (it’s about 6.7% for English vs. 3.8% for random letters).
Now generally, Vigenère ciphers look pretty close to random, especially as the key gets longer. However, if you guess a key length and then break the ciphertext letters apart into groups based on which letter in the key would be used, and then average the comparison of each group, something very interesting happens:
>>> vt.indexOfCoincidence()
(11, 0.1484848484848485)
>>> vt.indexOfCoincidence(all=True)
[(11, 0.1484848484848485), (6, 0.03968253968253969), (3, 0.038126361655773426), (10, 0.03333333333333334), (1, 0.03265602322206096), (12, 0.030555555555555558), (2, 0.028205128205128206), (5, 0.024242424242424242), (9, 0.022222222222222223), (7, 0.02040816326530612), (4, 0.017857142857142856), (8, 0.005952380952380952)]
As you can see, every option for length from 1 to 12 has a value less than 4%… except for 11, where the probability jumps up to a whopping 14.8%!
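For the curious, here is a minimal Python 3 sketch of that calculation. It assumes the textbook index-of-coincidence formula (the chance that two letters drawn without replacement match) averaged over the groups; the function names are invented for this example and the real program may differ in its details.

```python
from collections import Counter


def index_of_coincidence(letters):
    """Chance that two letters picked at random (without replacement) are the same."""
    n = len(letters)
    if n < 2:
        return 0.0  # not enough letters to form a pair
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def average_ic(ciphertext, key_len):
    """Split the letters into key_len interleaved groups and average their scores."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    groups = [letters[i::key_len] for i in range(key_len)]
    return sum(index_of_coincidence(g) for g in groups) / key_len


ctext = "S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA"
scores = {n: average_ic(ctext, n) for n in range(1, 13)}
best = max(scores, key=scores.get)
print(best, round(scores[best], 4))  # 11 0.1485
```

Each group holds the letters that share one key letter, so with the right key length every group is just English shifted by a constant, and its letter repetition shows through.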
Okay, so now we’re certain twice over that the key is 11 letters long. Where do we go from here? Well typically you would do what’s called a key elimination, decrypting the ciphertext with itself, offset by the key length (without an offset you’d just get all A’s), to effectively remove the key from consideration, and have just English words merged together that you can pretty easily tease apart. This is especially important if the key is assumed to be just a bunch of random letters. In this case though, we suspect from experience the key is something intelligible, if not English then at least something we’d be likely to recognize if it suddenly jumped out at us. So we’ll skip that step and jump straight to exploratory decryption.
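For reference, the key elimination step that gets skipped here could look something like this in Python 3. It is a sketch with a hypothetical function name (`eliminate_key`), assuming spaces are stripped before the subtraction.

```python
def letters_only(text):
    """Convert the letters of text to numbers 0-25, dropping everything else."""
    return [ord(c) - ord("A") for c in text.upper() if "A" <= c <= "Z"]


def eliminate_key(text, key_len):
    """Subtract the text, shifted by key_len letters, from itself (mod 26)."""
    nums = letters_only(text)
    return "".join(
        chr((a - b) % 26 + ord("A"))
        for a, b in zip(nums, nums[key_len:])
    )


ctext = "S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA"
print(eliminate_key(ctext, 11)[:11])  # SZBIPFAVWNF
```

Letters that sit 11 apart were encrypted with the same key letter, so the key cancels in the subtraction: (p1 + k) – (p2 + k) = p1 – p2. What comes out is the plaintext minus a shifted copy of itself, with no key left in it.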
>>> vt.setKey("Sabcdefghij")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S abcdef ghi jSab cdefg hijS abc defgh ijS abcdefgh ijSa bc defgh
ptext: A UOWQUC XBH XIVN OBPCC FUSS OGI UASDU NIE YNSTYCCT NIAV NO INYCT
First, we start with an 11-letter key and see where the repetition lands. Why start with an S at the beginning? Well, another major element that works in our favor is that all of the spaces (and punctuation, when applicable) have been left in the cipher. That actually tells us a LOT; it’s basically unencrypted information freely given to us. A one-letter English word is almost always A or I, so we guess right up front that the first letter is probably A. A = 0, so S – ? = A? S, of course.
That immediately gives us some probable plaintext to work with, and one of the best candidates is the ciphertext “VRW”, currently decrypted into “– – E” (ignoring the first two letters since they’re probably wrong), combined with the ciphertext “VRSV”, decrypted as “– – A –”. What’s a three letter word that ends in E and a four letter word with the third letter A, knowing the first two letters of each are the same?
>>> vt.setKey("STbcdefghTH")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S tbcdef ght hStb cdefg hthS tbc defgh thS tbcdefgh thSt bc defgh
ptext: A BOWQUC XBW ZICN OBPCC FJUS VGI UASDU CKE FNSTYCCT CKAC NO INYCT
What I’m doing here is taking advantage of the way the modulo arithmetic in the Vigenère cipher works to quickly check target plaintext by using it as a key. Remember, the plaintext is the ciphertext minus the key, so that also means the key is the ciphertext minus the plaintext. I plugged the word “that” into the appropriate parts of the key, and “ck – c” popped out. Next, I can just plug that back in as key letters and see if the situation improves.
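The same "key is the ciphertext minus the plaintext" shortcut can be written out directly. The Python 3 sketch below uses a made-up helper, `key_from_crib`, and assumes you already know the key length and where the guessed word starts (counting letters only, from 0).

```python
def key_from_crib(ciphertext, crib, start, key_len):
    """Return {key position: key letter} implied by a guessed plaintext word.

    `start` is the index of the word's first letter, counting letters only.
    """
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    found = {}
    for offset, p in enumerate(crib.upper()):
        c = letters[start + offset]
        k = (ord(c) - ord(p)) % 26  # key = ciphertext - plaintext
        found[(start + offset) % key_len] = chr(k + ord("A"))
    return found


ctext = "S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA"
# "VRSV" starts at letter index 42; guess that it decrypts to "THAT"
print(key_from_crib(ctext, "THAT", 42, 11))  # {9: 'C', 10: 'K', 0: 'S', 1: 'C'}
```

Positions 9, 10, 0, and 1 of the key come out as C, K, S, and C, which is the "ck – c" from the transcript plus the S already guessed for the first letter.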
>>> vt.setKey("SCaaaaaaaCK")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S caaaaa aac kSca aaaaa ackS caa aaaaa ckS caaaaaaa ckSc aa aaaaa
ptext: A SPYTYH DIN WITO QETHI MARS MHK XEXJB THE WOUWCHIA THAT OQ LRDIA
I’ve replaced the unknown letters with A’s to make it easier for me to keep straight what’s been decrypted and what hasn’t. And hey, look at that, “GAVO” decrypts to “WIT – “, wonder what that word could be…
>>> vt.setKey('SCHaaaaaaCK')
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S chaaaa aac kSch aaaaa ackS cha aaaaa ckS chaaaaaa ckSc ha aaaaa
ptext: A SIYTYH DIN WITH QETHI MARS MAK XEXJB THE WHUWCHIA THAT HQ LRDIA
Here we get lucky: it turns out that H + H = O, so we already have the right letter in the key and can look for more partially decrypted words. “VRSV OQ” = “THAT H –” is a good candidate.
>>> vt.setKey("SCHEaaaaaCK")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S cheaaa aac kSch eaaaa ackS che aaaaa ckS cheaaaaa ckSc he aaaaa
ptext: A SIUTYH DIN WITH METHI MARS MAG XEXJB THE WHQWCHIA THAT HM LRDIA
>>> vt.setKey("SCHMaaaaaCK")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S chmaaa aac kSch maaaa ackS chm aaaaa ckS chmaaaaa ckSc hm aaaaa
ptext: A SIMTYH DIN WITH EETHI MARS MAY XEXJB THE WHIWCHIA THAT HE LRDIA
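We’re making very good progress here. Everything so far seems to be decrypting into something sensible, and there are just five letters left. I’m going to go ahead and guess that “MCBK” = “EARS” and see where that gets us. As before, I plug the guessed plaintext letter (E) in as a key letter, read off the letter that pops out (I), and then plug that back in as the real key letter.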
>>> vt.setKey("SCHMaaaaECK")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S chmaaa aec kSch maaaa eckS chm aaaae ckS chmaaaae ckSc hm aaaae
ptext: A SIMTYH DEN WITH EETHI IARS MAY XEXJX THE WHIWCHIW THAT HE LRDIW
>>> vt.setKey("SCHMaaaaICK")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S chmaaa aic kSch maaaa ickS chm aaaai ckS chmaaaai ckSc hm aaaai
ptext: A SIMTYH DAN WITH EETHI EARS MAY XEXJT THE WHIWCHIS THAT HE LRDIS
We haven’t really even looked at the key we’re producing so far. “SCHM – – – – ICK” doesn’t look like any English word, but it does look like it could be a name or something, and in any case everything so far has been clicking into place across the entire cipher — that’s a really good way to tell you’re on the right track. With just four letters to go we’re going to take an intuitive leap and guess that the last word is “HEARS”. It rhymes and fits with “EARS” and it makes grammatical sense.
>>> vt.setKey("schmHEARick")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S chmhea ric kSch mhear ickS chm heari ckS chmheari ckSc hm heari
ptext: A SIMMUH MAN WITH EXPHR EARS MAY QAXST THE WHIPYHRS THAT HE ENDRS
>>> vt.setKey("schmENDRick")
>>> vt
ctext: S UPYTYH DIP GAVO QETHI MCBK OHK XEXJB VRW YOUWCHIA VRSV OQ LRDIA
key: S chmend ric kSch mendr ickS chm endri ckS chmendri ckSc hm endri
ptext: A SIMPLE MAN WITH EAGER EARS MAY TRUST THE WHISPERS THAT HE HEARS
And boom, the whole thing magically reveals itself. “SCHMENDRICK” is certainly not a key we’d be likely to guess, but with the proper methods and some educated guesswork, it doesn’t really matter WHAT the key is if there’s something to be found in the text. Compared to some (the really short messages are the hardest), this cipher is actually one of the easier ones to break.
So while it may be a little late in coming with the show ending and all, now you know more about codebreaking than most people ever learn. And if the popularity of Gravity Falls leads to a resurgence of interest in classical cryptography, you may just be able to spy on some supposedly secret conversations. Use your crypto powers for good!
If you have any questions about Vigenère ciphers, or anything else from this post, feel free to email Wervyn: wervyn {at} gmail {dot} com.
3 Comments
Jonnatan Munguia Chavez
Thanks for the info of the cipher text.
Adrien
Thank you for this demonstration, but that is not a demonstration of “key elimination”; it’s a simple known-plaintext attack. You can eliminate the key without knowing it.
“Well typically you would do what’s called a key elimination” > false
example: “SUPYTYHDIPG” offset of key length 11
CT1: SUPYTYHDIPG AVOQETHIMCBKOHKXEXJBVRWYOUWCHIA
CT2: AVOQETHIMCBKOHKXEXJBVRWYOUWCHIAVRSVOQLRDIA
CT3: SZBIPFAVWNFQHHGHPKZLHKOQTQBCQBBAAEDAELLEAA
PT1: A SIMPLE MAN WITH EAGER EARS MAY TRUST THE WHISPERS
PT2: ITH EAGER EARS MAY TRUST THE WHISPERS THAT HE HEARS
PT3: S ZBIPFA VWN FQHH GHPKZ LHKO QTQ BCQBB AAE DAELLEAA LOTP HY DNWRB
CT3 = PT3 !
hope you enjoy 🙂
Christina {PuellaDocta}
Thanks for the comment!
Wervyn says: