Class 3: Styles of Attacks, Frequency Analysis, One-Time Pad

Overview:This lesson is designed to give you insight into the types of attackers that you must protect against. We will break it down into types of attacks and then further into attack-models. The types of attacks are big umbrellas and the attack-models are specific frameworks that let you analyze the security of a scheme. In the process we will learn our first analytical attack (frequency analysis) which we will use to break an historical cipher.

The major classes of attack

Brute-Force

So far we have only executed one type of attack, brute-force . In a brute force attack you just try to decrypt with every possible key. If the number of keys is small enough you'll win. If the number of keys is around \(2^{22}\) we should break it in around a second on a PC running Python.

ROT-x Task: The following line of text has been encrypted using a variation of the old school caesar-cipher, each letter has been shifted by a fixed amount. Write a procedure to crack it:

hvwgw gqozz srobv wghcf wqozq wdvsf psqoi gswhw gpfcy sbhbl

Now crack:

sahyk iapkp daben opzwu kbpda naopk bukqn hebau

Side-Channel Attacks

We won't do this in a hand's on way during this course but we will at least see some attacks of this type. A side-channel attack is when you use awareness of the physical implementation of the code to leak information. A couple of examples:

Watching the power usage of a CPU encrypting or decrypting a message and using some signal processing and the machine code to figure out the inputs.
Observing the amount of shared memory consumption of a process in a virtual machine which is doing encrypting or decrypting. Reverse engineering something about the secret.

Typically these involve some very clever engineering and awareness of the hardware involved.

Social Engineering

The other "cheater's" attack is exploiting a human that knows how to gain access to the message. Sometimes called "rubber hose cryptanalysis" (beat the answer out of someone with a rubber hose), most social engineering attacks involve (forceful) coercion. But they can also involve less violent means like, watching someone type in their password via cameras. Feel free to start imagining cool versions of the movie "The Sting" and long cons. We are starting to study this type of attack here at UD but not in this course, it's more about raw human psychology than anything else.

Analytical Attacks

This is where we will put most of our effort. These are the attacks which work without needing to check all of the keys. This is where we find underlying weaknesses in the design of the encryption scheme that we exploit.

To demonstrate I'm going to give you an in-class challenge.

The following script is a random variant of the Vigenere cipher, more precisely called a poly-alphabetic shift cipher .

The classic Vigenere is to have a code word like "doggy" which you repeat and use to give you the shift at each letter. For instance:

andyr	ulesx
doggy	doggy
dbjep	xzkyv

Analysis Task: What is the size of the key-space (the number of possible keys) when keyLength == 10 ?

Kata: Break a poly-alphabetic shift cipher

First Analytic Attack: The above snippet was used to encrypt a long plaintext message (I stripped all non-alpha characters and converted to lowercase). The result has been saved at encoded.txt . The key has length 10 (that's a lot of info). Break my scheme (don't try brute force it will take too long).

In the next part I have given some extra info laying out how to use "frequency analysis" to make some progress. Take a look when you're ready for a hint then come back to try again.

Frequency Analysis

Overview: The heart of this part is to teach a computer when a set of letters matches the frequency distribution of normal english text. You can run with this idea in hundreds of ways so try to look for the essence of this idea on top of pulling off the actual attack.

The following snippet has some frequencies for letters in the english language stored in a dictionary by lowercase letter.

YOU SHOULD SAVE THIS PAGE/GIST/SNIPPET. YOU'LL WANT TO REFERENCE IT EVERY TIME YOU NEED TO BREAK AN HISTORICAL ENCRYPTION.

When it comes to cracking old-school crypto schemes frequency analysis is great. Here is a trick which can let you programmatically detect that your text looks plain.

The sum of the frequencies squared is always going to be near .065 when the text has a the same distribution as most English language text. In symbols that is \( \sum_{i=0}^{25} p_i^2 \approx 0.065\) where \(p_i\) is the rate of character \(i\) in standard English text. So if you check for that squared sum on the frequencies of characters in encrypted text you'll know if your text has been substituted in some way or another.

Check with Moby Dick Run that script to download a copy of MobyDick and check the frequency of each letter of the alphabet.

Adapt Adjust the script to grab the file at http://crypto.prof.ninja/encrypted_mobydick.txt and confirm that the sum of the squared frequencies is around .04

Fixing shifts

The next idea, if you suspect some shifting, is to look at the sum of the products of the expected frequency and your observed frequency (after attempting a shift) at each letter. In symbols, \(\sum_{i=0}^{25} p_i \cdot q_{i+j \% 26}\) should be \(\approx .065\) when \(j\) is the correct shift. Here \(p_i\) means the standard English frequency for some letter (found in the first gist above) and \(q_{i+j}\) is the observed frequency of some other letter in your encrypted text.

Here is a snippet using this idea to figure out the key in a Caesar-cipher.

Kata: Break Caesar ciphers programmatically

Caesar: In the Caesar cipher every letter is rotated by the same amount. Why does that lead to a sum of character frequency squared being close to .065? Why does that change when I multiply by the standard frequency?

In something like this Vigenere cipher with key-length 10 you won't get the nice 0.065 number from the text because there are ten distinct shifts going on. But if you only look at every tenth character you should see the 0.065 squared frequency property! So if you isolate your search to every tenth character the problem is reduced to your first shift cipher attack (which we can even detect using the \(p_i \cdot q_{i+j}\) analysis).

Head back if needed: If you didn't break the key-length 10 encryption from the part 1 go and solve it now using the above insights and finding the shift using only every tenth letter.

Full Vigenere cracking

Now what if you didn't know the key-length? Well, if you know that this poly-alphabetic cipher is the one being used then you can start testing potential key-lengths looking for the 0.065 rate along the subsequences. That is pretend the key-length is 1 and check for .065, then try looking in every other character for .065, then every third character, and so on until you find the key-length. THEN find the shift sequence / key.

Kata: Break arbitrary Vigenere ciphers when the text is long enough.

Frequency Task: Decode challenge.txt with unknown keylength.

Threat Models

Overview: This part should be a paradigm shift for how you look at the problem of cryptography. Instead of just seeing an encrypted text and trying to break it we start to open the door to other reasonable situations. For instance, what if I can listen to many many messages? Does that help me decrypt? Of course it does, but how? These ideas are important to get your mind right about why we have to be so careful when we get into properly deployed crypto in later sections. We can actually start to analyze a scheme in more ways than just brute-force attack cost. I do want to point out that this section is not something you have to master. It's more that I want you to be familiar with the ideas as we go forward but we're not running a theoretical course here.

So far we've looked at some basic shift-ciphers and substitution ciphers. How do we analyze their strength?

It's not easy. We know how to reason about brute-force attacks. We also know to be scared because better analytical attacks exist (you even broke Vigenere). What I'm about to tell you doesn't simplify the situation but it does give you new realms to play in.

Here are the classical threat models for provable cryptography (which is the academic cousin of real in-the-trenches crypto):

Major Attack models in order of severity

Cipher-text Only

This is the classic attack, you get the encrypted text and that's it. All that you have to work with is in that one encrypted message.

Known-Plaintext

Here you know something about the plaintext message. Maybe it starts with the same word every time or has the Gettysburg address somewhere in the middle.

Chosen-Plaintext

Here you gain the ability to have any message you choose encoded with the same key. Let me see what "andy" encrypts to.

Chosen-Ciphertext

This one is easy to misunderstand. You get all of the powers of a chosen-plaintext attack AND you can ask for the pre-image of any message. Different books adjust this model to say either, your goal is the key or the only restriction is your target message. Probably the right way to think about this attack is that you can test a slightly adjusted encryption. Change 1 bit and see if the decrypted message turns to gibberish. For instance, if you are a server on the internet and you change an encrypted packet in some way you can see how the target reacts to your change when they go to decrypt.

Now with each of those models you can also add the word "MULTIPLE" to form the question, what if there are many messages and you can pick some of them.

Vigenere CPA Analysis: What was the cost of breaking the vigenere scheme in the cipher-text only attack? (This might be a function of key-length, message-length, etc.) What is the best attack on vigenere in the chosen-plaintext model? How expensive is your proposed attack?

Less Canonical Attacks

Related-key

If you know that keys are generated in some sort of a sequence can you gain leverage?

Chosen-key

What if you can pick part of the key and your goal is the rest of the key?

Padding-Oracle

When plaintext is too short you pad it. In many situations there is a decryption oracle somewhere which will bark at you when an encryption you've sent is incorrectly padded (CAPTCHA servers, Java's BadPaddingException).

The real goal is distinguishing attacks

Your internal compass needs to be this: is there any chance that some attacker can distinguish, with any statistical significance, my encryption scheme from an ideal encryption scheme.

This definition captures the subtle idea of leaking information. Can they detect that bit 23 is a 1 or a 0? If they know bit 12 do they have any extra info about any other bit?

A version of this I like is the following mental experiment (the adversarial indistinguishability experiment ):

If my adversary can give me two messages \(m_0, m_1\) and I pick one at random to encrypt and give back to them can they guess the correct message with probability better than 50%?

This is why we need randomization in any secure encryption scheme (under a multiple-encryption threat model). If my encryption is deterministic then I can just check to see if two messages are the same and information is leaking.

Here is a vigenere Semantic Security Attack Game

This is the kind of thought experiment research cryptographers do in order to show when something is clearly broken.

This game is the MSS game (Multiple Semantic Security) which means the adversary can strategically pick many words, but the key changes each time so the semantic security is just whether the ciphertext leaks anything about the plaintext.

See the Pen Adversarial Indistinguishability MSS Attack Game by Andy Novocin (@AndyNovo) on CodePen.

Can You Beat 50/50?: Try this game and figure out a way to beat 50% odds. Now change the key length to match the message length and try again.

Here is the CPA (Chosen Plaintext Attack) variant

Now what changed is that I use the same key for every encryption you ask for.

See the Pen Adversarial Indistinguishability CPA Attack Game by Andy Novocin (@AndyNovo) on CodePen.

Can You Beat 50/50?: Try this game and figure out a way to beat 50% odds.

To learn more about these games check out: These Graduate Course Notes

Binary and XOR

Part 1 of our lesson is designed to get you comfortable with the binary hidden beneath every character you read in this course. You need to learn how to convert from messages into integers and how XOR works on a bit-by-bit basis. This is a crucial part of all of cryptography. Hopefully you start to see why as we go, certainly it will make more sense as we progress.

Deep inside your CPU is a hardware implementation of many binary operations. It is as fast as fast can be and really interesting. Of course it all works on just 1s and 0s so most of what we see is made for us to see.

Let's take a look at the basic integers in binary:

Now binary is the computer's truth. The decimal digits we see are just an illusion for our sake (thanks computer).

Test and Interpolate: Try out 10 << 3 and 8 << 1 and 1 << 5 and 7 << 2 . Try to look at the results in binary before and after. What do you think the << operator does? What about >>?

XOR, AND, and OR

In logic courses we use statements that are either True or False. You can combine those statements with AND, OR, and XOR. It works like this (each row is a parallel universe where A and B have the four possible values, in each case we can see what these three operators do):

For instance "1 > 2 AND 2 > 1" is a false statement.

But when it comes to computers and text and crypto it gets funkier. Now these three operators act bit-wise on integers and characters.

Here is what I mean by that: \(9 == 1001\) and \(15 == 1111\) so \(9\) XOR \(15 == 0110 \), that is at the 8's digit since 9 and 15 are both 1 the result will be 0, at the 4's and 2's digit they disagree so the output is 1 and on the ones digit they agree so the output is 0. If that is a bit complex let's do a ton of examples until it makes sense.

By the way the notation for XOR is \( \oplus \) but in coding we use \( ^ \) (which was actually AND in logic class, sorry).

Try it: What do you predict the values are of \( 1100 \) AND \( 0101 \), \( 1100 \) OR \( 0101 \) , and \( 1100 \) XOR \( 0101 \). (Answer in binary.)

Coding with XOR, AND, OR: In programming the operators are & for AND, | for OR, and ^ for XOR. For instance 9 ^ 15 == 6. Write a program that performs all 3 operations on the numbers 8, 11, 15 combined with 10, 12, 14 (27 results total: 8 | 10, 8 & 10, 8 ^ 10, 8 | 12, etc.) Now repeat this with binary representations displayed. Now pick two numbers off the top of your head and predict the result of XORing them. Do this until you can always correctly predict the result.

XOR ROX!

XOR is the secret weapon of cryptography. There are two reasons why:

Reason 1: patterned XOR random looks random

Every single character is actually an 8-bit number (ord('A') == 65 == 0b01000001). If I XOR that 8-bit number with a random 8-bit number I will get something that LOOKS random.

Imagine I have the very patterned string of bits 1111000011110000. Now I want to generate a random 16-bit string to XOR with. Well if at each bit I have a 50/50 chance of a 0 then my output ALSO has a 50/50 chance of a 0.

Think about it, if 1 is my input then 1 ^ 1 is 0 and 1 ^ 0 is 1, so the result of the coin-flip is reversed. If my input is a 0 then 0 ^ 1 is 1 and 0 ^ 0 is 0 so the coin-flip would be preserved. Either way I've got the same random distribution as what I XOR with.

8-bit random: Let x = 240 (0b11110000). Now generate a random integer from 0 to 255 which you can save as y. Look at the binary of their XOR, bin(x ^ y) for several different values of y. Can you find a pattern?

Reason 2: (A XOR B) XOR A is B (it is self cancelling/decrypting).

XOR reversed is XOR: 240 ^ 112 == 128 . Look at the value of 128 ^ 112 and 128 ^ 240.

KATA: Convert from strings to an integer and back

To XOR a whole string with another string we have to convert plaintext into an integer then XOR the integer and reverse that.

Here is a Python3 Version:

Convert to an integer: Convert the message "Andy" into an integer (by first going to hex then hex to an integer like in the above snippet). XOR that number with the random number 320945240 and convert the result back into plaintext (the second half of that snippet).

Sub-Lesson: The 12 transformations of crypto text

In playing with real crypto and CTF problems you NEED to be able to quickly convert strings between any of the following formats:

ASCII
Byte Arrays
Integer
Hex Digest

That is 12 conversions you need to be able to do with the fewest number of mental cycles possible. Here they are in my perceived order of usefulness:

ASCII to Hex Digest
Hex Digest to ASCII
ASCII to Integer
Integer to ASCII
Hex Digest to Integer
Integer to Hex Digest
ASCII to Byte Arrays
Byte Arrays to ASCII
Byte Arrays to Hex Digest
Hex Digest to Byte Array
Byte Arrays to an Integer
Integer to Byte Array

These 12 conversions are the biggest reason I still use Python 2.

If you are a Python 3 user then if you are at Python 3.5 it gets easier.

I also use utilities in libraries to make these transformations faster.

Without looking: Try all 12 conversions from your natural instincts and google. Then try mine:

Python 3 Version: For extra credit give me 12 one-liners for the Python 3 versions of those 12 transformations.

Sub-Lesson: XORing Strings

There are many flavors of XORing strings. You need to:

UNDERSTAND what is happening on some level
Find your favorite method to quickly XOR strings

Here is my typical XORing method:

Write your own: Make your own XOR utility.

Overview After that last part on XOR it is easy to understand what One-Time pad is. Generate a truly random string of bits which is as long as your message and XOR the message and the random bits. Now your partner can XOR the ciphertext with the same random bits and get back your message. So we go three steps farther in this part. 1) We give you a formal description of the algorithm so you can get used to seeing math-notation style encryption scheme descriptions, 2) we give a loose sense of what a "perfect" encryption looks like, 3) we try to break the one-time pad when it is used in a repetitive scenario.

The one-time pad

Encryption schemes are often described by three algorithms, GEN the key generation algorithm, ENC the encryption algorithm which consumes a key and a message and returns a ciphertext, and DEC a decryption algorithm which consumes a ciphertext and a key and returns the plaintext message.

Here is the one-time pad:

GEN: choose a random key uniformly from \(\{0,1\}^{\ell} \) (the set of binary strings of length \( \ell \))

ENC: given \(k \in \{0,1\}^{\ell}\) and \(m \in \{0,1\}^{\ell}\) then output is \(c := k \oplus m \)

DEC: given \(k \in \{0,1\}^{\ell} \) and \(c \in \{0,1\}^{\ell}\), the output message is \(m := k \oplus c\).

In words: take a random sequence of 0s and 1s as the secret key. XOR that key with your message.

The ciphertext has all of the properties of white noise. In fact in the cipher-text only attack model (without any variation) this is the perfect encryption scheme .

Kata: perform the OTP

Use the One-Time Pad: For the message "Andy ROX!" generate a 9 character random string (import os, os.urandom(9) is ideal). Use the ideas of the last part to convert both to integers and XOR them. Share the hexdigest of that result with someone in Piazza along with your random 9 characters. Confirm that you can get back to "Andy ROX!". GEN is the urandom part, ENC is the XOR part and DEC is decrypting your partner's message.

"Perfect" Secrecy

This scheme satisfies Claude Shannon's notion of perfect secrecy which I'll try to capture in an intuitive way (my intuitive version is only accurate for messages, keys, and ciphers all having the same size space):

Imagine all possible messages, all possible keys, and all possible ciphertexts. For every message and ciphertext PAIR there is ONE key which causes that message to encrypt to that ciphertext.

This is really saying that each key gives you a one-to-one and onto mapping from messages to ciphertexts, and changing the key shuffles the mapping without ever repeating a pair.

OK, it's perfect. So now let's break it!

Kata: Break OTP when in a multiple encryption attack model

Each of the attack models from Lesson 3 can be adjusted to allow multiple encryptions to be observed. In the base case, you intercept many encrypted messages. In the CPA case the attacker can adaptively pick their plaintexts. (Any scheme which is CPA-secure is also CPA-multiple secure btw.)

Deterministic encryption schemes are NEVER secure under a multiple-encryption threat model (since you can see when two messages are the same, that's a ton of info).

One-time pad, used many times. Please head to this plunker site I made . It has a panagram in which each letter has been encrypted by the same one-time pad key. Find the key and my message.

The MSS Experiment, OTP edition

See the Pen Adversarial Indistinguishability MSS Attack Game OTP by Andy Novocin (@AndyNovo) on CodePen.

Flawless Edition (or so I think)

See the Pen Adversarial Indistinguishability MSS Attack Game OTP Flawless by Andy Novocin (@AndyNovo) on CodePen.