Class 10: Basics of Block Ciphers

Overview: This lesson introduces block ciphers which encrypt in blocks. For instance 128-bits in, 128-bits out. Most of the encryption in the world is done by block ciphers.

Canonical Block Ciphers

Stream ciphers worked by generating pseudo-random bits and XORing them with your message. Block ciphers will take in a message of a fixed-length, your private key, and they produce a ciphertext of that length. The guts of block ciphers often do a lot of similar mechanics, and we'll look just a little bit at how they are built. BUT for this course I want to show you how to use the block ciphers of the real world.

Quick Example

The classic Block Ciphers that we will use are AES, and Triple DES. DES is fine but the world out-paced it so it's an example. I just want to show you how to get cranking with using PyCrypto the baked-in encryption library in Python.

My key was '87654321' and my message was 'andyrulz', the ciphertext was also 8 bytes long.

If you're following along in the command-line (you should be) you'll note that DES.key_size and DES.block_size are both 8. This out-dated scheme takes in 64-bit keys and 64-bits of message and encrypts them into 64-bit ciphertexts.

Another interesting thing to note is that I had to add DES.MODE_ECB into the code to make it work. Let me explain that in the next chunk of notes, but ignore it while you practice.

DES decrypt: Using the key "smnsdddy" I created the cipher text "7a689d7aa98bd217" (that is the hex digest of my message, it is only 8-bytes not 16). Find the message. 'testnote'

Separating Blocks from Modes

OK, so a few notes on using Block Ciphers. One, they only encrypt a fixed number of bytes at a time, 32 or 8 or 24 or whatever that is. So if your message is more than one block-length you'll have to decide how to deal with that.

Block Ciphers are what we might call a cryptographic primitive. A small, well-studied, tool that we put to work in a larger context. In this case the strategy for tackling larger and larger messages is called the Block Cipher MODE.

We'll look at many modes and their weaknesses and strength. In this case we used ECB mode which stands for Electronic Codebook and it's the crappiest mode. It just encrypts each block as if it were all by itself. So if you have a message of lenght 16 it just encrypts the first block and then the second block. We'll play with this to give you a much better feel in the next lesson.

You'll also note that we didn't provide a NONCE. That's another task of the MODE we pick. The Block Cipher itself has one job only, a straight encryption with a single key and a fixed-length message. It's the cryptographic version of software engineering. A single responsibility for each part of your scheme.

A Fairly Simple CTF Problem

Let's Chat It Out: Some of you might have already solved this but I want to have each table work out HOW to solve it before coding anything.

In this section we explore block ciphers as pseudo-random shuffles. That is, since they can encrypt and decrypt then every input has one and only one output. Since there is a finite supply of inputs and exactly that many outputs you can see them like shuffling a giant deck of cards. The section is somewhat philosophical but outlines some weaknesses we will see in a mode of operation from Lesson 4.

In mathematics a function is something (a relation) which maps each input to a unique output.

An invertible function is one that also has a unique input for every output.

So imagine a block cipher with block-length 512. It takes in a 512-bit binary string and maps it to another 512-bit binary string.

But we can also decrypt any 512-bit binary string, so this function is invertible. Also every 512-bit binary string could be used as input to either encryption or decryption.

That all amounts to the following insight: A block cipher is actually a permutation.

Put a different way it is a shuffling of all 512-bit binary strings. In this context the strength of the cipher is the extent to which its shuffling is indistinguishable from random shuffling.

The permutation is on the set of inputs, for instance this diagram could be seen as a particular block cipher with a particular key on length 3 strings:

Permutation diagram

We can't even talk about random with only one instance of a permutation. The randomness is really saying "pick a permutation uniformly from the set of all permutations on length L strings".

Count permutations: how many permutations are there on length 3 binary strings? On length N binary strings?

If a block cipher does its job well then specifying a key should be like grabbing a random permutation from the set of all possible permutations. If we can determine that a particular scheme has some pattern in it which we can use to distinguish the cipher from truly random permutations then it is weak.

Find an impossible permutation: imagine the one-time pad as a block-cipher on length 3 strings. Find one of the permutations on length 3 strings which cannot be achieved with one-time pad.

Lite-Preview of CTR and OFB

So to understand one of the weaknesses in OFB mode (which we'll explore in lesson 4) we need to explore the math behind permutations.

Let me show you a math notation for writing down a permutation: $$(1 8 2)(4 3 9)(7 5 6)$$

That notation would identify a permutation on the numbers 1 through 9 and in this case 1 gets sent to 8, 8 gets sent to 2 and 2 gets sent to 1.

Since block ciphers are really just permutations let us pretend like the message space is the non-zero single digit numbers.

CTR style

This is "COUNTER" mode for block ciphers, and it's a way of generating a "stream" of blocks. The idea is to take as input the NONCE and encrypt it. Then add one to the NONCE and encrypt that, and so on. What you get on the other side is a "stream" of random-esque blocks.

OFB style

This is "Output Feedback" mode for block ciphers. The idea is to Encrypt the NONCE for the first chunk, then encrypt that output for the next chunk, then that output and so on.

From Permutation to baby block cipher

So pretend that the above permutation is a particular key in a particular block cipher and the input space is the numbers 1-9 and the output space is the numbers 1-9.

CTR style: if the nonce is 6 then get the random stream out of our permutation done in CTR style.

OFB style: if the IV is 6 then get the random stream out of our permutation in OFB mode.

So the length of the permutation cycles has something to do with the strength of OFB as a CSPRNG. CTR is more resistant to this issue as a CSPRNG.

In this part we're going to actually use the king block cipher, AES. Further we'll test the PyCrypt implementation against NIST standards (another healthy skill).

Testing AES

AES stands for Advanced Encryption Standard and it is the only public encryption scheme that the NSA approves for confidential information. So we'll use it as our main block cipher from now on. Our first foray will be to make sure that the implementation in PyCrypto has the correct output on the test cases laid out by NIST.

Here is the FIPS document http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf

AES is the current defacto block cipher and it works on 16 bytes at a time. It has 3 possible key lengths 16-bytes, 24-bytes, or 32-bytes.

We know that a block cipher is effectively a deterministic permutation on binary strings, like a fixed-length reversible hash.

So given a proper length key, and a 16 byte input we should always get the same 16 byte output.

AES.new(secret_key) will generate the simplest cipher (technically ECB mode with IV = "0x00000000000000000000000000000000"). We can use this simple mode to test the individual runs of AES.

FIPS check: in the documentation of AES there is a set of example benchmarks. Use from Crypto.Cipher import AES to confirm that a single round of AES-128 with key binascii.unhexlify('000102030405060708090a0b0c0d0e0f') and plaintext binascii.unhexlify('00112233445566778899aabbccddeeff') yields hexdigest 69c4e0d86a7b0430d8cdb78070b4c55a .

AES-192, AES-256 check: confirm that the block cipher matches NIST specs for the longer keys in Appendix C2 and C3.

Note that there are typically 3 ways to work with bytes, plain ascii, hex digest, and base64 (we haven't played with this yet but we will). A good chunk of your bugs will come from transferring between hex and raw.