Skip Class: Stream Ciphers

Overview: In this Module we are exploring private-key cryptography in a truly modern setting. We begin with stream ciphers and look at them as a combination of the One-Time Pad and CSPRNGs. The concept that makes the one-time pad perfectly secure is what we try to make more robust with stream ciphers and even block ciphers.

Recall the One Time Pad (OTP). It is elegant, beautiful, and perfect in a very small sense of the word perfect.

It takes in a message of \(n\) bits, a uniformly random secret key of \(n\) bits and generates a ciphertext by bit-wise XOR. Encryption and Decryption are the same operation.

Analysis of One-Time Pad

Strengths:

Structured bits XORed with uniform random bits come out with the same distribution as the uniform bits.
If they correctly figure out any part of the key they have uncovered nothing about the rest of the key.
The ciphertext looks like white-noise which is ideal.
It obeys Shannon's perfect secrecy requirements: every plaintext can be mapped to any and every ciphertext by one and only one key.

Weaknesses:

You have to arrange a key swap (true with all symmetric schemes).
You can only use securely once per key exchange (all of our perfection is lost if we use the same key twice).
The key is just as long as the message so if you have a way to send your key securely then perhaps you should have just sent the message.

Fixing OTP

We can fix the biggest OTP problems if we could "generate" keys on the fly with our partner. If both of us could whip up a completely uniform key anytime we need and guarantee that our partner whipped up the same key.

Quantum Coin

New Goal: I want to send a magically linked coin to my buddy. So that every time I flip that coin my friend can see the result. If that's the case then we can generate (and both have) an infinite sequence of random bits. The idea of the one-time pad can be used to generate an infinite pad and I only had to give you the magic coin.

We would call this a "stream" of bits, \(s_1, \ldots, s_i, \ldots \). Given a shared stream of bits the new ENC/DEC algorithms become simple:

Given a message m == [m_1, m_2, … ] (possibly infinite number of bits):

ENC(m, s) ( s is my stream of random yet shareable bits) would be c_i = m_i ^ s_i .

DEC(c,s) is the same as ENC(c,s) just m_i = c_i ^ s_i .

Stream Ciphers: What we just described is a stream cipher, which essentially encrypts one bit at a time by having a pseudo-random keystream. So our magic coin is made in practice by having a key (not infinite) which generates an infinite sequence of pseudo-random bits.

Tying it all together

So THAT's why unpredictable pseudo-random number generators are essential to crypto. If we start from the same seed we can generate a uniformly random stream of noise that acts like our one-time pad key.

We've seen that predictable random number generators can be "cracked" and now you should feel the need to find better Cryptographically Secure Pseudo-Random Number Generators.

Overview: In this section we build an insecure stream cipher as an experiment in the basic mechanics. It will get you used to XOR, seeds, and the general structure of a proper stream cipher. It will also set the stage for the improvements that we have to introduce to get true security.

Building a stream-cipher

So we are going to use the idea of the One-Time Pad mixed with the convenience of PRNGs to create an encryption scheme. So far we haven't explored any non-breakable PRNGs so we're just going to work with an insecure one, but the mechanics will be identical when we have a more secure source of pseudo-random bits (soon, I promise).

Since we want to really use this thing then we'll want something that works on real inputs. If it's going to encrypt ASCII characters then we need to think about random bytes more than random bits, but that's no big deal.

For picking our PRNG let's stick with the GLIBC rand() since we have code for it ready to go. It generates 31-bits at a time (the numbers are mod \(2^{31}\)) and we need 8-bits at a timeso we're just going to work mod 256.

Here is the code again.

Example: A c-rand generated number might look like: temp = 2158094741372. I could extract three bytes of randomness from that. So I say temp % 2**8, (temp >> 8) % 2**8, and (temp >> 16) % 2**8. That should be three numbers uniformly chosen between 0 and 255.

What are they? In this case what are the three numbers that pop out? 124, 111, 120

What's wrong? What range of values can I expect from (temp >> 24) % 2**8?

Test it: Can you test the uniformity of these numbers by generating many and using a histogram?

In code it might be nicer for us to get a string of hex characters so that we can easily concatenate long sequences. In that case you can do hex(temp)[-6:]. Just be careful with the order of the numbers.

The XOR part

Confirm: confirm that with seed 1983 you get [322084512, 1307613328, 1354801484, 546092182] as the first four output numbers.

OK, I'm going to encrypt the message "andy rules!!" using the secret seed 1983 and numbers mod 256.

Decrypt: I have encoded a CTF FLAG using seed 54321. The message is '5ecce596975db5e0c3c7516f05db6ffb43c4e055e6dd' (as hex of course). Find the original. 'satisfying right??'

Full Scheme: Write your own encryption/decryption which takes in the message/ciphertext and a seed and spits out the ciphertext/message using this style stream cipher.

Overview: One of the most important ideas to understand that separates a living-room cryptographer from a practitioner is the idea of a nonce/IV. It is the encryption version of a salt and saves us from the inherent risks of a user sending the same message many times. This idea would have been huge in World War II.

The need for noise

So far we've fixed one flaw of the One-Time Pad, the need for a key which is the same length as the message. But I claim that we still can't use the same key multiple times. There are two quick ways to fix that:

One: Don't restart the PRNG. This requires us to carefully coordinate with our partner so that we stay at exactly the same part of the "stream" on both the encryption and decryption side. This gets tricky if some messages aren't making it to our partner or if there are parallel message sending channels. We'll show you some tricks later that help prevent that. But in this lesson we're going to go with a different notion.

Two: Use some public randomness to effectively change our secret key. This idea is that we generate (from a true entropy source) a NONCE, a one-time set of random bytes. Then we let this NONCE mingle with our private key to change the output of our encryption scheme.

The real goal of this is that I could send the same message 100 times in a row, with the same private key, and each time it would look completely random and uncorrelated.

From now on we will insist on a NONCE (sometimes called an IV for initialization vector) so that the same message is never encrypted the same way twice.

Most machines we work on have a source of randomized bytes from entropy that we can pull from. They pool sources of entropy like temperatures, user actions, timings, etc.. Then they use CSPRNGs to turn that entropy into uniform random bytes. It's a very different topic than PRNGs because we never want to reproduce it, but computers are inherently deterministic and anti-random. So a poorly generated key or initial seed will kill even the most complex CSPRNG.

Making it happen

OK, let's come up with a way to inject noise to our previous scheme. Suppose we had the secret seed 54321 and we're using GLIBC's rand to make our stream cipher.

I'll show you a method that introduces the basic mechanics. We'll improve on this as we go.

Generating a NONCE

So when encrypting a message we are going to first generate some true entropy bytes that will be sent along with our ciphertext. That is the NONCE will be sent in the clear, so it should look random. Here is how to use your machine's entropy to generate 6 bytes of noise:

Now we just need a convention to translate this public noise and our secret (54321) into a new seed/key. Other schemes will have their own ways of letting the NONCE mess with the scheme, in our case I think that changing the seed is fine for learning the idea.

My method will be to concatenate the NONCE and our secret-key, then SHA256 those bytes, and take the lowest 32-bits as our new seed. Here it is in code:

Test it: If I send a 6-byte NONCE of "cc4304c09aee" and suppose our secret is "simone" then calculate the new seed using this method. 3238040436.

Putting it all together

Now we just need a system for sending the nonce in our ciphertext. I suggest having the first 6 bytes be NONCE bytes then the rest as the cipher text. I'm using 6 arbitrarily in fact 6 bytes is not particularly strong, but this scheme we've made uses GLIBC's rand so we're clearly not trying to protect nation-state secrets yet.

Decrypt This Message: Assuming a secret key of "salmon" and the first 6 bytes are NONCE bytes then decrypt this message using our DIY stream-cipher: '4c701b07d5839267e97ac4e1c4afd312ff658c882f7ca806c6c48efef14cb87cb247'

Aside: In this part we introduce our first full strength encryption scheme, Trivium. It is a stream cipher with a triple layered LFSR involved. We will play with the idea and could use this to make our first encryptions and decryptions that might actually be good enough for the real world.

From diagram to code

Stream ciphers are generally not as secure or well-understood as block ciphers (which we study next). In software you are unlikely to deal with stream ciphers. Their role in the eco-system is in hardware. With space-constrained devices that need encrypted data streams, we need fast hardware implementations that encrypt bit-by-bit. So, as you code these, imagine the hardware version.

We saw that breaking an LFSR takes 2L outputs. We also used a PRNG to make a stream-cipher. The next layer of complexity that I imagine is something like the Trivium "stream cipher" which is really just a CSPRNG that generates one bit at a time. It tries to make the LFSR idea more secure by having multiple registers which interfere with each other. The intuitive notion is that LFSRs yield to linear algebra so let's add just enough complexity to be non-linear.

Those * are bit wise AND which is the same as \(x \cdot y\) over \(\mathbb{Z}_2\).

Structure

When looking at this we want to see it as 3 separate registers which each produce their own output. The final output bit is the XOR of all three output bits.

The output of each register is also used to help form the input of another register.

Initialization: To kick-start trivium it accepts two inputs, an 80-bit key and an 80-bit IV (which is their NONCE). The 80-bit key is loaded into the left-most 80 bits of the first register. The 80-bit IV is loaded into the left-most 80 bits of the second register. Finally the final 3 bits of the third register are set to 1 (the right most-bits).

The stream is then run \(4 \cdot 288\) times with the output discarded. This is now the opening state.

Let's do the first (tossed out) run using an all 1 key and an all 1 IV, just to show the idea. Bits 1-80 are all 1, bits 94-173 are all 80 and bits 286-288 are all 1, everything else is 0. The first output bit will be an XOR of 3 different bits, so let's look at the first register. The output of register 1 is the XOR of bit 66 (1) and bit 93 (0) which is 1. Register 2's output is bit 162 XOR bit 177 which is also 1. Finally register 3 is the XOR of bit 288 (1) and bit 243 (0) so also a 1. Thus the first output is 1. Now everything has to slide so let's look at the feedback. The input to register 2 is the XOR of bit 171 (1) and the XOR of that first register's output bit (1) and the product of bit's 91 and 92 (0), so that is 1 + 1 + 0 = 0 is the new input bit to register 2. All of the other bits in register two slide one to the right. (Register two is now 0 in bit 94, 1 in bits 95-174, and 0 everywhere else. Likewise you can trace the new input bit of register 3 will be a 1. The new input bit for register 1 is a 0.

(ASIDE:) One of our books "Understanding Cryptography" has a slightly different diagram which is not accurate (they XOR the AND output of each register as part of the output). Here is the Trivium paper https://eprint.iacr.org/2009/431.pdf

Implement it: Consume an 80-bit IV and an 80-bit key, then run the warm-up cycles, finally output the next 160 bits. If you share your IV, key, and output someone else should be able to verify.