# Class 17: RSA day

Overview: RSA is the most famous modern encryption scheme. It has a great brand. I want to take a moment and demonstrate that key lengths must be dealt with more carefully in the public-key world.

We've just set up El Gamal and Diffie-Hellman where we can see that the secrecy entirely depends on the difficulty of a general solution to the discrete log problem.

Our next public-encryption scheme is also the most famous, RSA. It's secrecy depends on the difficulty of integer factorization. It is easy to multiply two large numbers (can be done in $$\mathbb{O}(\log{n})$$ time) but expensive to reconstruct those factors given just the product ( $$\mathbb{O}(n)$$ time (actually a bit better than this)) (here $$n$$ is the size of the numbers NOT the number of digits).

Now a linear-time algorithm sounds fast to me. But we're talking about giant numbers, loop through every number from 1 to $$10^{600}$$ and it'll eat up some CPU time. But something like AES is a far more complex process to start unraveling than just finding a factor.

I wanted to say this just to demonstrate that the security factor (block-length in AES and log of the modulus in RSA/DH) must be significantly larger when deploying public-key crypto vs symmetric crypto. Just remember this: 128-bits of security for AES is equivalent to 3,248-bits of security in RSA/DH.

Overview: Let's set up the math that we need to really pull off the RSA encryption scheme.

The world's most popular public-key encryption scheme is the RSA algorithm (named after the three authors which means that the NSA probably calls it something else in-house).

The idea is the following:

• Generate two large primes (not just any will do): $$p, q$$
• Compute $$n = p \cdot q$$
• Compute $$\Phi(n) = (p-1) \cdot (q-1)$$
• Select an $$e \in \{2, \cdots, \Phi(n)-1\}$$ such that GCD($$e,\Phi(n)$$)=1
• Compute private key $$d$$ such that $$e \cdot d \equiv 1 \mod{\Phi(n)}$$
• Publish Public key $$(n, e)$$

Now to encrypt use the public key $$(n,e)$$ and secret message $$m$$ the cipher text is: $$C := m^{e} \mod{n}$$.

The decryption of $$C$$ is just $$C^{d} \mod{n} \equiv m^{e \cdot d} \mod{n} \equiv m \mod{n}$$. This is because the size of the cyclic subgroup is $$\Phi(n)$$ and $$e \cdot d = 1 + \Phi(n) \cdot k$$ for some $$k$$.

Trench Generation: Generate an RSA key pair using the command  ssh-keygen -t rsa  (it will ask you questions, the first question is where to save it, pick something in your current folder like  rsa_key  ).

Read the keys: Now use the command  openssl rsa -text -noout -in rsa_key  to see the various parameters in the private key file. There are 8 numbers stored there: the modulus $$n$$, the public exponent $$e$$, the private exponent $$d$$, $$p$$, $$q$$, $$d \mod{p-1}$$, $$d \mod{q-1}$$, and $$q^{-1} \mod{p}$$. Those last three are there to make the exponentiation faster by computing mod p-1, q-1 and using chinese remainder theorem to reconstruct the plaintext.

Confirm: confirm that $$n = p \cdot q$$, $$e \cdot d \equiv 1 \mod{(p-1)\cdot (q-1)}$$, and that $$p$$ and $$q$$ are prime.

You can now see the hard problem that an attacker must solve: given $$n, e, C$$ find $$d$$ or $$m$$. The reason this is a factoring problem is that knowing $$\Phi(n)$$ is enough to take an inverse for $$e$$. Knowing $$Phi(n)$$ is equivalent to knowing $$p$$ or $$q$$ just given $$n$$.

#### HEY PYTHON READ THIS FOR ME:

Here is the fastest way to consume an RSA PEM key that I found. I was able to import rsa directly but probably some of you will have to pip install rsa or something like that:

Overview: Finally we'll both encrypt with RSA and break it (when the problem is small enough). This helps you really understand all of the mechanics and begin to appreciate some subtlety that we'll explore in the next module.

In public key encryption anyone that wants to receive a message must publish a public-key. Anyone that wants to send that person a message uses it to encrypt a message. RSA is a bit different than DH in this sense because the sender of the message doesn't need to create a public/private key-pair in order to participate.

### Mini-Bletchley Contest

Let's have a three stage contest. Form 4-person teams, team A, team B, team C, etc. . On each team we will have two message senders and two message crackers. SenderA1, SenderA2, CrackerA1, CrackerA2, SenderB1, etc.

For this competition we will have a three stage race.

• Stage 1: Each sender must generate an RSA key-pair with modulus-length 240 bits. The crackers should prepare some utility functions / plan to help them decrypt when the time comes.
• Stage 2: Each sender publishes their public key $$N, e$$. The CrackerA1 begins work factoring SenderB1's public modulus (use SAGE or Wolfram Alpha) and CrackerB1 tries to break the public key of SenderA1. Meanwhile the senders use their partner's public key to encrypt a message of no more than 26 characters and broadcast it. SenderA1 generates a ciphertext for SenderA2, and SenderA2 generates a ciphertext for SenderA1.
• Stage 3:The first team to successfully read all four messages wins.

Some details you'll want:

• Generating weak RSA keys:  openssl genrsa -out rsa240.pem 240
• Reading RSA keys: openssl rsa -text -noout -in rsa240.pem
• From a factorization to a private key: if you calculate $$N = p \cdot q$$ and know $$e$$ then you have to calculate $$d = e^{-1} \mod{(p-1)\cdot(q-1)}$$.
• From message to integer (SAGE):  Integer(msg.encode('hex'), 16)
• From integer to message (SAGE):  format(intmessage, 'x').decode('hex')  (if you get odd length, inspect hex, and pad with a 0)

### Some "Common" RSA Attacks

• Common Modulus (using the same $$N$$ but issuing several private keys) leads to an easy factoring method
• Moduli that share factors on accident. Turns out that 20% of RSA moduli share factors because of laziness/restrictions. If you GCD a bunch of moduli you might just find factors.
• Small values of $$d$$ lead to the Weiner Attack
• Related messages: If $$M_1 = f(M_2)$$ and you know $$C_1, C_2$$ then do GCD of $$g_1(x) = (f(x))^e - C1$$ and $$g_2(x) = x^2 - C2$$ and you have a chance of finding the factor $$x-M_2$$ straight up.
• Timing attacks if you can get them to decrypt many messages, you can deduce the timing of the repeated squaring involved in computing pow(ct, d, n)
• Cyclic Group Check: when setting up it's possible that $$e^k = 1$$ mod $$\phi(n)$$ for a small $$k$$, in which case multiple encryption of the message can just break it.
• Related Primes, if you know something about the relationship of the two prime factors to each other attacks exist (Fermat Factoring).
• COPPERSMITH ATTACKS!!! So there is a whole series of attacks related to small values of $$d$$, small values of $$e$$, knowing some bits of $$d$$, some bits of $$p$$, some bits of $$q$$, or some bits of $$M$$ (the message). These are important enough that I'm going to take a couple of lectures to teach you lattices, lattice reduction, and the flavors of Coppersmith.
• Blind Signature Forgeries (next set of notes)

#### CTF Flag

Check this one out: ctf.tar.gz