cstef

The Maths Behind RSA

13/09/2024 · 9 min read · sapling 🪴

cryptography
maths

RSA (named after its inventors, Ron Rivest, Adi Shamir, and Leonard Adleman) is one of the most widely used encryption algorithms in the world. It’s used to secure everything from online banking transactions to secure communication between devices. But how does it work? What are the mathematical principles behind it? Let’s dive in.

Historical Context

Asymmetric encryption, also known as public-key cryptography, was a game-changer in the world of cryptography. Before its invention, symmetric encryption was the norm, where both parties shared a secret key to encrypt and decrypt messages.

As the name implies, with asymmetric encryption, we now have two keys: a public and a private one. The public one can be shared with anyone, and the private needs to be kept secret. Let’s take our good old Alice and Bob to illustrate this!

Bob I love you <3 Encryption with Alice's public key spq349832uç*%&2(Z*48 Internet Alice Decryption with Alice's private key
Bob I love you <3 Encryption with Alice's public key spq349832uç*%&2(Z*48 Internet Alice Decryption with Alice's private key

Now, let’s say, we have Charles spying on our two lovers:

Bob I love you <3 Encryption with Alice's public key spq349832uç*%&2(Z*48 Internet Alice Decryption with Alice's private key Charles Can see encrypted message but can't decrypt it
Bob I love you <3 Encryption with Alice's public key spq349832uç*%&2(Z*48 Internet Alice Decryption with Alice's private key Charles Can see encrypted message but can't decrypt it

Charles will never be able to eavesdrop on Bob and Alice unless he gets access to Alice’s private key. But how is that mathematically possible? Because humans are not clever enough at the moment.

Prime numbers

You are probably used to integers (whole numbers), but did you know some of them are special? The two most basic groups of integers are even (2,4,6,…) and odd (3,5,7,…) numbers. However, some of these numbers are even more special: they cannot be divided into a smaller whole number other than themselves! As small as this detail might seem, this is on what (pretty much) the whole current cryptography system relies on!

Machines are very bad at finding the divisors of a number (humans are even worse). For example, let’s take 91

91=713

We cannot divide 7 and 13 further down, that’s because these are in fact prime numbers. Now let’s say you have two huge prime numbers, like 1066340417491710595814572169 and 19134702400093278081449423917. Now multiply them together:

106634041749171059581457216919134702400093278081449423917=20404106545895102906154128522206995414761716518871165973

That’s quite a big integer! If someone was given the final number and was asked to find the unique solution (the two prime numbers you multiplied together), this would be an impossible task.

Well, in fact here I chose two Fibonacci prime numbers so it would be easier but anyways…

The same goes for a computer, but with even larger numbers.

Modulo

The modulo function also plays an important role in the RSA algorithm. Modulo can be seen as “wrapping around” when a number reaches a certain value (or the remainder of a division). The most common example of this in everyday life is how we use clocks: After 12PM, the clocks “resets” at 1PM and so on at 12AM.

This can be mathematically written as:

131 (mod12)

We say that ”13 is congruent to 1 modulo 12”, because

13=112+1

Generally speaking,

𝑎𝑏 (mod𝑚)𝑎=𝑘𝑚+𝑏 (𝑘)

The RSA Algorithm

The procedure to generate the private (𝑛, 𝑑) and public (𝑛, 𝑒) keys is the following:

  1. Choose two prime numbers (preferably big ones).
  2. Compute 𝑛=𝑝𝑞.
  3. Compute 𝜙(𝑛)=(𝑝1)(𝑞1).
    • This value comes from the fact that both 𝑝 and 𝑞 are prime, thus 𝜙(𝑝)=(𝑝1), and 𝜙(𝑝𝑞)=𝜙(𝑝)𝜙(𝑞)
  4. Choose the encryption exponent 𝑒 so that it is prime with, and smaller than 𝜙(𝑛)
  5. Compute the decryption exponent 𝑑, the inverse of 𝑒 modulo 𝜙(𝑛), 𝑑𝑒1(mod𝜙(𝑛))
    • A quick and easy solution here is to use Euler’s extended algorithm, which we know will work because 𝑒<𝜙(𝑛) and is 𝑒 is prime to 𝜙(𝑛), which means gcd(𝑒,𝜙(𝑛))=1
    • We then get 𝑒𝑑+𝜙(𝑛)𝑦=1𝑑𝑒1(mod𝜙(𝑛))

The alorithm’s security relies on the following equivalence: (𝑚𝑒)𝑑(mod𝑛)

We can now encrypt our message 𝐶 (clear) to 𝑀 (encrypted) with the following formula:

𝐶𝑀𝑒 (mod𝑛)

and decrypt 𝑀 to 𝐶 with:

𝑀𝐶𝑑 (mod𝑛)

You may now ask: “But these are numbers, I want to encrypt a text message!”. Well, text is just an illusion (on your phone), as its characters are represented as numbers in the underlying system (see UTF-8). We can first transcode our text to numbers:

For simplicity, let’s take each letter’s position in the alphabet, starting from 1: “I luv u” becomes:

𝑖=9, 𝑙=12, 𝑢=21, 𝑣=22, 𝑢=21

Then we can encrypt each number separately and concatenate them to get our encrypted message.

Let’s encrypt this message by hand !

  1. We choose 𝑝=𝟓 and 𝑞=𝟏𝟑
  2. 𝑛=𝑝𝑞=𝟔𝟓
  3. 𝜙(𝑛)=𝜙(65)=(𝑝1)(𝑞1)=412=𝟒𝟖
  4. Let’s take 𝑒=𝟓
  5. We apply Euler’s Extended algorithm for 𝜙(𝑛)=48 and 𝑒=5 to find 𝑑
    1. 483 (mod5)
    2. 52 (mod3)
    3. 31 (mod2)
    4. 20 (mod1)
    • gcd(48,5) is effectively 1, we can find 48𝑥+5𝑦=1 by simplifying the steps backwards:
    1. 1=321
    2. 2=5311=3(531)11=235
    3. 3=48951=2(4895)5482+5(19)=1
    • We have 𝑑=𝑥=19, but a modular inverse needs to be positive, so we add 48: 19+48=29𝑑=2951 (mod48)
    • Our keys are: (𝑛=65, 𝑑=29) and (𝑛=65, 𝑒=5)
  6. We can start encrypting!
    1. 𝑖=9𝐶95 (mod65)9529 (mod65)
    2. 𝑙=12𝐶125 (mod65)12512 (mod65)
    3. 𝑢=21𝐶155 (mod65)15521 (mod65)
    4. 𝑣=22𝐶225 (mod65)22542 (mod65)

Our encrypted message is [29,12,21,42,21]

Let’s now check if everything went well by decrypting it:

We did it!

Attack Vectors

The RSA algorithm is not magic, and it can be broken if not implemented correctly. For instance, you may have noticed that the letter “u” has been encoded twice in the sentence, and the output remains the same. Frequency analysis could be used here to guess letters from a longer text. To remedy this, we can use a technique called padding or combine RSA with symmetric encryption, like AES:

Bob I love you <3 AES Encryption with a random symmetric key spq349832uç*%&2(Z*48 Clear key RSA Encryption with Alice's public key spq349832uç*%&2(Z*48 Encrypted key Alice RSA Decryption with Alice's private key spq349832uç*%&2(Z*48 Decrypted key AES Decryption with the decrypted key I love you <3
Bob I love you <3 AES Encryption with a random symmetric key spq349832uç*%&2(Z*48 Clear key RSA Encryption with Alice's public key spq349832uç*%&2(Z*48 Encrypted key Alice RSA Decryption with Alice's private key spq349832uç*%&2(Z*48 Decrypted key AES Decryption with the decrypted key I love you <3

Other attack vectors are a bit more complex, like timing attacks or a few others. This is why it is important to use a well-tested library to implement RSA, like OpenSSL.

Data is also often encrypted in blocks, which adds another layer of “security”. This is called block cipher mode of operation.

Proof using Fermat’s little theorem

But who tells us that the RSA algorithm actually works? Let’s prove it!

Warning

This section involves a lot of math, so if you’re not a fan of equations, feel free to skip it.

Fermat’s little theorem states that if 𝑝 is a prime number and 𝑎 is an integer not divisible by 𝑝, then:

𝑎𝑝11 (mod𝑝)

And we want to show that

(𝑚𝑒)𝑑𝑚 (mod𝑛)

where 𝑛=𝑝𝑞 with 𝑝,𝑞 and 𝑒𝑑1 (mod𝜙(𝑛)) with 𝑒,𝑑.

This basically shows that encrypting and decrypting the same message gives us the original input by first applying the encryption exponent 𝑒 and then the decryption one 𝑑.

Since 𝑒𝑑1 (mod𝜙(𝑛))𝑒𝑑=𝑘𝜙(𝑛)+1, we can rewrite (𝑚𝑒)𝑑 as following:

𝑚𝑒𝑑=𝑚𝑘𝑛+1=𝑚𝑚𝑘𝜙(𝑛)=𝑚(𝑚𝜙(𝑛))𝑘

𝜙(𝑛) can be rewritten as (𝑝1)(𝑞1):

𝑚(𝑚(𝑝1)(𝑞1))𝑘=𝑚(𝑚(𝑝1))(𝑞1)𝑘

If 𝑚 isn’t divisible by both 𝑝 and 𝑞 (coprime to 𝑛=𝑝𝑞), with Fermat’s little theorem, we know that 𝑚𝑝11 (mod𝑛) or 𝑚𝑞11 (mod𝑛)

We can then simply replace 𝑚𝑝1:

𝑚(𝑚(𝑝1))(𝑞1)𝑘?𝑚 (mod𝑛)𝑚1(𝑞1)𝑘𝑚(mod𝑛)

Now, if 𝑚 is divisible by 𝑝 or 𝑞, this means that:

𝑚0 (mod𝑝)𝑚𝑒𝑑0 (mod𝑝)

and likewise for 𝑞:

𝑚0 (mod𝑞)𝑚𝑒𝑑0 (mod𝑞)

The Chinese Remainder Theorem tells us that:

𝑚𝑒𝑑0 (mod𝑝𝑞)𝑚𝑒𝑑0 (mod𝑛)

And that’s about it 😄

Conclusion

RSA is a fascinating algorithm that relies on the mathematical properties of prime numbers and modular arithmetic. It’s a great example of how abstract mathematical concepts can be applied to solve real-world problems.

If you’re interested in learning more about RSA, I highly recommend diving into the details of the algorithm and trying to implement it yourself. It’s a great way to deepen your understanding of cryptography and computer science in general.

VIS
/blog/the-maths-behind-rsa
ONLINE
TOP