cstef
/post/rsa
TOP

The Maths Behind RSA

We use RSA every day to secure our online transactions, but how does it work? Let's dive into the mathematical principles behind this widely used encryption algorithm.

21/11/2024 · 8 min read · sapling 🪴

RSA (named after its inventors, Ron Rivest, Adi Shamir, and Leonard Adleman) is one of the most widely used encryption algorithms in the world. It’s used to secure everything from online banking transactions to secure communication between devices. But how does it work? What are the mathematical principles behind it? Let’s dive in.

Historical Context

Asymmetric encryption, also known as public-key cryptography, was a game-changer in the world of cryptography. Before its invention, symmetric encryption was the norm, where both parties shared a secret key to encrypt and decrypt messages.

As the name implies, with asymmetric encryption, we now have two keys: a public and a private one. The public one can be shared with anyone, and the private needs to be kept secret. Let’s take our good old Alice and Bob to illustrate this!

BobI love you <3Encryptionwith Alice'spublic keyspq349832uç*%&2(Z*48InternetAliceDecryptionwith Alice'sprivate key
BobI love you <3Encryptionwith Alice'spublic keyspq349832uç*%&2(Z*48InternetAliceDecryptionwith Alice'sprivate key

Now, let’s say, we have Charles spying on our two lovers:

BobI love you <3Encryptionwith Alice'spublic keyspq349832uç*%&2(Z*48InternetAliceDecryptionwith Alice'sprivate keyCharlesCan see encrypted messagebut can't decrypt it
BobI love you <3Encryptionwith Alice'spublic keyspq349832uç*%&2(Z*48InternetAliceDecryptionwith Alice'sprivate keyCharlesCan see encrypted messagebut can't decrypt it

Charles will never be able to eavesdrop on Bob and Alice unless he gets access to Alice’s private key. But how is that mathematically possible? Because humans are not clever enough at the moment.

Prime numbers

You are probably used to integers (whole numbers), but did you know some of them are special? The two most basic groups of integers are even (22,44,66,…) and odd (33,55,77,…) numbers. However, some of these numbers are even more special: they cannot be divided into a smaller whole number other than themselves! As small as this detail might seem, this is on what (pretty much) the whole current cryptography system relies on!

Machines are very bad at finding the divisors of a number (humans are even worse). For example, let’s take 9191

91 = 7 * 1391 = 7 * 13

We cannot divide 77 and 1313 further down, that’s because these are in fact prime numbers. Now let’s say you have two huge prime numbers, like 1'066'340'417'491'710'595'814'572'1691'066'340'417'491'710'595'814'572'169 and 19'134'702'400'093'278'081'449'423'91719'134'702'400'093'278'081'449'423'917. Now multiply them together:

1066340417491710595814572169 * 19134702400093278081066340417491710595814572169 * 1913470240009327808

That’s quite a big integer! If someone was given the final number and was asked to find the unique solution (the two prime numbers you multiplied together), this would be an impossible task.

Well, in fact here I chose two Fibonacci prime numbers so it would be easier but anyways…

The same goes for a computer, but with even larger numbers.

Modulo

The modulo function also plays an important role in the RSA algorithm. Modulo can be seen as “wrapping around” when a number reaches a certain value (or the remainder of a division). The most common example of this in everyday life is how we use clocks: After 12PM, the clocks “resets” at 1PM and so on at 12AM.

This can be mathematically written as:

13 eq.triple 1 space (mod 12)13 eq.triple 1 space (mod 12)

We say that ”1313 is congruent to 11 modulo 1212”, because

13 = 1*12+113 = 1*12+1

Generally speaking,

a eq.triple b space (mod m) <==> a = k*m + b spacea eq.triple b space (mod m) <==> a = k*m + b space

The RSA Algorithm

The procedure to generate the private (nn, dd) and public (nn, ee) keys is the following:

  1. Choose two prime numbers (preferably big ones).
  2. Compute n = p*qn = p*q.
  3. Compute phi.alt(n) = (p-1) * (q-1)phi.alt(n) = (p-1) * (q-1).
    • This value comes from the fact that both pp and qq are prime, thus phi.alt(p) = (p-1)phi.alt(p) = (p-1), and phi.alt(p*q) = phi.alt(p) * phi.alt(q)phi.alt(p*q) = phi.alt(p) * phi.alt(q)
  4. Choose the encryption exponent ee so that it is prime with, and smaller than phi.alt(n)phi.alt(n)
  5. Compute the decryption exponent dd, the inverse of ee modulo phi.alt(n)phi.alt(n), d eq.triple e^(-1) (mod phi.alt(n))d eq.triple e^(-1) (mod phi.alt(n))
    • A quick and easy solution here is to use Euler’s extended algorithm, which we know will work because e < phi.alt(n)e < phi.alt(n) and is ee is prime to phi.alt(n)phi.alt(n), which means gcd(e, phi.alt(n)) = 1gcd(e, phi.alt(n)) = 1
    • We then get e*d + phi.alt(n)*y = 1 \ <==> d eq.triple e^(-1) (e*d + phi.alt(n)*y = 1 \ <==> d eq.triple e^(-1) (

The alorithm’s security relies on the following equivalence: (m^e)^d eq.triple (mod n)(m^e)^d eq.triple (mod n)

We can now encrypt our message CC (clear) to MM (encrypted) with the following formula:

C eq.triple M^e space (mod n)C eq.triple M^e space (mod n)

and decrypt MM to CC with:

M eq.triple C^d space (mod n)M eq.triple C^d space (mod n)

You may now ask: “But these are numbers, I want to encrypt a text message!”. Well, text is just an illusion (on your phone), as its characters are represented as numbers in the underlying system (see UTF-8). We can first transcode our text to numbers:

For simplicity, let’s take each letter’s position in the alphabet, starting from 11: “I luv u” becomes:

i = 9i = 9, l = 12l = 12, u = 21u = 21, v = 22v = 22, u = 21u = 21

Then we can encrypt each number separately and concatenate them to get our encrypted message.

Let’s encrypt this message by hand !

  1. We choose p = bold(5)p = bold(5) and q = bold(13)q = bold(13)
  2. n = p * q = bold(65)n = p * q = bold(65)
  3. phi.alt(n) = phi.alt(65) &= (p-1)*(q-1) \ &= 4*12 phi.alt(n) = phi.alt(65) &= (p-1)*(q-1) \ &= 4*12
  4. Let’s take e = bold(5)e = bold(5)
  5. We apply Euler’s Extended algorithm for phi.alt(n) = 48phi.alt(n) = 48 and e = 5e = 5 to find dd
    1. 48 eq.triple 3 space (mod 5)48 eq.triple 3 space (mod 5)
    2. 5 eq.triple 2 space (mod 3)5 eq.triple 2 space (mod 3)
    3. 3 eq.triple 1 space (mod 2)3 eq.triple 1 space (mod 2)
    4. 2 eq.triple 0 space (mod 1)2 eq.triple 0 space (mod 1)
    • gcd(48, 5)gcd(48, 5) is effectively 1, we can find 48*x + 5*y = 148*x + 5*y = 1 by simplifying the steps backwards:
    1. 1 = 3 - colred(2)*11 = 3 - colred(2)*1
    2.  colred(2) = 5 - 3*1 \ <==> 1 = 3 - colred((5-3*1) colred(2) = 5 - 3*1 \ <==> 1 = 3 - colred((5-3*1)
    3.  colgreen(3) = 48 - 9*5 \ <==> 1 = 2 * colgreen((4 colgreen(3) = 48 - 9*5 \ <==> 1 = 2 * colgreen((4
    • We have d=x=-19d=x=-19, but a modular inverse needs to be positive, so we add 4848:  -19+48=29 \ <==> underline(d = 29 eq.triple 5^(-1 -19+48=29 \ <==> underline(d = 29 eq.triple 5^(-1
    • Our keys are: (n=65, space d=29)(n=65, space d=29) and (n=65, space e=5)(n=65, space e=5)
  6. We can start encrypting!
    1. i = 9 ==> C eq.triple 9^5 space (mod 65)   \ <==> i = 9 ==> C eq.triple 9^5 space (mod 65)   \ <==>
    2. l = 12 ==> C eq.triple 12^5 space (mod 65) \ <==> l = 12 ==> C eq.triple 12^5 space (mod 65) \ <==>
    3. u = 21 ==> C eq.triple 15^5 space (mod 65) \ <==> u = 21 ==> C eq.triple 15^5 space (mod 65) \ <==>
    4. v = 22 ==> C eq.triple 22^5 space (mod 65) \ <==> v = 22 ==> C eq.triple 22^5 space (mod 65) \ <==>

Our encrypted message is [29, 12, 21, 42, 21][29, 12, 21, 42, 21]

Let’s now check if everything went well by decrypting it:

We did it!

Attack Vectors

The RSA algorithm is not magic, and it can be broken if not implemented correctly. For instance, you may have noticed that the letter “u” has been encoded twice in the sentence, and the output remains the same. Frequency analysis could be used here to guess letters from a longer text. To remedy this, we can use a technique called padding or combine RSA with symmetric encryption, like AES:

BobI love you <3AES Encryptionwith a randomsymmetric keyspq349832uç*%&2(Z*48Clear keyRSA Encryptionwith Alice'spublic keyspq349832uç*%&2(Z*48Encrypted keyAliceRSA Decryptionwith Alice'sprivate keyspq349832uç*%&2(Z*48Decrypted keyAES Decryptionwith thedecrypted keyI love you <3
BobI love you <3AES Encryptionwith a randomsymmetric keyspq349832uç*%&2(Z*48Clear keyRSA Encryptionwith Alice'spublic keyspq349832uç*%&2(Z*48Encrypted keyAliceRSA Decryptionwith Alice'sprivate keyspq349832uç*%&2(Z*48Decrypted keyAES Decryptionwith thedecrypted keyI love you <3

Other attack vectors are a bit more complex, like timing attacks or a few others. This is why it is important to use a well-tested library to implement RSA, like OpenSSL.

Data is also often encrypted in blocks, which adds another layer of “security”. This is called block cipher mode of operation.

Proof using Fermat’s little theorem

But who tells us that the RSA algorithm actually works? Let’s prove it!

Warning

This section involves a lot of math, so if you’re not a fan of equations, feel free to skip it.

Fermat’s little theorem states that if pp is a prime number and aa is an integer not divisible by pp, then:

a^(p-1) eq.triple 1 space (mod p)a^(p-1) eq.triple 1 space (mod p)

And we want to show that

(m^e)^d eq.triple m space (mod n)(m^e)^d eq.triple m space (mod n)

where n = p*qn = p*q with p,q in PPp,q in PP and e*d eq.triple 1 space (mod phi.alt(n))e*d eq.triple 1 space (mod phi.alt(n)) with e,d in NNe,d in NN.

This basically shows that encrypting and decrypting the same message gives us the original input by first applying the encryption exponent ee and then the decryption one dd.

Since e*d eq.triple 1 space (mod phi.alt(n)) <==> e*d = e*d eq.triple 1 space (mod phi.alt(n)) <==> e*d = , we can rewrite (m^e)^d(m^e)^d as following:

m^(e*d) = m^(k*n + 1) = m*m^(k*phi.alt(n)) = m * (m^(e*d) = m^(k*n + 1) = m*m^(k*phi.alt(n)) = m * (

phi.alt(n)phi.alt(n) can be rewritten as (p-1)*(q-1)(p-1)*(q-1):

m * (m^((p-1)*(q-1)))^k = m * (m^((p-1)))^((q-1)*km * (m^((p-1)*(q-1)))^k = m * (m^((p-1)))^((q-1)*k

If mm isn’t divisible by both pp and qq (coprime to n = p*qn = p*q), with Fermat’s little theorem, we know that m^(p-1) eq.triple 1 space (mod n)m^(p-1) eq.triple 1 space (mod n) or m^(q-1) eq.triple 1 space (mod n)m^(q-1) eq.triple 1 space (mod n)

We can then simply replace m^(p-1)m^(p-1):

m * (m^((p-1)))^((q-1)*k) eq.triple^? m space (modm * (m^((p-1)))^((q-1)*k) eq.triple^? m space (mod

Now, if mm is divisible by pp or qq, this means that:

m eq.triple 0 space (mod p) <==> m^(e*d) eq.triplem eq.triple 0 space (mod p) <==> m^(e*d) eq.triple

and likewise for qq:

m eq.triple 0 space (mod q) <==> m^(e*d) eq.triplem eq.triple 0 space (mod q) <==> m^(e*d) eq.triple

The Chinese Remainder Theorem tells us that:

m^(e*d) eq.triple 0 space (mod p*q) <==> m^(e*d) em^(e*d) eq.triple 0 space (mod p*q) <==> m^(e*d) e

And that’s about it 😄

Conclusion

RSA is a fascinating algorithm that relies on the mathematical properties of prime numbers and modular arithmetic. It’s a great example of how abstract mathematical concepts can be applied to solve real-world problems.

If you’re interested in learning more about RSA, I highly recommend diving into the details of the algorithm and trying to implement it yourself. It’s a great way to deepen your understanding of cryptography and computer science in general.