Skip to article frontmatterSkip to article content
from __init__ import install_dependencies

await install_dependencies()
from math import gcd
from random import randint

Motivation

For symmetric key cipher to work, a secret key needs to be exchanged between a sender and a receiver. Isn’t this is a chicken-and-egg problem? How to securely communicate a secret key in public to begin with?

This motivates the idea of using asymmetric keys for encryption and decryption:[1]

  • A sender uses the public key kk to encrypt xx to a ciphertext yy using an encryption function ff:

    y=fk(x).y = f_k(x).
  • A receiver uses a private key ll to decrypt yy back to xx using a decryption function gg:

    x=gl(y).x = g_{l}(y).
  • To ensure secrecy, anyone with only the knowledge of kk, ff, gg, and yy but NOT the knowledge of ll should not be able to invert fkf_k.

Formulation

Is it possible to construct an asymmetric key cipher?

Hopefully, but there is an obvious security issue:

YOUR ANSWER HERE

In essence, fkf_k must be invertible for the ciphertext to be decryptable. To overcome this limitation, we can adopt a different approach. Instead of requiring fkf_k to be invertible, we can make it computationally infeasible to invert. For instance:

We seek a collection of fkf_k’s that are easy to compute in one way but difficult to invert directly except using glg_l. Such a function is called a trapdoor function, which is a special kind of one-way function:

How to construct the trapdoor function fkf_k and the inverse glg_l?

No one knows:

Nevertheless, there are plausible implementations such as the RSA algorithm.

RSA Algorithm

In the sequel, we will dive into the RSA algorithm, which was a public key cipher invented by Rivest, Shamir, and Adleman at MIT. Its security is based on the computational difficulty of factoring numbers with prime factors.

Encryption and Decryption

The encryption and decryption use modular exponentiation instead of addition:

ee, dd, and nn need to be chosen carefully to ensure security. For instance, it is insecure to choose e=1e=1 because, otherwise, x=fk(x)x=f_k(x), i.e., the plaintext is unencrypted, or equivalently, the private key is readily computed with d=1d=1.

YOUR ANSWER HERE

Computing the exponentiation for encryption and decryption can be fast using repeated squaring. The built-in function pow already has an efficient implementation:

x, e, n = 3, 2 ** 1000000, 4
pow(x, e, n)

Is pow(x, e, n) the same as the following code?

x ** e % n

Try it in a new console.

Source
def mod_pow(x, e, n):
    """
    Return mod exponentiation (x ** e) % n efficiently.
    """
    # YOUR CODE HERE
    raise NotImplementedError
# tests
x, e, n = 3, 2 ** 100000, 4
assert mod_pow(x, e, n) == pow(x, e, n)

Is that all to RSA algorithm? Not really... The question is how to choose the integers ee, dd, and nn properly so that the cipher is both decryptable and secure.

Decryptability

To ensure decryptability, we require for all xXx\in \mathcal{X} that

xedxmodn.\begin{align} x^{ed} &\equiv x \mod n. \end{align}
Solution to Exercise 4

RSA makes use of the following result to choose (e,d,n)(e, d, n):

There are elegant and elementary combinatorial proofs of the theorem. A simply algebraic proof is as follows.

Consider the non-trivial case where xx is not divisible by pp. It follows that, for i,j{1,,p1}i, j \in \Set{1, \dots, p-1},

xixjmodp    ijmodp,x\cdot i \equiv x\cdot j \mod p \iff i \equiv j \mod p,

which, in turn, implies

i=1p1(xi)i=1p1iα:=modpxp1ααmodp,\begin{align} \prod_{i=1}^{p-1} (x\cdot i) &\equiv \overbrace{\prod_{i=1}^{p-1} i}^{\alpha:=} \mod p\\ x^{p-1} \alpha &\equiv \alpha \mod p, \end{align}

which implies (10) as desired by canceling the non-zero value α on both sides. (Q.E.D.)

Comparing (10) in Fermat’s little theorem to the descryptability condition (8), it appears we can choose

n=p=ed.n = p = ed.

Not really... because, otherwise, either ee or dd is 1, i.e., the private key can be easily computed from the public key. (How?)

Consider the following corollary, which follows from (10):

Solution to Exercise 5

We can similarly rewrite (8) as:

xed11modpx^{ed-1} \equiv 1 \mod p

for x{1,,p1}x\in \Set{1,\dots, p-1}.

Comparing with (14) with (16), can we have choose

n=p and ed1modp1?n = p \kern1em \text{ and } \kern1em ed \equiv 1 \bmod p-1?

No because dd is the modular multiplicative inverse of ee, which is easy to compute, e.g., using pow with an exponent of -1. For instance:

p = 7
e, n = 5, p
try:
    d = pow(e, -1, n - 1)    # easy to compute
    x = randint(1, n-1)      # plaintext
    y = pow(x, e, n)         # encryption
    assert x == pow(y, d, n) # decryption
except ValueError as e:
    print(e)

Note also that some choice of ee does not have a multiplicative inverse. E.g, try setting ee to be 0, 2, 3, 4.

By the Euclidean algorithm,

ed=mλ+gcd(e,λ) for some integers m and d,ed = m\lambda + \operatorname{gcd}(e, \lambda) \kern1em \text{ for some integers $m$ and $d$,}

which is called Bézout’s identity. Equivalently,

edgcd(e,λ)modλ for some integers d,ed \equiv \operatorname{gcd}(e, \lambda) \mod \lambda \kern1em \text{ for some integers $d$,}

which implies (18) is equivalent to (19) as desired. (Q.E.D.)

def mod_inverse(e, n):
    """
    Return the modular multiplicative inverse of e mod n, or
    None if no inverse exists.
    """
    # YOUR CODE HERE
    raise NotImplementedError
# tests
p = 7
assert [mod_inverse(e, p - 1) for e in range(p - 1)] == [None, 1, None, None, None, 5]

How to make it difficult to compute dd?

RSA makes use of the hardness of factoring a product of large primes to create the desired trapdoor function.

Proof (Theorem 1)

With m(p1)m(p-1) in (14) being the least common multiple lcm(p1,q1)\operatorname{lcm}(p-1,q-1) for another prime number qq, we have

xlcm(p1,q1)1modpandxlcm(p1,q1)1modqby symmetry.\begin{align} x^{\operatorname{lcm}(p-1, q-1)} &\equiv 1 \mod p && \text{and}\\ x^{\operatorname{lcm}(p-1, q-1)} &\equiv 1 \mod q && \text{by symmetry.} \end{align}

This implies xlcm(p1,q1)1x^{\operatorname{lcm}(p-1, q-1)} - 1 is divisible by both pp and qq, and so

xlcm(p1,q1)1modpq.x^{\operatorname{lcm}(p-1, q-1)} \equiv 1 \mod p q.

Raising both sides to the power of any positive integer mm gives (22).

The choice of ee guarantees a multiplicative inverse dd exists by Proposition 2. Note that 2 is excluded from K\mathcal{K} because λ(n)\lambda(n) must be an even number. (Why?)

Solution to Exercise 7

How to generate the public key and private key?

from math import gcd
from random import randint


def get_rsa_keys(p, q):
    n = p * q
    lcm = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
    while True:
        e = randint(2, lcm - 1)
        # YOUR CODE HERE
        raise NotImplementedError
    d = pow(e, -1, lcm)
    return e, d, n, lcm

As an example, if we choose two prime numbers p=17094589121p=17094589121 and q=1062873761q=1062873761:

e, d, n, lcm = get_rsa_keys(17094589121, 1062873761)
assert e * d % lcm == 1
print(
    f"""Public key : k = (e, n) = ({e}, {n})
Private key: l = (d, n) = ({d}, {n})"""
)

The integer 1302 can be encrypted using the public key as follows:

x = 1302
y = pow(x, e, n)
print(f'The plaintext {x} gets encrypted into {y}.')

With the private key, the ciphertext can be decrypted easily as follows:

output = pow(y, d, n)
print(f'The ciphertext {y} gets decrypted into {output}.')

To complete the implementation of RSA, we need to generate large prime numbers. Computing large prime numbers or testing the primality of a large number is difficult. (See the largest known prime number.) Fortunately, there are approximate primality test that is fast, e.g., the Rabin-Miller test. An implementation is available from sympy:

import sympy as sp

x = 531137992816767098689588206552468627329593117727031923199444138200403559860852242739162502265229285668889329486246501015346579337652707239409519978766587351943831270835393219031728127
sp.isprime(x)

As the docstring of isprime mentioned, there is a small chance of false positive, but this is acceptable as there is no known cases.

Test if n is a prime number (True) or not (False). For n < 2^64 the
answer is definitive; larger n values have a small probability of actually
being pseudoprimes.

Negative numbers (e.g. -2) are not considered prime.

The first step is looking for trivial factors, which if found enables
a quick return.  Next, if the sieve is large enough, use bisection search
on the sieve.  For small numbers, a set of deterministic Miller-Rabin
tests are performed with bases that are known to have no counterexamples
in their range.  Finally if the number is larger than 2^64, a strong
BPSW test is performed.  While this is a probable prime test and we
believe counterexamples exist, there are no known counterexamples.
Footnotes
  1. Computational complexity is an asymptotic notion with respect to the order of growth of mm.

References
  1. Diffie, W., & Hellman, M. E. (2022). New Directions in Cryptography. In Democratizing Cryptography (pp. 365–390). ACM. 10.1145/3549993.3550007
  2. Rivest, R. L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2), 120–126. 10.1145/359340.359342