Asymmetric Key Cipher

from __init__ import install_dependencies

await install_dependencies()

from math import gcd
from random import randint

Motivation¶

For symmetric key cipher to work, a secret key needs to be exchanged between a sender and a receiver. Isn’t this is a chicken-and-egg problem? How to securely communicate a secret key in public to begin with?

This motivates the idea of using asymmetric keys for encryption and decryption:^[1]

A sender uses the public key $k$ to encrypt $x$ to a ciphertext $y$ using an encryption function $f$ :
$y = f_k(x).$
(1)
A receiver uses a private key $l$ to decrypt $y$ back to $x$ using a decryption function $g$ :
$x = g_{l}(y).$
(2)
To ensure secrecy, anyone with only the knowledge of $k$ , $f$ , $g$ , and $y$ but NOT the knowledge of $l$ should not be able to invert $f_k$ .

Formulation¶

Is it possible to construct an asymmetric key cipher?

Hopefully, but there is an obvious security issue:

YOUR ANSWER HERE

In essence, $f_k$ must be invertible for the ciphertext to be decryptable. To overcome this limitation, we can adopt a different approach. Instead of requiring $f_k$ to be invertible, we can make it computationally infeasible to invert. For instance:

We seek a collection of $f_k$ ’s that are easy to compute in one way but difficult to invert directly except using $g_l$ . Such a function is called a trapdoor function, which is a special kind of one-way function:

How to construct the trapdoor function $f_k$ and the inverse $g_l$ ?

No one knows:

Nevertheless, there are plausible implementations such as the RSA algorithm.

RSA Algorithm¶

In the sequel, we will dive into the RSA algorithm, which was a public key cipher invented by Rivest, Shamir, and Adleman at MIT. Its security is based on the computational difficulty of factoring numbers with prime factors.

Encryption and Decryption¶

The encryption and decryption use modular exponentiation instead of addition:

$e$ , $d$ , and $n$ need to be chosen carefully to ensure security. For instance, it is insecure to choose $e=1$ because, otherwise, $x=f_k(x)$ , i.e., the plaintext is unencrypted, or equivalently, the private key is readily computed with $d=1$ .

YOUR ANSWER HERE

Computing the exponentiation for encryption and decryption can be fast using repeated squaring. The built-in function pow already has an efficient implementation:

x, e, n = 3, 2 ** 1000000, 4
pow(x, e, n)

Is pow(x, e, n) the same as the following code?

x ** e % n

Try it in a new console.

Source

def mod_pow(x, e, n):
    """
    Return mod exponentiation (x ** e) % n efficiently.
    """
    # YOUR CODE HERE
    raise NotImplementedError

# tests
x, e, n = 3, 2 ** 100000, 4
assert mod_pow(x, e, n) == pow(x, e, n)

Is that all to RSA algorithm? Not really... The question is how to choose the integers $e$ , $d$ , and $n$ properly so that the cipher is both decryptable and secure.

Decryptability¶

To ensure decryptability, we require for all $x\in \mathcal{X}$ that

\begin{align} x^{ed} &\equiv x \mod n. \end{align}

(8)

Solution to Exercise 4

RSA makes use of the following result to choose $(e, d, n)$ :

There are elegant and elementary combinatorial proofs of the theorem. A simply algebraic proof is as follows.

Proof (Proposition 1)

Consider the non-trivial case where $x$ is not divisible by $p$ . It follows that, for $i, j \in \Set{1, \dots, p-1}$ ,

x\cdot i \equiv x\cdot j \mod p \iff i \equiv j \mod p,

(11)

which, in turn, implies

\begin{align} \prod_{i=1}^{p-1} (x\cdot i) &\equiv \overbrace{\prod_{i=1}^{p-1} i}^{\alpha:=} \mod p\\ x^{p-1} \alpha &\equiv \alpha \mod p, \end{align}

(12)

which implies (10) as desired by canceling the non-zero value α on both sides. (Q.E.D.)

Comparing (10) in Fermat’s little theorem to the descryptability condition (8), it appears we can choose

n = p = ed.

(13)

Not really... because, otherwise, either $e$ or $d$ is 1, i.e., the private key can be easily computed from the public key. (How?)

Consider the following corollary, which follows from (10):

Solution to Exercise 5

We can similarly rewrite (8) as:

x^{ed-1} \equiv 1 \mod p

(16)

for $x\in \Set{1,\dots, p-1}$ .

Comparing with (14) with (16), can we have choose

n = p \kern1em \text{ and } \kern1em ed \equiv 1 \bmod p-1?

(17)

No because $d$ is the modular multiplicative inverse of $e$ , which is easy to compute, e.g., using pow with an exponent of -1. For instance:

p = 7
e, n = 5, p
try:
    d = pow(e, -1, n - 1)    # easy to compute
    x = randint(1, n-1)      # plaintext
    y = pow(x, e, n)         # encryption
    assert x == pow(y, d, n) # decryption
except ValueError as e:
    print(e)

Note also that some choice of $e$ does not have a multiplicative inverse. E.g, try setting $e$ to be 0, 2, 3, 4.

Proof (Proposition 2)

By the Euclidean algorithm,

ed = m\lambda + \operatorname{gcd}(e, \lambda) \kern1em \text{ for some integers $m$ and $d$,}

(20)

which is called Bézout’s identity. Equivalently,

ed \equiv \operatorname{gcd}(e, \lambda) \mod \lambda \kern1em \text{ for some integers $d$,}

(21)

which implies (18) is equivalent to (19) as desired. (Q.E.D.)

def mod_inverse(e, n):
    """
    Return the modular multiplicative inverse of e mod n, or
    None if no inverse exists.
    """
    # YOUR CODE HERE
    raise NotImplementedError

# tests
p = 7
assert [mod_inverse(e, p - 1) for e in range(p - 1)] == [None, 1, None, None, None, 5]

How to make it difficult to compute $d$ ?

RSA makes use of the hardness of factoring a product of large primes to create the desired trapdoor function.

Proof (Theorem 1)

With $m(p-1)$ in (14) being the least common multiple $\operatorname{lcm}(p-1,q-1)$ for another prime number $q$ , we have

\begin{align} x^{\operatorname{lcm}(p-1, q-1)} &\equiv 1 \mod p && \text{and}\\ x^{\operatorname{lcm}(p-1, q-1)} &\equiv 1 \mod q && \text{by symmetry.} \end{align}

(24)

This implies $x^{\operatorname{lcm}(p-1, q-1)} - 1$ is divisible by both $p$ and $q$ , and so

x^{\operatorname{lcm}(p-1, q-1)} \equiv 1 \mod p q.

(25)

Raising both sides to the power of any positive integer $m$ gives (22).

The choice of $e$ guarantees a multiplicative inverse $d$ exists by Proposition 2. Note that 2 is excluded from $\mathcal{K}$ because $\lambda(n)$ must be an even number. (Why?)

Solution to Exercise 7

How to generate the public key and private key?

from math import gcd
from random import randint


def get_rsa_keys(p, q):
    n = p * q
    lcm = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
    while True:
        e = randint(2, lcm - 1)
        # YOUR CODE HERE
        raise NotImplementedError
    d = pow(e, -1, lcm)
    return e, d, n, lcm

As an example, if we choose two prime numbers $p=17094589121$ and $q=1062873761$ :

e, d, n, lcm = get_rsa_keys(17094589121, 1062873761)
assert e * d % lcm == 1
print(
    f"""Public key : k = (e, n) = ({e}, {n})
Private key: l = (d, n) = ({d}, {n})"""
)

The integer 1302 can be encrypted using the public key as follows:

x = 1302
y = pow(x, e, n)
print(f'The plaintext {x} gets encrypted into {y}.')

With the private key, the ciphertext can be decrypted easily as follows:

output = pow(y, d, n)
print(f'The ciphertext {y} gets decrypted into {output}.')

To complete the implementation of RSA, we need to generate large prime numbers. Computing large prime numbers or testing the primality of a large number is difficult. (See the largest known prime number.) Fortunately, there are approximate primality test that is fast, e.g., the Rabin-Miller test. An implementation is available from sympy:

import sympy as sp

x = 531137992816767098689588206552468627329593117727031923199444138200403559860852242739162502265229285668889329486246501015346579337652707239409519978766587351943831270835393219031728127
sp.isprime(x)

As the docstring of isprime mentioned, there is a small chance of false positive, but this is acceptable as there is no known cases.

Test if n is a prime number (True) or not (False). For n < 2^64 the
answer is definitive; larger n values have a small probability of actually
being pseudoprimes.

Negative numbers (e.g. -2) are not considered prime.

The first step is looking for trivial factors, which if found enables
a quick return.  Next, if the sieve is large enough, use bisection search
on the sieve.  For small numbers, a set of deterministic Miller-Rabin
tests are performed with bases that are known to have no counterexamples
in their range.  Finally if the number is larger than 2^64, a strong
BPSW test is performed.  While this is a probable prime test and we
believe counterexamples exist, there are no known counterexamples.

Footnotes¶

See Diffie and Hellman (1976).
↩
Computational complexity is an asymptotic notion with respect to the order of growth of $m$ .
↩

References¶

Diffie, W., & Hellman, M. E. (2022). New Directions in Cryptography. In Democratizing Cryptography (pp. 365–390). ACM. 10.1145/3549993.3550007
Rivest, R. L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2), 120–126. 10.1145/359340.359342

Lab 8

Cybersecurity

Lecture 9

Associative Containers