from __init__ import install_dependencies
await install_dependencies()
from math import gcd
from random import randint
Motivation¶
For symmetric key cipher to work, a secret key needs to be exchanged between a sender and a receiver. Isn’t this is a chicken-and-egg problem? How to securely communicate a secret key in public to begin with?
This motivates the idea of using asymmetric keys for encryption and decryption:[1]
A sender uses the public key to encrypt to a ciphertext using an encryption function :
A receiver uses a private key to decrypt back to using a decryption function :
To ensure secrecy, anyone with only the knowledge of , , , and but NOT the knowledge of should not be able to invert .
Formulation¶
Is it possible to construct an asymmetric key cipher?
Hopefully, but there is an obvious security issue:
YOUR ANSWER HERE
In essence, must be invertible for the ciphertext to be decryptable. To overcome this limitation, we can adopt a different approach. Instead of requiring to be invertible, we can make it computationally infeasible to invert. For instance:
We seek a collection of ’s that are easy to compute in one way but difficult to invert directly except using . Such a function is called a trapdoor function, which is a special kind of one-way function:
How to construct the trapdoor function and the inverse ?
No one knows:
Nevertheless, there are plausible implementations such as the RSA algorithm.
RSA Algorithm¶
In the sequel, we will dive into the RSA algorithm, which was a public key cipher invented by Rivest, Shamir, and Adleman at MIT. Its security is based on the computational difficulty of factoring numbers with prime factors.
Encryption and Decryption¶
The encryption and decryption use modular exponentiation instead of addition:
, , and need to be chosen carefully to ensure security. For instance, it is insecure to choose because, otherwise, , i.e., the plaintext is unencrypted, or equivalently, the private key is readily computed with .
YOUR ANSWER HERE
Computing the exponentiation for encryption and decryption can be fast using repeated squaring. The built-in function pow
already has an efficient implementation:
x, e, n = 3, 2 ** 1000000, 4
pow(x, e, n)
Is pow(x, e, n)
the same as the following code?
x ** e % n
Try it in a new console.
Source
def mod_pow(x, e, n):
"""
Return mod exponentiation (x ** e) % n efficiently.
"""
# YOUR CODE HERE
raise NotImplementedError
# tests
x, e, n = 3, 2 ** 100000, 4
assert mod_pow(x, e, n) == pow(x, e, n)
Is that all to RSA algorithm? Not really... The question is how to choose the integers , , and properly so that the cipher is both decryptable and secure.
Decryptability¶
To ensure decryptability, we require for all that
Solution to Exercise 4
RSA makes use of the following result to choose :
There are elegant and elementary combinatorial proofs of the theorem. A simply algebraic proof is as follows.
Proof (Proposition 1)
Comparing (10) in Fermat’s little theorem to the descryptability condition (8), it appears we can choose
Not really... because, otherwise, either or is 1, i.e., the private key can be easily computed from the public key. (How?)
Consider the following corollary, which follows from (10):
Solution to Exercise 5
No because is the modular multiplicative inverse of , which is easy to compute, e.g., using pow
with an exponent of -1
. For instance:
p = 7
e, n = 5, p
try:
d = pow(e, -1, n - 1) # easy to compute
x = randint(1, n-1) # plaintext
y = pow(x, e, n) # encryption
assert x == pow(y, d, n) # decryption
except ValueError as e:
print(e)
Note also that some choice of does not have a multiplicative inverse. E.g, try setting to be 0, 2, 3, 4.
Proof (Proposition 2)
By the Euclidean algorithm,
which is called Bézout’s identity. Equivalently,
which implies (18) is equivalent to (19) as desired. (Q.E.D.)
def mod_inverse(e, n):
"""
Return the modular multiplicative inverse of e mod n, or
None if no inverse exists.
"""
# YOUR CODE HERE
raise NotImplementedError
# tests
p = 7
assert [mod_inverse(e, p - 1) for e in range(p - 1)] == [None, 1, None, None, None, 5]
How to make it difficult to compute ?
RSA makes use of the hardness of factoring a product of large primes to create the desired trapdoor function.
The choice of guarantees a multiplicative inverse exists by Proposition 2. Note that 2 is excluded from because must be an even number. (Why?)
Solution to Exercise 7
How to generate the public key and private key?
from math import gcd
from random import randint
def get_rsa_keys(p, q):
n = p * q
lcm = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
while True:
e = randint(2, lcm - 1)
# YOUR CODE HERE
raise NotImplementedError
d = pow(e, -1, lcm)
return e, d, n, lcm
As an example, if we choose two prime numbers and :
e, d, n, lcm = get_rsa_keys(17094589121, 1062873761)
assert e * d % lcm == 1
print(
f"""Public key : k = (e, n) = ({e}, {n})
Private key: l = (d, n) = ({d}, {n})"""
)
The integer 1302 can be encrypted using the public key as follows:
x = 1302
y = pow(x, e, n)
print(f'The plaintext {x} gets encrypted into {y}.')
With the private key, the ciphertext can be decrypted easily as follows:
output = pow(y, d, n)
print(f'The ciphertext {y} gets decrypted into {output}.')
To complete the implementation of RSA, we need to generate large prime numbers. Computing large prime numbers or testing the primality of a large number is difficult. (See the largest known prime number.) Fortunately, there are approximate primality test that is fast, e.g., the Rabin-Miller test. An implementation is available from sympy
:
import sympy as sp
x = 531137992816767098689588206552468627329593117727031923199444138200403559860852242739162502265229285668889329486246501015346579337652707239409519978766587351943831270835393219031728127
sp.isprime(x)
As the docstring of isprime
mentioned, there is a small chance of false positive, but this is acceptable as there is no known cases.
Test if n is a prime number (True) or not (False). For n < 2^64 the
answer is definitive; larger n values have a small probability of actually
being pseudoprimes.
Negative numbers (e.g. -2) are not considered prime.
The first step is looking for trivial factors, which if found enables
a quick return. Next, if the sieve is large enough, use bisection search
on the sieve. For small numbers, a set of deterministic Miller-Rabin
tests are performed with bases that are known to have no counterexamples
in their range. Finally if the number is larger than 2^64, a strong
BPSW test is performed. While this is a probable prime test and we
believe counterexamples exist, there are no known counterexamples.
Computational complexity is an asymptotic notion with respect to the order of growth of .
- Diffie, W., & Hellman, M. E. (2022). New Directions in Cryptography. In Democratizing Cryptography (pp. 365–390). ACM. 10.1145/3549993.3550007
- Rivest, R. L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2), 120–126. 10.1145/359340.359342