Our Cryptography Path

What we learned in the Cryptography course for
the Mathematical Lyceum

Michael Lodi, Marco Sbaraglia, Simone Martini

Fundamentals of Cryptography
with [ incorrect | correct ] choices

Secret communication: altering an unencrypted (plaintext) message so that, even if intercepted by someone other than the legitimate recipient, it is not understood.

A cryptosystem consists of:

plaintext messages;
encrypted messages;
[ encryption | decryption ] algorithm → a sequence of steps to execute to go from the encrypted message to the plaintext message;
[ encryption | decryption ] algorithm → a sequence of steps to execute to go from the plaintext message to the encrypted message;
key(s) → secret(s) known only to the sender and the receiver that, when inserted into the encryption and decryption algorithm, allow[s] the message to be encrypted and decrypted.

Two simple encryption mechanisms are transposition (swapping the letters’ position) and substitution (substituting each letter with another one).

A simple example of a [ substitution | transposition ] cipher is one that chooses a permutation (i.e., a rule that associates an initial position with a new, different, final position) and applies it to the message, thus “mixing” the letters. The recipient applies the permutation rule in reverse and so goes back to the plaintext message.

This system is easily attacked by searching for the anagrams of the encrypted message.

A famous example of [ substitution | transposition ] cipher is the Caesar cipher.

Plaintext messages: all possible sequences of the alphabet’s characters. abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-./:;<=>?@[\]^{|}~0123456789àèéìòù
Key: a number K between 1 and [ the length of the message | the length of the alphabet ] (98 in our scenario).
Encryption algorithm:

for each letter of the message, identify its position in the alphabet (a→0, b→1, etc.);
add the number K to each position;
the encrypted message is the one made of the letters in the positions calculated in point b.

Decryption algorithm:

for each letter of the encrypted message, identify its position in the alphabet;
subtract the number K from each position;
the plaintext message is the one made of the letters in the positions calculated in point b.

Two weaknesses of the Caesar cipher follow here.

With enough time and accuracy, or using a computer, it is easy to “try to decipher the encrypted message with [ all the keys | some conveniently chosen keys ]”, i.e., to perform a brute-force attack. Indeed, the number of keys (and therefore the number of possible encryptions for each message) corresponds just to the length of the [ alphabet | message ].
In each language, different letters occur with different frequencies. Since in the Caesar cipher, given a key K, a letter is always encoded with [ the same letter | a different letter ] (pos+K), letters in the ciphertext maintain the frequencies of their corresponding letters in the plaintext message. It is easy to make assumptions about what the key K is when the average frequencies in the language of the message are known.

The Caesar cipher's weaknesses are shared by all the classical substitution ciphers, even the much more complex ones. Therefore we should:

[ increase | decrease ] the number of possible encryptions;
make the encrypted message [ show | hide ] all the information in the plaintext message; among the information that the encrypted message must [ hide | show ] is the frequency of the plaintext message’s letters.

The [ One-time pad | Diffie-Hellman ] cryptosystem can be a way to do so.

Given a message M of length L, a key K is generated, consisting of a random sequence of [ L | K ] numbers. Then, the first letter of M is encrypted with the first number of K, the second letter of M with the second number of K, and so on.

In this cryptosystem, it is important to [ maintain | change ] the key every time, to avoid that an attacker, having the possibility to use the system to encrypt messages of his choice, could deduce the secret key, knowing the messages he had encrypted and the corresponding encrypted messages.

This system, which we have seen as the application of the Caesar cipher with a different key for each letter of the message, in reality is realized with the logical operation XOR between the bits of the plaintext message and the corresponding bits of the key. XOR is a very fast operation for computers since it is done at the hardware level through a simple electronic circuit.

OTP is not attackable by brute force for two reasons.

The number of possible encryptions is huge. If A is the length of the alphabet (e.g., 98) and L is the length of the message, the number of possible ciphers is [ A×L | AL ]. Even for messages of only a few dozen characters, we reach numbers [ unmanageable | manageable ] for any computer (even in the future).
Since each character of the plaintext message [ has | does not have ] an equal probability of being encrypted with any key, given an encrypted message of length L, each sequence of L letters has [ the same | different ] probability of being the corresponding plaintext message. Therefore, it is [ possible | impossible ] to determine which is the original one among all the plaintext messages of length L.

OTP [ is attackable | is not attackable ] with frequencies, because the randomness of the key “flattens” the frequencies of the letters: in sufficiently long texts, each letter will be sooner or later encoded with all possible keys. So in the encrypted, message the characters will have (basically) the same frequency, [ revealing information | not revealing any information ] on the corresponding letters of the plaintext message.

So OTP achieves so-called [ “perfect security” | “computational security” ].

OTP...

[ does not require | requires ] changing the key every time;
it is [ difficult | easy ] to generate random keys, even with computers;
it is expensive to use keys as long as the messages to be encrypted (it doubles the amount of data to be exchanged);
[ requires | does not require ] securely exchanging secret keys before being able to communicate.

Therefore, modern systems are not based on “perfect security,” but on so-called “computational security.” Although it is not possible to prove mathematically that a system is perfectly secure, we “settle” for systems for which security is guaranteed by the fact that - although [ not being | being ] theoretically attackable - attacking the system would require a power (greater than that of a supernova explosion) and a computational time (greater than the age of the universe) too big for such an attack to be realistic.

Nowadays, there are two kinds of these computationally secure systems.

“Secret-key” systems (also called “symmetric” because sender and receiver
[ share the same key | use different keys ] ).
“Public-key” systems (also called “asymmetric” because sender and receiver
[ share the same key | use different keys ] ).

“Secret-key” systems are the heirs to the substitution and transposition ciphers used throughout history, but they work on bits (the binary digits with which messages are represented in the computer) instead of letters.

These bits are swapped, shifted, replaced, and mixed with the key to generate confusion (hiding the relationship between the plaintext and the key) and diffusion (hiding the relationship between the plaintext and the ciphertext).

Such systems are very efficient because they use operations such as substitutions, exchanges, XOR between bits, which are very fast to perform on electronic circuits.

Today, the most widely used cryptosystem is AES, which uses 256-bits long keys. AES is an evolution of Lucifer/DES.

However, these systems suffer from the problem of [ key distribution | communication security ]. It is necessary, for each conversation between two parties, to securely exchange a secret key to share, and use it to secure the communication. In a globalized way, it would be unthinkable [ for these exchanges to take place “in person” | to guarantee the security of communications ].

The [ Diffie-Hellman key exchange protocol | AES cryptosystem ] solved this problem.

With this system, it is possible to generate a key (in our metaphor, a color) shared and [ public | secret ] between two people, communicating over [ an insecure public channel (accessible by all) | a private, secure channel (accessible only to the two parties) ].

Mathematical tools such as modular arithmetic, power and its inversion, prime and coprime numbers are the basis of how this system works.

The system's functioning is based on the fact that private information [ can | cannot ] be traced based only on publicly exchanged information.

This is a recent discovery (1976) and of enormous worldwide scope because it allows you to initiate private communications with anyone, anywhere in the world, [ exchanging | without having first to exchange ] a shared secret key.

The security of this key exchange is not perfect (as the security of OTP was).

Having a lot of time (more than the age of the universe), it would be possible to brute-force attack the D-H system to find the exponents (Alice and Bob's private colors).
More: so far, no one has found an efficient way (i.e., smarter than brute-force) to compute the exponents, but that doesn't mean that no one will be able to do it in the future (although it is considered unlikely) or that new systems (such as quantum computers) won't be able to solve this math problem in a reasonable amount of time.

This is an example of “computational safety:” safety is not absolute (i.e., mathematically provable) but depends on the [ ease | difficulty ] for computers of solving certain mathematical problems.

In addition, the Diffie-Hellmann key exchange suffers from a problem: an adversary can [ pretend to be one of the two parties and participate in the key exchange | get the secret key by listening to public communications ] and then join in subsequent secret communications without the other party ever noticing. This type of attack is called a “man in the middle” attack.

One solution is the so-called “public-key” or “asymmetric” cryptography. There is a very special pair of keys in this cryptography, called public key (since it is known to everyone) and private key (since the owner keeps it secret from everyone).

Properties of asymmetric keys:

if I encrypt with the private key, I can only decrypt with the [ private | public ] key;
if I encrypt with the public key, I can only decrypt with the [ private | public ] key.

The keys are mathematically bound, but having only the public key, it is (computationally) impossible to trace the private key.

This works based on so-called one-way functions: easy to compute but hard to reverse, unless you know a secret. For example, it is [ easy | difficult ] to multiply two very big prime numbers, but it is [ easy | difficult ] to decompose a very big number into its prime factors.

It is possible to overcome the Diffie-Hellman weakness (“man in the middle”) by using the properties of asymmetric keys. In other words, it is possible to guarantee [ the authenticity | the secrecy ] of a message.

The sender encrypts the message with his private key, the receiver decrypts it with the sender's public key, and then it is certain that it was that sender who encrypted and sent the message. In fact, it would be impossible to decrypt that message if it had been encrypted by another sender (with his secret key).

At the same time, thanks again to the properties of asymmetric keys, it is possible to exchange a secret message [ by simply exchanging a secret key first | without first exchanging a secret key ]. The sender encrypts the message with the recipient's public key. At that point, the message can only be decrypted with the recipient's private key, which is undoubtedly known only to him.

Public-key systems (“asymmetric”) require demanding calculations for the computers implementing them. Therefore, in practice, a mix of the presented systems is used:

a public-key system is used to be sure of the identity of the two parties exchanging messages;
Diffie-Hellman key exchange is used to generate a shared secret key securely;
the shared secret key is used to communicate via symmetric (i.e., secret-key) encryption systems, such as AES, which are the fastest and most efficient.

Fundamentals of Cryptography with [ incorrect | correct ] choices

Fundamentals of Cryptography
with [ incorrect | correct ] choices