Lecture 2 (Weds 9/23/97)

Scribes: Edwin Chan, Adrian Perrig, Lian Keng Lim, Andy Myers

Elliptic curve cryptography:

The problem with using the integers is that there is only one set of them. We'd like to have more choices.

Elliptic curves are cubic equations in 2 variables: y² = x³ + Ax + B mod n. Each curve gives a distinct set of points, and we can define operations on them analogous to the operations we use on the integers. Note that it's not immediately obvious that our new addition operation is associative.

Multiple sets of points allows choosing smaller keys, which enables faster computation.

Secret Sharing

Ref: Schneier 23.2

Problem: We want to store a secret in n places such that a quorum q of parties is necessary to recover it. E.g., missile silo launch scenario that requires 2 or more people to use their keys simultaneously to launch a missile.

Shamir's approach: Choose a qth-degree function

f(x) = x^q + a_q-1*x^q-1 + ... + a₀ mod p.

The unknowns are the a_i's, and a₀ is the secret value. Therefore, f(0) = secret = a₀. We require that p is prime, p >> secret, and a₁ through a_q-1 are random.

To distribute the secret, give f(n) to person n. Now each person has one equation in q unknowns. Only by getting together all q equations do we have enough information to solve. (Lagrangian interpolation). The q equations are linearly independent (basic result from number theory) so we know that getting the set of q equations does yield enough information to solve.

An example of this technique's usefulness: computing the average salary of a group without anyone having to reveal his/her salary to the group.

Bad Approach 1: Alice chooses a random constant and adds it to her salary. She passes this value to Bob, who adds his salary and passes the sum to Charlie, and so on down the line. Then Alice subtracts the random constant and the total salary is known. Unfortunately, this can be broken if just two people collaborate (e.g., A & C can figure out B's salary).

Bad Approach 2: Everyone adds a random constant and passes the sum along. Then the number circulates a second time, with everyone subtracting off their random value. Again, two collaborators is enough to break this.

Successful Approach: Modify Shamir's method. Each person has their own equation (a₀, b₀, etc. are the secret salaries).

f_A = x^q + a_q-1*x^q-1 + ... + a₀ mod p
f_B = x^q + b_q-1*x^q-1 + ... + b₀ mod p
and so on.

If we add these up we get:

f_sum = q*x^q + (a_q-1 + b_q-1 + ... + z_q-1)*x^q-1 + ... + (a₀ + b₀ + ... + z₀) mod p

and the last term is just what we want, the sum of all the salaries.

Now each person sends out values to all the others: A sends f_A(1), f_A(2), etc. B sends f_B(1), f_B(2), etc. Each person adds up all the values received, so that the nth person has f_sum(n). Once again, if and only if all the participants get together, they have enough equations to solve for the coefficients in f_sum.

Note: this also works for multiplication, and once we have + and * we can construct digital circuits for pretty much any operation.

Key Exchange

Suppose there are two parties that are interested in sharing a secret key. How can they agree on a key secretly, protected from eavesdroppers? The Diffie-Hellman method (Schneier 22.1), based on discrete logarithm, solves this problem by letting the parties exchange enough information to be able to compute the key, without actually exchanging the key. This works because computing discrete logs is hard -- like factoring. The method can be extended to more than 2 parties. Each one needs to receive n-1 exchanges to compute the secret key.

A & B choose a generator g mod p (a generator maps (g,g²,g³,...,g^p-1) to a permutation of (1,...,p-1) ). p must be prime, and (p-1)/2 must be prime as well.
A picks value a and computes g^a mod p.
B picks value b and computes g^b mod p.
They exchange the computed values and compute g^ab mod n, which is the key.

Remote Coin Flipping

Alice and Bob want to flip a coin fairly, communicating over the phone. How can they do this? (Compare mental chess to mental poker -- which one would YOU bet money on?)

The following method uses this property: All odd numbers are either of the form 4j+1 or 4j-1. If you choose two numbers of the same type and multiply them, you always get a product of the form 4j+1. So from looking at the product, you can't tell which type the factors were.

Alice randomly picks one of the two types, and multiplies together two randomly chosen large primes of that type. She sends the product to Bob.
Bob guesses at the type of the factors.
Alice reveals the prime factors to verify whether Bob won (guessed correctly) or lost.

This method relies on the hardness of factoring, and the fact that there are an infinite number of both types of primes.

Major concerns about Protocols

Exportability
Key size
Dumbing down -- SHTTP involves a dialog between the parties to determine the level of encryption. This is vulnerable to insertion of messages that lower the level.
Replay attacks -- recording and resending copies of previously successful messages.
Patent status (who owns the protocol)
Open standards
Disclosure of critical keys -- there may be a single key that, if compromised, endangers security over a large range. Example: an encrypted letter containing information about all spies in a region. This encryption must be strong enough to protect the spies' identities for years to come.

Authentication

Authentication == proving your identity to another entity (person or computer). We want 2-way authentication, so that each party is sure of the other.

Standard, simple method: passwords or PINS. These are easy to forget, so when we use many passworded systems we typically build a small set of heavily re-used passwords. They are also easy to capture in transit (e.g. packet-sniffers, or the fake ATMs that take in a card, ask for the PIN, and then reject the card). Exhaustive search, cryptanalysis, and word-guessing also can yield passwords.

Biometrics (e.g. retinal scan, fingerprint) offers new means of authentication, but these methods aren't generally practical or comfortable for the user. It's still possible to inject data into the system further down the wire. Also, this makes people's body parts valuable.

We need a secure way to authenticate identity.

Challenge-Response Authentication

General idea: one party sends a message and the other responds to the message in a pre-agreed way that demonstrates identity. There is usually a time constraint as well; if the response arrives too slowly, the authentication is not successful.

Simple public-key version:

Bob sends a random challenge message r.
Alice signs the message, sending Sign(r) back to Bob.
Bob accepts the message by checking Alice's signature.

What's wrong with this? Bob now has a mechanism to get Alice to sign ANYTHING. The supposedly random challenge could be a document giving him all her money! Also, Alice is giving Bob information that she doesn't really want to: if Bob wants to figure out Alice's private key, he's got a way to conduct a chosen-plaintext attack.

Zero-Knowledge Authentication

We can avoid the problems above, using the same challenge-response model with a zero-knowledge method of demonstrating identity. The following conditions must be met (ref: Schneier 5.1, 5.2, 21.1):

Alice & Bob must be able to execute the protocol.
A bogus Alice will be caught with high probability. (It's not certain that a bogus-A is caught. However, in any protocol there's a chance of bogus-A guessing the correct response, and in this protocol, we can raise the probability arbitrarily high by repeating the process.)
Alice leaks no information not already computable by Bob.

Example: Alice wants to prove that she has an algorithm for finding Hamiltonian paths, but she doesn't want to give away the algorithm. So, for a particular graph, such as

A *_          _* B
  | \_      _/
  |   \_  _/
  |     \/
  |    _/\_
  |  _/    \_
  | /        \
C *------------* D

here's what Alice and Bob do:

Alice picks a random permutation of vertices, say (A=2,B=4,C=1,D=3).
She figures out a path, say B-C-A-D.
She sets up "envelopes" labelled with all the pairs of vertices: 1-2, 1-3, 1-4, etc. In each envelope, she puts "Yes" if there is an edge in the graph connecting the nodes with those numbers. For example, the envelope marked 1-2 gets "Yes" because there is an edge between C and A.
Bob flips a coin.
- If it turns up heads, Alice opens the envelopes and shows the permutation.
- If it turns up tails, Alice gives a path in terms of the envelope numbers (in this case, 4-1-2-3) and opens the relevant envelopes to show that they all contain "Yes".

This method is not very practical -- think how many envelopes it would take to get 2^-40 certainty! But there's a better method based on Rabin.

Rabin Zero-Knowledge Authentication

This is a simplified version of Fiat-Feige-Shamir method.

pq is publicly known. p & q are not.
Alice publishes a² mod pq. (a is Alice's secret value)
Alice picks a random value r and computes r² mod pq.
Alice sends Bob a message stating her identity and giving the computed value.
Bob flips a coin:
- If it turns up heads, Bob asks Alice to send the square root of r² mod pq.
- If it turns up tails, Bob asks Alice to send the square root of (ar)² mod pq.
If Alice answers correctly, Bob knows her identity with probability at least 1/2.

This doesn't leak information:

HEADS: Bob has a², r², and r. a² is public knowledge, and r² & r are random.
TAILS: Bob has a², r², and ar. a² is public knowledge, r² is random, and (ar)² / a² = r² -- that is, no new info is obtained.

Bogus Alice will be caught 50% of the time: if bogus-A knows both (ar mod pq) and (r mod pq), she can perform modular division to get a.

As before, this is based on the belief that it's difficult to factor.

Zero-knowledge authentication is good as a key to unlock something once. It's not useful for setting up a channel for ongoing communication, as no session key has been exchanged. Also, it's open to attack if an intruder intercepts messages from Alice and replays them to Bob, and vice-versa. Then the intruder has authenticated to Alice as "Bob", and to Bob as "Alice".

Private Key Cryptography

Private-key cryptography, in which the communicating parties share a single secret key, is an old, old method. Julius Caesar was a famous early user of private-key systems (discussed in Gallic Wars). These break down into various types by the way they act on data.

Character-by-character ciphers may be simple rotations of the alphabet, or substitution ciphers (mapping the alphabet to a permutation of the alphabet). The same letter always maps to the same encrypted letter.
Block ciphers, e.g. DES, act on chunks of data at once.
Stream ciphers, e.g. Enigma, where the cipher changes every time a character is encrypted.

The only perfect cipher is a one-time pad. This is a randomly-generated key which is used only once and then discarded. The key may need to be very large (possibly the same size as the data, for example when the encryption & decryption consist of XOR'ing with the key).

Enigma

Enigma was a German rotor cipher. It's the basis of most published private key cryptosystems. It was cracked by an early version of differential cryptanalysis.

Book recommendations:

Codebreakers by Hinseley and Stripp (1993)
Alan Turing: The Enigma by Hodges (1983)

Enigma uses 3 rotors, which turn odometer-like after each character is encrypted, to produce a constantly-changing encryption key. Input character signals pass through all three rotors (transformed to another letter by each rotor), are reflected, and then pass back through the rotors. Decryption works the same way.

Data Encryption Standard (DES)

DES history:

1972 -- 1st interest in non-secret crypto
1973 -- 1st call for public crypto system proposals
1974 -- IBM submits Lucifer (128 bits) in 2nd call
1975 -- IBM/NSA revised & published DES (56 bit)
1976 -- Algorithm widely criticized
1977 -- DES was standard for 5 years; renewed for another 5 years.
1987 -- NSA doesn't recertify DES, and proposes CCEP (Skipjack precursor).
Industry complains. NSA recertifies but says this is the last time.
1992 -- NSA really doesn't want to recertify.
1993 -- No better proposals; NSA recertifies DES.
1998 -- Will DES be recertified???

DES criticisms and realities

56 bit key size is less secure than Lucifer's 128-bit key.: Bogus! NSA improved the algorithm and removed extra redundant bits.
By fiddling with the S-boxes, NSA made them less secure.: Bogus! They are more secure than random S-boxes.
16 rounds is not sufficient.: Bogus! It has been shown that more than 16 won't help.
Exhaustive attack will be feasible by 1985.: Got the date wrong, but otherwise correct. Recently a key was cracked using a collection of workstations. Special purpose machines should be practical soon -- NSA probably has some already.

Note that the FBI is still using 40-bit keys...

How to improve DES?

Encrypt with 2 keys (J,K), so that ciphertext c = E_J(E_K(m)).

Unfortunately, the cost of cracking this is 2⁵⁷, not 2¹¹² as might be expected. Here's why (The "Meet In The Middle" attack):

Given c & m for c = E_J(E_K(m)),

compute table of E_K(m) for all K. Cost: 2⁵⁶
compute table of D_J(c) for all J. Cost: 2⁵⁶
look for common values.

Consider this in terms of the Birthday Paradox. There are 2⁶⁴ possible J's & K's. DES uses 2⁵⁶ of them. Thus 1/2⁸ of the possible J's are used, and the same for K's. So 1/2¹⁶ of the values are used as both J & K. This is 2⁴⁸ values. That's quite a lot to check, but it's still O(2⁵⁶).

Thus, the total cost is O(2⁵⁷).

This happens because DES is not a group, so E_J(E_K(m)) != E_JK(m). RSA, by comparison, *is* a group. (n^e)^e' = n^e''.

Triple DES

If you encrypt 3 times, you get 2¹¹² security.

Now encryption is c = E_L(D_J(E_K(m))). The second encryption is written as D_J (a decryption) for historical reasons.

You can generate (J,K,L) from two DES keys (X,Y), thus reducing the size of the overall key:

K = E_X(D_Y(E_X(T₁)))
J = E_X(D_Y(E_X(T₂)))
L = E_X(D_Y(E_X(T₃)))

where T₁, T₂, T₃ are constants.

Next time: differential cryptanalysis. Read 12.4, 3.3, 24.5.

Last modified: Tue Nov 11 16:15:48 EST 1997