P11_260


I wrote an elliptic curve signature system that uses a Granger Moss Prime and has some nice properties:

  • The field would vectorize well in hardware or on an FPGA.
  • I have benchmarked it and it scales well on AVX2 and AVX512, with separate implementations for both.
  • Unlike Ed25519, the keys form a group because we use the entire keyspace.
  • The session key algorithm is a defense in depth that resists fault attacks and poor randomness

Why Write an Elliptic Curve Signature System

Yes, I know it is unlikely to be widely adopted. There were several reasons to write a signature system.

  • I understand a lot more how a signature system is built including the scalar manipulation.
  • I learned a lot about Intel’s AVX2 and AVX512 instruction sets while vectorizing the code, and a lot about performance testing by using the linux perf tool.
  • It is a good reminder that papers don’t exist just to be read, but to be implemented as well.
  • It helped me understand some of the weaknesses in a system like Ed25519

So How Does it Work

The signature system is based on scalar multiplication of points on an elliptic curve over a prime field. The prime is (t11−1)/(t−1) where t = 226 − 15. This prime is congruent to 3 mod 4 for fast square roots. Approximate division by t is straightforward: $q = z >> 2^26; r = t & (2^26 - 1) + 15 ⋅ q$.

The curve is an edwards curve, defined as: x2 + y2 = dx2y2, d =  − 49142 This is the smallest such d that yields a curve with cofactor 4. Thanks to the team at safecurves.cr.yp.to I was able to use sage and numerical sieves to construct proofs of safety for the curve.

Signed-All-Bits-Set

The software uses a signed-all-bits-set representation for speedups and to avoid timing attacks by early termination. In signed-all-bits-set representation, a number is represented in binary, but each bit is  ± 1. To convert a b bit number to this representation mod l, you do the following: First write the number e = sum(i=0 to b,2idi), di ∈ −1, 1. Then we can modify the equation. (e+2b−1)/2 = sum(i=0 to b,2i(di+1)/2). So given a binary number e, if we want its digits in SABS representation, we compute (e+2b−1)/2. Since the scalars that we are concerned about form a prime field, we can always divide by 2.

The Signature Algorithm

To Sign:

There is a well known point B. The signing key is a pair: A, a where A = a * B, and a is a scalar, the secret key. Choose a session key k. then compute the point R = kB. Use blake2b512 to produce a 512 bit value h = Hash(R,A,M), where A is the compressed public key. Compute s = k − ha. The signature is the pair R, s where R is compressed.

To Verify: Compute h = Hash(R,A,M). Check that sB + hA = R. This can be done without decompressing R. We can verify the check equation: sB + hA = (kha) ⋅ B + ha ⋅ B = k ⋅ B = R.

This particular system was chosen based on guidance from Dan Bernstein and Tanja Lange

Combs

The signature algorithm uses 4 5-bit combs with teeth separation of 13 bits. Combs are straightforward once you see what they are doing. The first table contains 16 entries:

index      contents
0 B ⋅ 252 − B ⋅ 239 − B ⋅ 226 − B ⋅ 213 − B ⋅ 20
1 B ⋅ 252 − B ⋅ 239 − B ⋅ 226 − B ⋅ 213 + B ⋅ 20
15 B ⋅ 252 + B ⋅ 239 + B ⋅ 226 + B ⋅ 213 + B ⋅ 20

We only need 16 entries because we can use the 5th bit as a sign bit. The next comb is constructed in a similar manner, starting with B ⋅ 265. To multiply B by a scalar, we start with acc equal to the identity and do the following 13 times:

  1. Square acc
  2. Do the following 4 times:
    1. Extract the relevant bits e.g. bits 52, 39, 26, 13, 0
    2. Do a constant time lookup in the table based on the lower 4 bits (possibly tweaked by 5th bit)
    3. Do a constant time negate/identity based on the 5th bit
    4. Multiply acc by the result.

You always use a comb when signing, because B is known. You can use it in half of the verification, because you compute sB, and again B is known. If you need to verify several signatures from the same key, in can be effective to compute a comb for a known public key A to speed up the computation of hA.

Session Key Algorithm

At the very end of the article by DJB and Tanja Lange, there is a discussion of what to use to derive the session key. The article argues that using the message and the private key to generate the session key is best because it is stateless and simple, and there is no security to be gained by including randomness in the generation. This analysis misses a specific kind of attack: A fault attack. In this system, the message is hashed twice, once to generate the session key, k and then again to produce h. If an attacker can cause the document to be different for the second hashing, they can produce multiple signatures with the same session key, breaking the private key. If, however, the random number hardware is working, and you include a random value in the session key hash, then the fault attack won’t work.