1 of 27

SDitH in Hardware

Sanjay Deshpande, James Howe, Jakub Szefer, and Dongze (Steven) Yue

CBCrypto

May 25, 2024

Computer Architecture and Security Lab (CASLAB)

https://caslab.io

2 of 27

Motivation

  • Quantum computing holds tremendous potential to solve complex problems that are out of reach for current high-performance computers
    • Life-saving pharmaceuticals
    • Green-battery technology
  • However, quantum computers also pose significant cybersecurity risks
    • Can easily break existing public-key cryptography standards
    • Can jeopardize payment systems, encrypted chat, emails, etc.
  • The Quantum Insider’s 2022 report forecasts a quantum security market worth $10 billion by 2030
  • Currently, we do not have large-scale quantum computers
    • In 2023, IBM announced the 1,121-qubit quantum processor “Condor”
  • Hence, there is a need for quantum-safe cryptography!
    • Post-quantum cryptography emerges as a beacon of hope

Condor Image Source: IBM

Image Source: MIT Technology Review


3 of 27

NIST Post Quantum Cryptography Standardization Effort

[Figure: timeline of cryptographic standardization competitions, year axis 2007 to 2024. Legend: Completed, In Progress. Years marked on the bars: 2007, 2012, 2013, 2016, 2018, 2019, 2022, 2023, TBD. Source: [Gaj20]]

  • SHA-3: 51 hash functions, one winner
  • CAESAR: 57 authenticated ciphers, multiple winners
  • Post-Quantum (PQC-KEM + PQC-DSA): 69 public-key post-quantum cryptography schemes, multiple winners, further evaluation
  • Lightweight: 56 lightweight authenticated ciphers and hash functions, 1 winner
  • PQC DSA: 40 post-quantum cryptography digital signature schemes, TBD

PQC-KEM: Post Quantum Cryptography-Key Encapsulation Mechanism

PQC-DSA: Post Quantum Cryptography-Digital Signature Algorithm


4 of 27

Outline

  • Introduction
    • SDitH Signature Scheme
  • Hardware Design and Challenges
  • Comparison with Related and Relevant Work
  • Conclusion and Future Work


5 of 27

Introduction


6 of 27

SDitH Parameter Sets

  • Two Variants of the Algorithm
    • Hypercube
    • Threshold
  • Three Security Levels
  • Two Syndrome Decoding Fields
    • GF256 and GF251
  • d=2 splits for L3 and L5 Parameter Sets
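The combinations listed above can be enumerated with a minimal sketch. The names `VARIANTS`, `LEVELS`, `FIELDS`, and `splits` are hypothetical labels chosen for illustration, and d = 1 for L1 is an assumption inferred from the slide stating d = 2 only for L3 and L5.

```python
from itertools import product

# Labels below are illustrative; only the counts and the d = 2 rule come from the slide.
VARIANTS = ("hypercube", "threshold")
LEVELS = ("L1", "L3", "L5")
FIELDS = ("GF256", "GF251")


def splits(level: str) -> int:
    """Number of splits d: 2 for L3 and L5 (from the slide), 1 for L1 (assumed)."""
    return 2 if level in ("L3", "L5") else 1


PARAMETER_SETS = [
    {"variant": v, "level": lvl, "field": f, "d": splits(lvl)}
    for v, lvl, f in product(VARIANTS, LEVELS, FIELDS)
]

if __name__ == "__main__":
    for ps in PARAMETER_SETS:
        print(ps)
```

Two variants, three security levels, and two fields give twelve combinations in total.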


7 of 27

SDitH Key Generation

[Block diagram: key generation datapath from i_start to o_done. Blocks: ExpandSeed, Sampling Elements, Compute Q, Compute P, ComputeS, Sampling Elements for ExpandH, Mat Vec Mult and Add. Legend: variable time due to rejection sampling; constant time.]
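To illustrate why sampling can take a variable amount of time, here is a minimal software sketch of rejection sampling for the GF251 case. It assumes byte-wise rejection (an XOF byte is kept only if it is below 251) and uses Python's `hashlib.shake_256` as the XOF; the function name `sample_gf251` is hypothetical and this is not the hardware design itself.

```python
import hashlib

Q = 251  # prime field size for the GF251 parameter sets


def sample_gf251(seed: bytes, count: int) -> tuple[list[int], int]:
    """Return `count` elements of GF(251) and the number of XOF bytes consumed."""
    out: list[int] = []
    consumed = 0
    stream = b""
    chunk = 64
    while len(out) < count:
        # hashlib's SHAKE object is not incremental, so re-squeeze a longer prefix.
        stream = hashlib.shake_256(seed).digest(len(stream) + chunk)
        for b in stream[consumed:]:
            consumed += 1
            if b < Q:  # accept; bytes 251..255 are rejected
                out.append(b)
                if len(out) == count:
                    break
    return out, consumed


if __name__ == "__main__":
    elems, used = sample_gf251(b"example seed", 100)
    print(f"{len(elems)} elements needed {used} XOF bytes")
```

The number of bytes consumed depends on the seed, which is the source of the variable latency; for GF256 every byte is a valid element, so no rejection is needed.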


8 of 27

SDitH Sign

  • Little scope for parallelism at the algorithm level
  • The message (m) input is processed late in the algorithm
    • Hence, signing can be divided into offline and online parts


9 of 27

SDitH Sign - Offline

[Block diagram: sign offline datapath from i_start to o_done. Blocks: TREEPRG, ExpandSeed, Commit, Sampling, Hash1, ExpandMPCChallenge, ComputePlainBroadCast. Loop annotations: 2D, τ, τ. Legend: constant time.]


10 of 27

SDitH Sign - Online

[Block diagram: sign online datapath from i_start to o_done. Blocks: Hash2, PartyComputation, ExpandViewChallenge, GenerateSeedSiblingPath. Loop annotations: τ x D, τ. Legend: constant time.]


11 of 27

SDitH Verify

  • As with sign_offline and sign_online, there is little scope for parallelism at the algorithmic level
  • Parallelism is possible at the function/module level, at the cost of additional hardware
  • The for loops can be unrolled, at the cost of duplicating modules


12 of 27

Hardware Design and Challenges


13 of 27

Hardware Design Architecture


14 of 27

Our Contributions

  • First parameterizable hardware realization of the Hypercube variant of the SDitH signature scheme
  • Two variants of the syndrome decoding module
    • Sample first, then multiply
    • Sample and multiply on the fly
  • Hardware implementation of Sign split into offline and online phases
  • Drastic reduction in clock cycles compared to the optimized software implementation
    • Key Generation – Up to 250x
    • Signature Generation – Up to 3.4x
    • Signature Verification – Up to 2.2x


15 of 27

Syndrome Computation Module – Key Generation

  • y = s_B + H’s_A
  • Syndrome computation can only start after s = (s_A, s_B) has been computed by the ComputeS module
  • Hence, a “Sample First, Then Multiply” (SFTM) approach
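The relation y = s_B + H’s_A in the sample-first, then-multiply setting can be sketched in software as follows. This is a minimal illustration over GF(251) only; the function name `syndrome_sample_first` and the row-major list-of-rows layout for H’ are assumptions made for the example, not the layout of the hardware module.

```python
Q = 251  # GF(251); the GF256 case would use binary-field arithmetic instead


def syndrome_sample_first(h_prime, s_a, s_b):
    """Compute y = s_B + H' s_A over GF(251) with H' fully sampled and stored first."""
    assert len(h_prime) == len(s_b), "H' needs one row per entry of s_B"
    y = []
    for row, sb in zip(h_prime, s_b):
        assert len(row) == len(s_a), "each row of H' needs one entry per entry of s_A"
        acc = sb
        for h, sa in zip(row, s_a):
            acc = (acc + h * sa) % Q  # multiply-and-add step
        y.append(acc)
    return y


if __name__ == "__main__":
    h_prime = [[1, 2], [3, 4], [250, 0]]
    print(syndrome_sample_first(h_prime, s_a=[5, 6], s_b=[7, 8, 9]))
```

Because the whole of H’ is held before the multiplication starts, this ordering trades memory for the freedom to compute s first.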

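[Block diagram: key generation datapath, as on the SDitH Key Generation slide (i_start to o_done; blocks: ExpandSeed, Sampling Elements, Compute Q, Compute P, ComputeS, Sampling Elements for ExpandH, Mat Vec Mult and Add).]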


16 of 27

Syndrome Computation Module – Sign and Verify

  • y and s_A are inputs
  • Only H’ is needed to compute the syndrome
  • “Sample and Multiply On the Fly” (SaMO) approach
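A minimal software sketch of the on-the-fly idea: each element of H’ is consumed as soon as it is sampled, so the matrix is never stored. The sketch assumes GF(251) with byte-wise rejection sampling from SHAKE256, a row-major sampling order, and that the quantity of interest is s_B = y − H’s_A (a rearrangement of the key generation relation y = s_B + H’s_A). The names `gf251_stream` and `recover_s_b` are hypothetical.

```python
import hashlib

Q = 251


def gf251_stream(seed: bytes):
    """Yield GF(251) elements from SHAKE256(seed), rejecting bytes >= 251."""
    produced = 0
    size = 256
    while True:
        # hashlib's SHAKE is not incremental: squeeze a longer prefix, skip the old part.
        data = hashlib.shake_256(seed).digest(size)
        for b in data[produced:]:
            if b < Q:
                yield b
        produced = size
        size *= 2


def recover_s_b(seed: bytes, y, s_a):
    """Compute s_B = y - H' s_A, consuming each element of H' as it is sampled."""
    elems = gf251_stream(seed)
    s_b = []
    for yi in y:
        acc = yi
        for sa in s_a:
            acc = (acc - next(elems) * sa) % Q  # sample one element, use it, drop it
        s_b.append(acc)
    return s_b


if __name__ == "__main__":
    print(recover_s_b(b"h-seed", y=[1, 2, 3], s_a=[4, 5, 6, 7]))
```

Compared with storing H’ first, this keeps only a single accumulator per output element, which matters when on-chip memory is the scarce resource.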

[Figures: syndrome computation module datapaths for GF256 and GF251]


17 of 27

Evaluate Module

  • Evaluate takes as input an Fq-vector Q representing the coefficients of a polynomial in Fq[X] and a point r ∈ Fpoints, and computes the evaluation as follows:
      $\mathrm{Evaluate}(Q, r) = \sum_{i=1}^{|Q|} Q_i \cdot r^{i-1}$
  • Evaluate is used in both sign and verify operations. It accounts for:
    • 99% of the clock cycles of the online part of signing
    • 70%-90% of the clock cycles in verification, depending on the parameter set
  • Computing r^(i-1) is a 32-bit modular exponentiation, which is an expensive operation
    • The software implementation (target device: Intel Xeon E-2378 CPU) accomplishes this with two large look-up tables (370 KB to 1.5 MB for the full design, depending on the parameter set)
    • Our lightweight target, the Artix 7 FPGA, does not have these resources. Hence, we take an on-the-fly computation approach
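One way to picture table-free evaluation is to keep a running power of r and update it with one modular multiplication per coefficient. The sketch below is a simplification under stated assumptions: it works in the prime field GF(251) for both the coefficients and the point, whereas in the slide r lives in the larger field Fpoints (32-bit elements), and the function name `evaluate` is hypothetical. It is not a description of the hardware pipeline.

```python
Q = 251  # simplification: the real evaluation point lies in an extension field


def evaluate(coeffs, r):
    """Return sum(coeffs[i] * r**i) mod Q, updating the power of r at every step."""
    acc = 0
    power = 1  # r**0
    for c in coeffs:
        acc = (acc + c * power) % Q
        power = (power * r) % Q  # one modular multiplication, no look-up table
    return acc


if __name__ == "__main__":
    print(evaluate([1, 2, 3], 5))
```

Here `coeffs[i]` plays the role of the coefficient multiplying the i-th power of r, so the index is shifted by one relative to a 1-based coefficient numbering.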

[Figure: pipelined modular multiplication datapath with pipeline register stages and control logic. Inputs: r0, r1, r2 and i0, i1, i2. Outputs: r0^i0, r1^i1, r2^i2.]


18 of 27

Signature Generation Module

  • The block shown is our area-optimized implementation of SDitH signature generation
  • Signature generation is divided into two phases, offline and online, which can run in parallel
  • SHAKE256 is a hash function used in both the offline and online phases
  • However, SHAKE256 is area-expensive: 31% of the overall hardware design
  • Hence, we design an optimized SHAKE scheduler such that
    • the same SHAKE module is switched between the offline and online phases without wasting cycles or adding area


19 of 27

Comparison with Related and Relevant Work


20 of 27

Clock Cycles Comparison – Optimized Software v/s Our Hardware Implementation – Hypercube Variant

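[Chart: clock cycles of optimized software versus our hardware implementation. Improvement factors annotated: 250x, 3.4x, 2.2x (first group) and 171x, 2.1x, 3.1x (second group).]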

The software implementation uses Intel’s Galois Field New Instructions

The ‘Evaluate’ module accounts for ~70-99% of the clock cycles in the ‘sign_online’ and ‘verify’ modules


21 of 27

Time Comparison – Optimized Software v/s Our Hardware Implementation – Hypercube Variant

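[Chart: execution time of optimized software versus our hardware implementation. Annotated factors: 17.1x, 0.21x, 0.13x (first group) and 11.4x, 0.13x, 0.19x (second group). Labels: Improvement, Decline.]

The ‘Evaluate’ module accounts for ~70-99% of the clock cycles in the ‘sign_online’ and ‘verify’ modules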

Operating Frequency:

Intel Xeon Processor = 2.6 GHz

Xilinx Artix 7 FPGA = 164 MHz
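A rough way to see why a clock-cycle improvement can turn into a wall-clock decline is to scale the cycle ratio by the frequency ratio, since time = cycles / frequency. The sketch below uses the two operating frequencies from this slide and the "up to" cycle factors reported for key generation, signing, and verification; the function name `time_speedup` is hypothetical, and the result is only an estimate, not the measured figures.

```python
F_SW_HZ = 2.6e9  # Intel Xeon operating frequency, from the slide
F_HW_HZ = 164e6  # Xilinx Artix 7 operating frequency, from the slide


def time_speedup(cycle_speedup: float) -> float:
    """Convert a clock-cycle improvement factor into a wall-clock factor.

    t_sw / t_hw = (c_sw / f_sw) / (c_hw / f_hw) = (c_sw / c_hw) * (f_hw / f_sw)
    """
    return cycle_speedup * F_HW_HZ / F_SW_HZ


if __name__ == "__main__":
    for name, cycles in (("key generation", 250), ("sign", 3.4), ("verify", 2.2)):
        print(f"{name}: {cycles}x fewer cycles -> {time_speedup(cycles):.2f}x in time")
```

With a frequency ratio of roughly 1:16, any cycle improvement below about 16x becomes a slowdown in wall-clock time. This simple scaling lands close to the reported signing and verification factors, while the key generation estimate comes out somewhat below the reported 17.1x, so the reported times should be taken as the reference.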


22 of 27

Comparison with other PQC-DSA candidates – Security Level 1

[Table: comparison with other PQC-DSA candidates at Security Level 1, grouped into latest NIST competition candidates and old NIST competition candidates]

*No KeyGen

^Low Multiplication Complexity


23 of 27

Conclusion and Future Work


24 of 27

Conclusion

  • This work presents the first hardware realization of the SDitH signature scheme.
    • Parameterizable across three security levels and two arithmetic fields.
  • SDitH can be realized as a lightweight implementation. However, its memory consumption is comparatively high.


25 of 27

Future Work

  • The lower-level modules implemented in this work could readily be used to construct the ‘threshold’ variant of SDitH.
  • Module-level parallelism could be exploited to build a high-performance design that speeds up the sign and verify operations.
  • Components of the MPC hardware modules could be reused outside SDitH.


26 of 27

References

[Gaj20] Kris Gaj, Implementation and Benchmarking of Round 2 Candidates in the NIST Post-Quantum Cryptography Standardization Process Using FPGAs, NIST Seminars, Oct 2020.

[BWM+23] Luke Beckwith, Robert Wallace, Kamyar Mohajerani, and Kris Gaj. A high-performance hardware implementation of the LESS digital signature scheme. In Thomas Johansson and Daniel Smith-Tone, editors, Post-Quantum Cryptography, pages 57–90, Cham, 2023. Springer Nature Switzerland.

[KRR+20] Daniel Kales, Sebastian Ramacher, Christian Rechberger, Roman Walch, and Mario Werner. Efficient FPGA implementations of LowMC and Picnic. In Stanislaw Jarecki, editor, Topics in Cryptology – CT-RSA 2020, pages 417–441, Cham, 2020. Springer International Publishing.

[ZZW+23] Cankun Zhao, Neng Zhang, Hanning Wang, Bohan Yang, Wenping Zhu, Zhengdong Li, Min Zhu, Shouyi Yin, Shaojun Wei, and Leibo Liu. A compact and high-performance hardware architecture for CRYSTALS-Dilithium. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2022(1):270–295, Nov. 2021.

[ALC+20] Dorian Amiet, Lukas Leuenberger, Andreas Curiger, and Paul Zbinden. FPGA-based SPHINCS+ implementations: Mind the glitch. In 2020 23rd Euromicro Conference on Digital System Design (DSD), pages 229–237, 2020.


27 of 27

Thank you!

Sanjay Deshpande, James Howe, Jakub Szefer, and Dongze Yue, "SDitH in Hardware", in Transactions on Cryptographic Hardware and Embedded Systems (TCHES), September 2024.

Sanjay Deshpande

email: sanjay.deshpande@yale.edu
