Cryptology ePrint Archive: Report 2022/112

Faster Kyber and Dilithium on the Cortex-M4

Amin Abdulrahman and Vincent Hwang and Matthias J. Kannwischer and Daan Sprenkels

Abstract: This paper presents faster implementations of the lattice-based schemes Dilithium and Kyber on the Cortex-M4. Dilithium is one of the three signature finalists in the NIST post-quantum project (NIST PQC), while Kyber is one of the four key-encapsulation mechanism (KEM) finalists.

Our optimizations affect the core polynomial arithmetic of both schemes, which relies on the number-theoretic transform (NTT). Our main contributions are threefold: we present a faster signed Barrett reduction for Kyber, propose switching to a smaller prime modulus for the polynomial multiplications \(c\mathbf{s}_1\) and \(c\mathbf{s}_2\) in the signing procedure of Dilithium, and apply various known optimizations to the polynomial arithmetic in both schemes. Using a smaller prime modulus is particularly interesting because it allows using the Fermat number transform, resulting in especially fast code.
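To illustrate what a signed Barrett reduction computes, the following is a minimal Python sketch, not the paper's Cortex-M4 code. It assumes Kyber's modulus q = 3329, 16-bit signed inputs, and a 26-bit fixed-point reciprocal; the names `barrett_reduce_signed`, `SHIFT`, and `V` are illustrative choices, and the rounding variant shown here is the generic textbook form rather than the specific faster variant the authors propose.

```python
KYBER_Q = 3329   # Kyber's prime modulus
SHIFT = 26       # fixed-point precision of the reciprocal (illustrative assumption)
V = ((1 << SHIFT) + KYBER_Q // 2) // KYBER_Q   # round(2^26 / q)


def barrett_reduce_signed(a: int) -> int:
    """Return the representative of a mod q in [-(q-1)/2, (q-1)/2].

    Assumes a fits in a signed 16-bit integer.
    """
    # t approximates round(a / q) using one multiplication and one shift.
    # Python's >> on ints is an arithmetic (flooring) shift, like a signed
    # right shift on typical 32-bit targets.
    t = (a * V + (1 << (SHIFT - 1))) >> SHIFT
    # Subtracting t * q leaves a small centered remainder.
    return a - t * KYBER_Q


if __name__ == "__main__":
    for a in (-32768, -1665, 0, 1664, 1665, 3329, 32767):
        print(a, barrett_reduce_signed(a))
```

The point of the signed (centered) form is that the result stays in a symmetric range around zero, so no conditional correction step is needed for 16-bit inputs under the assumptions above.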

We outperform the state of the art for both Dilithium and Kyber. For Dilithium, our NTT and iNTT are faster by 5.2% and 5.7%, respectively. Switching to a smaller modulus results in a speed-up of 33.1%-37.6% for the relevant operations (sum of basemul and iNTT) in the signing procedure. For Kyber, the optimizations result in a 15.9%-17.8% faster matrix-vector product, which is the core arithmetic operation in Kyber.
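Why a smaller modulus can be used for a product such as \(c\mathbf{s}_1\) can be sketched as follows. In Dilithium the challenge \(c\) has a fixed number \(\tau\) of coefficients equal to \(\pm 1\) (the rest are zero) and the secret coefficients are bounded by \(\eta\) in absolute value, so every coefficient of the product is bounded by \(\tau\eta\). Any modulus larger than \(2\tau\eta\) therefore recovers the exact integer result after centering. The Python sketch below checks this with toy parameters; the values of `N`, `ETA`, and `TAU`, the helper names, and the use of a schoolbook product instead of an actual Fermat number transform are all assumptions made for illustration.

```python
import random

N = 8          # toy ring degree (Dilithium uses n = 256)
ETA = 2        # toy bound on secret coefficients
TAU = 3        # toy number of +-1 coefficients in the challenge
Q_SMALL = 257  # Fermat prime 2^8 + 1; must exceed 2 * TAU * ETA


def negacyclic_mul(a, b, modulus=None):
    """Schoolbook product in Z[x]/(x^N + 1), optionally reduced mod `modulus`."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            term = ai * bj
            if k >= N:      # x^N = -1, so wrapped terms change sign
                k -= N
                term = -term
            out[k] += term
    if modulus is not None:
        out = [x % modulus for x in out]
    return out


def center(x, modulus):
    """Map x mod modulus to its representative in (-modulus/2, modulus/2]."""
    x %= modulus
    return x - modulus if x > modulus // 2 else x


def sample_challenge(rng):
    """Toy challenge: exactly TAU coefficients in {-1, +1}, the rest zero."""
    c = [0] * N
    for pos in rng.sample(range(N), TAU):
        c[pos] = rng.choice((-1, 1))
    return c


def sample_secret(rng):
    """Toy secret: coefficients drawn from [-ETA, ETA]."""
    return [rng.randint(-ETA, ETA) for _ in range(N)]


if __name__ == "__main__":
    rng = random.Random(0)
    c, s = sample_challenge(rng), sample_secret(rng)
    exact = negacyclic_mul(c, s)
    small = [center(x, Q_SMALL) for x in negacyclic_mul(c, s, Q_SMALL)]
    print(exact == small)
```

Because the product coefficients never exceed `TAU * ETA` in absolute value, the centered result modulo `Q_SMALL` coincides with the integer product, which is what makes a transform over a small Fermat prime applicable in principle.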

Category / Keywords: implementation / Dilithium, Kyber, NIST PQC, Fermat Number Transform, Number-Theoretic Transform, Arm Cortex-M4

Date: received 29 Jan 2022, last revised 13 May 2022

Contact author: amin abdulrahman at mpi-sp org, vincentvbh7 at gmail com, matthias at kannwischer eu, daan at dsprenkels com

Version: 20220513:123335
