Paper 2018/1018
Faster multiplication in $\mathbb{Z}_{2^m}[x]$ on Cortex-M4 to speed up NIST PQC candidates
Matthias J. Kannwischer, Joost Rijneveld, and Peter Schwabe
Abstract
In this paper we optimize multiplication of polynomials in $\mathbb{Z}_{2^m}[x]$ on the ARM Cortex-M4 microprocessor. We use these optimized multiplication routines to speed up the NIST post-quantum candidates RLizard, NTRU-HRSS, NTRUEncrypt, Saber, and Kindi. For most of those schemes the only previous implementation that executes on the Cortex-M4 is the reference implementation submitted to NIST; for some of those schemes our optimized software is more than a factor of 20 faster. One of the schemes, namely Saber, has been optimized on the Cortex-M4 in a CHES 2018 paper; the multiplication routine for Saber we present here outperforms the multiplication from that paper by 42%, yielding speedups of 22% for key generation, 20% for encapsulation, and 22% for decapsulation. Out of the five schemes optimized in this paper, the best performance for encapsulation and decapsulation is achieved by NTRU-HRSS. Specifically, encapsulation takes just over 400 000 cycles, which is more than twice as fast as for any other NIST candidate that has previously been optimized on the ARM Cortex-M4.
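To illustrate the kind of arithmetic the abstract refers to, here is a minimal Python sketch of polynomial multiplication in $\mathbb{Z}_{2^m}[x]$, using schoolbook multiplication as a baseline and Karatsuba (one of the techniques named in the keywords) on top of it. It is not the paper's Cortex-M4 code: the function names `schoolbook_mul` and `karatsuba_mul`, the `threshold` parameter, and the coefficient-list representation are assumptions made for this example only. The point it shows is that, because the modulus is a power of two, reduction is a simple bit mask.

```python
def schoolbook_mul(a, b, m):
    """Multiply polynomials a and b in Z_{2^m}[x].

    Polynomials are lists of coefficients, lowest degree first
    (a representation assumed for this sketch).
    """
    mask = (1 << m) - 1  # reduction mod 2^m is a bit mask
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            res[i + j] = (res[i + j] + ai * bj) & mask
    return res


def karatsuba_mul(a, b, m, threshold=4):
    """Karatsuba multiplication of two equal-length polynomials in Z_{2^m}[x].

    Falls back to schoolbook multiplication for short or odd-length inputs.
    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    n = len(a)
    assert n == len(b), "this sketch assumes equal-length inputs"
    if n <= threshold or n % 2 == 1:
        return schoolbook_mul(a, b, m)

    mask = (1 << m) - 1
    h = n // 2
    a0, a1 = a[:h], a[h:]  # a = a0 + a1 * x^h
    b0, b1 = b[:h], b[h:]  # b = b0 + b1 * x^h

    # Three half-size products instead of four.
    z0 = karatsuba_mul(a0, b0, m, threshold)
    z2 = karatsuba_mul(a1, b1, m, threshold)
    sa = [(x + y) & mask for x, y in zip(a0, a1)]
    sb = [(x + y) & mask for x, y in zip(b0, b1)]
    z1 = karatsuba_mul(sa, sb, m, threshold)

    # Recombine: a*b = z0 + (z1 - z0 - z2) * x^h + z2 * x^(2h).
    res = [0] * (2 * n - 1)
    for i in range(2 * h - 1):
        mid = (z1[i] - z0[i] - z2[i]) & mask
        res[i] = (res[i] + z0[i]) & mask
        res[i + h] = (res[i + h] + mid) & mask
        res[i + 2 * h] = (res[i + 2 * h] + z2[i]) & mask
    return res
```

For example, `karatsuba_mul(a, b, 13)` multiplies two coefficient lists modulo $2^{13}$ and should agree with `schoolbook_mul(a, b, 13)` for any equal-length inputs. The subtraction in the middle term needs no special handling here, since masking a negative Python integer still yields the correct residue modulo $2^m$.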
Metadata
- Category
- Implementation
- Publication info
- Published elsewhere. Minor revision. ACNS'19
- Keywords
- ARM Cortex-M4, Karatsuba, Toom, lattice-based KEMs, NTRU
- Contact author(s)
- joost @ joostrijneveld nl
- matthias @ kannwischer eu
- peter @ cryptojedi org
- History
- 2019-04-09: revised
- 2018-10-24: received
- Short URL
- https://ia.cr/2018/1018
- License
- CC BY
BibTeX
@misc{cryptoeprint:2018/1018, author = {Matthias J. Kannwischer and Joost Rijneveld and Peter Schwabe}, title = {Faster multiplication in $\mathbb{Z}_{2^m}[x]$ on Cortex-M4 to speed up {NIST} {PQC} candidates}, howpublished = {Cryptology {ePrint} Archive, Paper 2018/1018}, year = {2018}, url = {https://eprint.iacr.org/2018/1018} }