Paper 2015/1024

Parallel Implementation of Number Theoretic Transform

Hwajeong Seo, Zhe Liu, Yasuyuki Nogami, Jongseok Choi, Taehwan Park, and Howon Kim

Abstract

Number Theoretic Transform (NTT) based polynomial multiplication is the most important operation in lattice-based cryptography. In this paper, we implement parallel NTT computation on the ARM-NEON architecture. Our contributions include the following optimizations: (1) we vectorize the iterative Number Theoretic Transform; (2) we propose a 32-bit-wise Shifting-Addition-Multiplication-Subtraction-Subtraction (SAMS2) technique to speed up modular coefficient multiplication; (3) we exploit incomplete arithmetic for representing coefficients to ensure constant-time modular reduction. For the medium-term security level, our optimized NTT implementation requires only 27,160 clock cycles. Similarly, for the long-term security level, it takes 62,160 clock cycles. These results are faster than the state-of-the-art sequential implementations by 31% and 34%, respectively.
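
As a rough illustration of the iterative transform named in contribution (1), the following is a minimal scalar sketch in Python. It is not the authors' NEON implementation: it is not vectorized, it uses plain `%` reduction instead of SAMS2 or incomplete arithmetic, and it multiplies in Z_q[x]/(x^n - 1) rather than the negacyclic ring usually used in lattice schemes. The modulus q = 7681 and all function names are assumptions chosen for illustration; the abstract does not state them.

```python
def find_root_of_unity(n, q):
    """Return a primitive n-th root of unity mod prime q (n a power of two >= 2)."""
    assert (q - 1) % n == 0, "n must divide q - 1"
    for g in range(2, q):
        w = pow(g, (q - 1) // n, q)
        # w has order exactly n iff w^(n/2) == -1 (mod q)
        if pow(w, n // 2, q) == q - 1:
            return w
    raise ValueError("no primitive root of unity found")


def bit_reverse_permute(a):
    """Return a copy of a with indices permuted by bit reversal."""
    n = len(a)
    bits = n.bit_length() - 1
    out = [0] * n
    for i in range(n):
        rev = int(f"{i:0{bits}b}"[::-1], 2)
        out[rev] = a[i]
    return out


def ntt(a, w, q):
    """Iterative decimation-in-time NTT: A[k] = sum_i a[i] * w^(i*k) mod q."""
    n = len(a)
    a = bit_reverse_permute(a)
    length = 2
    while length <= n:
        w_len = pow(w, n // length, q)  # primitive length-th root of unity
        half = length // 2
        for start in range(0, n, length):
            t = 1
            for j in range(half):
                u = a[start + j]
                v = a[start + j + half] * t % q
                a[start + j] = (u + v) % q          # butterfly, upper output
                a[start + j + half] = (u - v) % q   # butterfly, lower output
                t = t * w_len % q
        length *= 2
    return a


def intt(A, w, q):
    """Inverse NTT: forward transform with w^-1, then scale by n^-1 (q prime)."""
    n = len(A)
    w_inv = pow(w, q - 2, q)
    n_inv = pow(n, q - 2, q)
    return [x * n_inv % q for x in ntt(A, w_inv, q)]


def cyclic_multiply(a, b, q):
    """Multiply a and b in Z_q[x]/(x^n - 1) via pointwise products in the NTT domain."""
    n = len(a)
    w = find_root_of_unity(n, q)
    A, B = ntt(a, w, q), ntt(b, w, q)
    return intt([x * y % q for x, y in zip(A, B)], w, q)


Q = 7681  # assumed modulus for illustration only
product = cyclic_multiply([1, 2, 3, 4, 0, 0, 0, 0], [5, 6, 7, 0, 0, 0, 0, 0], Q)
```

Here `product` holds the coefficients of (1 + 2x + 3x^2 + 4x^3)(5 + 6x + 7x^2) reduced mod Q. The inner `j` loop applies independent butterflies, which is the part a SIMD implementation would typically process several lanes at a time.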

Metadata
Available format(s)
-- withdrawn --
Category
Implementation
Publication info
Preprint. MINOR revision.
Contact author(s)
hwajeong84 @ gmail com
History
2015-10-23: withdrawn
2015-10-23: received
Short URL
https://ia.cr/2015/1024
License
Creative Commons Attribution
CC BY