Paper 2015/1024
Parallel Implementation of Number Theoretic Transform
Hwajeong Seo, Zhe Liu, Yasuyuki Nogami, Jongseok Choi, Taehwan Park, and Howon Kim
Abstract
Number Theoretic Transform (NTT) based polynomial multiplication is the most important operation in lattice-based cryptography. In this paper, we implement parallel NTT computation on the ARM-NEON architecture. Our contributions include the following optimizations: (1) we vectorize the iterative Number Theoretic Transform, (2) we propose a 32-bit-wise Shifting-Addition-Multiplication-Subtraction-Subtraction (SAMS2) technique to speed up modular coefficient multiplication, and (3) we exploit incomplete arithmetic for representing the coefficients to ensure constant-time modular reduction. For the medium-term security level, our optimized NTT implementation requires only 27,160 clock cycles. Similarly, for the long-term security level, it takes 62,160 clock cycles. These results are faster than the state-of-the-art sequential implementations by 31% and 34%, respectively.
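For readers unfamiliar with the iterative NTT named in the abstract, the following is a minimal scalar sketch in Python of a textbook radix-2 iterative transform and the polynomial multiplication built on it. It is not the paper's implementation: it is not vectorized, it uses plain `%` reduction rather than SAMS2 or incomplete arithmetic, and it computes a cyclic product (modulo x^n - 1) for simplicity. The parameters n = 256 and q = 7681, and all function names, are assumptions chosen for illustration; the abstract does not state them.

```python
def find_root(n, q):
    """Search for a primitive n-th root of unity mod prime q (n a power of two)."""
    if (q - 1) % n != 0:
        raise ValueError("n must divide q - 1")
    for g in range(2, q):
        w = pow(g, (q - 1) // n, q)
        # For n a power of two, w^(n/2) == -1 implies the order of w is exactly n.
        if pow(w, n // 2, q) == q - 1:
            return w
    raise ValueError("no primitive root found")


def bit_reverse_permute(a):
    """Reorder the list in place into bit-reversed index order."""
    n = len(a)
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]


def ntt(a, w, q):
    """Iterative radix-2 NTT of a (length a power of two) using root w mod q."""
    a = list(a)
    n = len(a)
    bit_reverse_permute(a)
    length = 2
    while length <= n:
        w_len = pow(w, n // length, q)  # primitive length-th root of unity
        half = length // 2
        for start in range(0, n, length):
            t = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * t % q
                a[start + k] = (u + v) % q         # butterfly: sum
                a[start + k + half] = (u - v) % q  # butterfly: difference
                t = t * w_len % q
        length *= 2
    return a


def intt(a, w, q):
    """Inverse NTT: forward transform with w^-1, then scale by n^-1."""
    n = len(a)
    w_inv = pow(w, q - 2, q)  # Fermat inverse, q assumed prime
    n_inv = pow(n, q - 2, q)
    return [x * n_inv % q for x in ntt(a, w_inv, q)]


def cyclic_mul(a, b, w, q):
    """Multiply two polynomials modulo (x^n - 1, q) via pointwise NTT products."""
    fa, fb = ntt(a, w, q), ntt(b, w, q)
    return intt([x * y % q for x, y in zip(fa, fb)], w, q)


def schoolbook_cyclic(a, b, q):
    """Reference O(n^2) cyclic convolution, used only to check the NTT result."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % q
    return c


if __name__ == "__main__":
    n, q = 256, 7681  # assumed illustrative parameters, not taken from the abstract
    w = find_root(n, q)
    a = [(3 * i + 1) % q for i in range(n)]
    b = [(i * i + 7) % q for i in range(n)]
    print(cyclic_mul(a, b, w, q) == schoolbook_cyclic(a, b, q))
```

The inner butterfly loop is the part a NEON implementation would process several coefficients at a time, and the `% q` operations are where a specialized reduction such as the SAMS2 technique described in the abstract would be substituted.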
Metadata
- Available format(s)
- -- withdrawn --
- Category
- Implementation
- Publication info
- Preprint. MINOR revision.
- Contact author(s)
- hwajeong84 @ gmail com
- History
- 2015-10-23: withdrawn
- 2015-10-23: received
- Short URL
- https://ia.cr/2015/1024
- License
- CC BY