Cryptology ePrint Archive: Report 2014/760
Montgomery Modular Multiplication on ARM-NEON Revisited
Hwajeong Seo, Zhe Liu, Johann Großschädl, Jongseok Choi, and Howon Kim
Abstract: Montgomery modular multiplication constitutes the "arithmetic foundation" of modern public-key cryptography, with applications ranging from RSA, DSA and Diffie-Hellman over elliptic curve schemes to pairing-based cryptosystems. The increased prevalence of SIMD-type instructions in commodity processors (e.g., Intel SSE, ARM NEON) has initiated a massive body of research on vector-parallel implementations of Montgomery modular multiplication. In this paper, we introduce the Cascade Operand Scanning (COS) method to speed up multi-precision multiplication on SIMD architectures. We developed the COS technique with the goal of reducing Read-After-Write (RAW) dependencies in the propagation of carries, which also reduces the number of pipeline stalls (i.e., bubbles). The COS method operates on 32-bit words in a row-wise fashion (similar to the operand-scanning method) and does not require a "non-canonical" representation of operands with a reduced radix. We show that two COS computations can be "coarsely" integrated into an efficient vectorized variant of Montgomery multiplication, which we call the Coarsely Integrated Cascade Operand Scanning (CICOS) method. Due to our sophisticated instruction scheduling, the CICOS method reaches record-setting execution times for Montgomery modular multiplication on ARM-NEON platforms. Detailed benchmarking results obtained on ARM Cortex-A9 and Cortex-A15 processors show that the proposed CICOS method outperforms Bos et al.'s implementation from SAC 2013 by up to 57% (A9) and 40% (A15), respectively. Furthermore, our COS multiplication is faster than the latest GMP release (6.0.0) by up to 55% (A9) and 52% (A15), respectively.
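For orientation, the scalar arithmetic that the abstract builds on can be sketched as follows. This is a minimal Python sketch of the classic Coarsely Integrated Operand Scanning (CIOS) form of Montgomery multiplication on 32-bit words, in which one row of the product alternates with one row of the reduction. It is not the vectorized COS or CICOS method of the paper and says nothing about NEON instruction scheduling. The names `to_words` and `cios_montgomery_mul` are illustrative assumptions, and the sketch assumes an odd modulus and inputs already reduced modulo it.

```python
W = 32                      # word size in bits, matching the 32-bit words in the abstract
MASK = (1 << W) - 1

def to_words(x, n):
    """Split a non-negative integer into n little-endian 32-bit words."""
    return [(x >> (W * i)) & MASK for i in range(n)]

def cios_montgomery_mul(a, b, m):
    """Return a * b * R^-1 mod m, with R = 2^(32*n), for odd m and 0 <= a, b < m."""
    n = (m.bit_length() + W - 1) // W
    A, B, M = to_words(a, n), to_words(b, n), to_words(m, n)
    m_prime = (-pow(M[0], -1, 1 << W)) & MASK   # -m^-1 mod 2^32 (needs Python 3.8+)
    t = [0] * (n + 2)
    for i in range(n):
        # Row i of the product: t += A * B[i], carries propagated word by word.
        carry = 0
        for j in range(n):
            s = t[j] + A[j] * B[i] + carry
            t[j], carry = s & MASK, s >> W
        s = t[n] + carry
        t[n], t[n + 1] = s & MASK, s >> W
        # Row i of the reduction: t = (t + q * M) / 2^32, where q clears the low word.
        q = (t[0] * m_prime) & MASK
        carry = (t[0] + q * M[0]) >> W
        for j in range(1, n):
            s = t[j] + q * M[j] + carry
            t[j - 1], carry = s & MASK, s >> W
        s = t[n] + carry
        t[n - 1] = s & MASK
        t[n] = t[n + 1] + (s >> W)
    r = sum(w << (W * i) for i, w in enumerate(t[:n + 1]))
    return r - m if r >= m else r               # final conditional subtraction
```

In this sketch each inner-loop step consumes the carry produced by the previous step, which is the kind of Read-After-Write dependency chain the abstract says the COS method is designed to shorten. Operands are normally converted to Montgomery form (a * R mod m) first, so that the product of two such values is again in Montgomery form.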
Category / Keywords: implementation / Public-key cryptography, modular arithmetic, SIMD-level parallelism, vector instructions, ARM NEON
Original Publication (with minor differences): ICISC 2014
Date: received 29 Sep 2014, last revised 30 Oct 2014
Contact author: hwajeong84 at gmail com
Version: 20141031:014326
Short URL: ia.cr/2014/760