Paper 2013/519

Montgomery Multiplication Using Vector Instructions

Joppe W. Bos, Peter L. Montgomery, Daniel Shumow, and Gregory M. Zaverucha

Abstract

In this paper we present a parallel approach to computing interleaved Montgomery multiplication. This approach is particularly well suited to 2-way single instruction, multiple data (SIMD) platforms, which are available on most modern computer architectures in the form of vector instruction set extensions. We have implemented this approach for tablet devices based on the x86 architecture (Intel Atom Z2760) using SSE2 instructions, as well as for devices based on the ARM platform (Qualcomm MSM8960, NVIDIA Tegra 3 and 4) using NEON instructions. When instantiating modular exponentiation with this parallel version of Montgomery multiplication, we observed a performance increase of more than a factor of 1.5 for 2048-bit moduli on the Atom platform, compared to the sequential implementation in OpenSSL that uses the classical arithmetic logic unit.
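
For readers unfamiliar with the term, the following is a minimal sketch of the textbook sequential form of interleaved Montgomery multiplication, in which the multiplication and the reduction are carried out together, one word of the first operand at a time. It is not the 2-way parallel algorithm of the paper and uses no vector instructions. The function name, parameter names, and the 32-bit default word size are assumptions chosen for illustration, and Python 3.8 or later is assumed for the modular inverse via pow.

```python
def montgomery_multiply(a, b, n, word_bits=32, num_words=None):
    """Return a * b * R^(-1) mod n, where R = 2^(word_bits * num_words).

    Illustrative sequential sketch; assumes n is odd and 0 <= a, b < n.
    """
    if n % 2 == 0:
        raise ValueError("Montgomery multiplication requires an odd modulus")
    radix = 1 << word_bits
    mask = radix - 1
    if num_words is None:
        # Smallest word count such that R = radix^num_words exceeds n.
        num_words = -(-n.bit_length() // word_bits)
    # mu = -n^(-1) mod radix, precomputed once per modulus.
    mu = (-pow(n, -1, radix)) % radix

    c = 0
    for i in range(num_words):
        a_i = (a >> (word_bits * i)) & mask   # i-th word of a
        c += a_i * b                          # multiplication step
        q = (mu * (c & mask)) & mask          # makes c + q*n divisible by radix
        c = (c + q * n) >> word_bits          # reduction step (exact division)
    # The loop keeps c below 2n, so one conditional subtraction suffices.
    if c >= n:
        c -= n
    return c


if __name__ == "__main__":
    n = (1 << 127) - 1            # an odd modulus, four 32-bit words
    R = 1 << 128
    a, b = 123456789123456789, 987654321987654321
    # Convert to Montgomery form, multiply, and convert back.
    a_m, b_m = (a * R) % n, (b * R) % n
    prod_m = montgomery_multiply(a_m, b_m, n)
    assert montgomery_multiply(prod_m, 1, n) == (a * b) % n
```

Because each loop iteration depends on the value of c left by the previous one, this form is inherently sequential, which is the setting in which a parallel formulation such as the one described in the abstract becomes relevant.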

Metadata
Available format(s)
PDF
Category
Implementation
Publication info
Published elsewhere. Minor revision. SAC 2013
Keywords
Montgomery multiplication, SIMD, software implementation, vector instructions
Contact author(s)
jbos @ microsoft com
History
2013-08-21: received
Short URL
https://ia.cr/2013/519
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2013/519,
      author = {Joppe W. Bos and Peter L. Montgomery and Daniel Shumow and Gregory M. Zaverucha},
      title = {Montgomery Multiplication Using Vector Instructions},
      howpublished = {Cryptology {ePrint} Archive, Paper 2013/519},
      year = {2013},
      url = {https://eprint.iacr.org/2013/519}
}