Paper 2021/185

No Silver Bullet: Optimized Montgomery Multiplication on Various 64-bit ARM Platforms

Hwajeong Seo, Pakize Sanal, Wai-Kong Lee, and Reza Azarderakhsh


In this paper, we first present optimized implementations of Montgomery multiplication on 64-bit ARM processors that take advantage of the Karatsuba algorithm and the efficient multiplication instructions of the ARM64 architecture. These implementations directly improve the performance of pre-quantum and post-quantum public-key cryptography (e.g. CSIDH, ECC, and RSA) on ARM64 architectures. Last but not least, the Karatsuba algorithm does not guarantee the fastest result on every ARM architecture; which method is fastest is determined by the clock cycles per multiplication instruction on the target architecture. In particular, recent Apple processors based on the ARM64 architecture need fewer cycles per multiplication instruction than the ARM Cortex-A series. For this reason, the schoolbook method performs much better than the more sophisticated Karatsuba algorithm on Apple processors. With this observation, we can determine the proper multiplication approach for cryptographic libraries (e.g. Microsoft-SIDH) on Apple processors and ARM Cortex-A processors.
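For readers unfamiliar with the terms in the abstract, the following is a minimal Python sketch of textbook Montgomery multiplication, together with a one-level schoolbook and Karatsuba split of the underlying integer product. It is not the paper's ARM64 assembly implementation. All function names and the parameters `k` (the bit length of R) and `w` (the split width) are illustrative assumptions, and Python big integers stand in for the multi-word arithmetic a real implementation would use.

```python
# Illustrative sketch only; requires Python 3.8+ for pow(n, -1, r).

def montgomery_setup(n, k):
    """Precompute n' with n * n' = -1 (mod R) for odd n and R = 2**k > n."""
    assert n % 2 == 1 and n < (1 << k)
    r = 1 << k
    return (-pow(n, -1, r)) % r

def mont_mul(a_bar, b_bar, n, k, n_prime):
    """Return a_bar * b_bar * R^-1 mod n for inputs in [0, n)."""
    mask = (1 << k) - 1
    t = a_bar * b_bar                  # product step (schoolbook or Karatsuba)
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n' mod R
    u = (t + m * n) >> k               # exact division by R; u < 2n
    return u - n if u >= n else u      # single conditional subtraction

def to_mont(a, n, k):
    """Map a to the Montgomery domain: a * R mod n."""
    return (a << k) % n

def from_mont(a_bar, n, k, n_prime):
    """Map back from the Montgomery domain: a_bar * R^-1 mod n."""
    return mont_mul(a_bar, 1, n, k, n_prime)

def schoolbook_split(a, b, w):
    """One-level schoolbook product: 4 half-size multiplications."""
    mask = (1 << w) - 1
    a0, a1 = a & mask, a >> w
    b0, b1 = b & mask, b >> w
    return a0 * b0 + ((a0 * b1 + a1 * b0) << w) + ((a1 * b1) << (2 * w))

def karatsuba_split(a, b, w):
    """One-level Karatsuba product: 3 multiplications, more additions."""
    mask = (1 << w) - 1
    a0, a1 = a & mask, a >> w
    b0, b1 = b & mask, b >> w
    lo, hi = a0 * b0, a1 * b1
    mid = (a0 + a1) * (b0 + b1) - lo - hi   # equals a0*b1 + a1*b0
    return lo + (mid << w) + (hi << (2 * w))

if __name__ == "__main__":
    n, k = 2**255 - 19, 256            # example odd modulus, R = 2**256
    n_prime = montgomery_setup(n, k)
    a, b = 123456789, 987654321
    c_bar = mont_mul(to_mont(a, n, k), to_mont(b, n, k), n, k, n_prime)
    print(from_mont(c_bar, n, k, n_prime) == (a * b) % n)
```

Karatsuba trades one multiplication for several extra additions and subtractions at each level of splitting. When a multiplication instruction is cheap, as the abstract reports for recent Apple processors, that trade may no longer pay off, which is the situation in which the schoolbook method wins.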

Publication info
Preprint. MINOR revision.
Keywords
Montgomery Multiplication, ARM64, Public Key Cryptography, Software Implementation
Contact author(s)
hwajeong84 @ gmail com
2021-06-18: last of 2 revisions
2021-02-20: received
Creative Commons Attribution


      author = {Hwajeong Seo and Pakize Sanal and Wai-Kong Lee and Reza Azarderakhsh},
      title = {No Silver Bullet: Optimized Montgomery Multiplication on Various 64-bit ARM Platforms},
      howpublished = {Cryptology ePrint Archive, Paper 2021/185},
      year = {2021},
      note = {\url{}},
      url = {}