Paper 2021/998

Polynomial multiplication on embedded vector architectures

Hanno Becker, Jose Maria Bermudo Mera, Angshuman Karmakar, Joseph Yiu, and Ingrid Verbauwhede


High-degree, low-precision polynomial arithmetic is a fundamental computational primitive underlying structured lattice-based cryptography. Its algorithmic properties and suitability for implementation on different compute platforms are an active area of research, and this article contributes to this line of work. First, we present memory-efficiency and performance improvements for the Toom-Cook/Karatsuba polynomial multiplication strategy. Second, we provide implementations of those improvements on the Arm® Cortex®-M4 CPU, as well as on the newer Cortex-M55 processor, the first M-profile core implementing the M-profile Vector Extension (MVE), also known as Arm® Helium™ technology. We also implement the Number Theoretic Transform (NTT) on the Cortex-M55 processor. We show that, despite the Cortex-M55 being single-issue and in-order and offering only 8 vector registers, compared to 32 on A-profile SIMD architectures like Arm® Neon™ technology and the Scalable Vector Extension (SVE), careful register management and instruction scheduling yield a 3× to 5× performance improvement over already highly optimized implementations on Cortex-M4, while maintaining the low area and energy profile necessary for use in the embedded market. Finally, as a real-world application, we integrate our multiplication techniques into the post-quantum key-encapsulation mechanism Saber.
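
For readers unfamiliar with the Karatsuba strategy named in the abstract, the following is a minimal reference sketch in Python. It is not the paper's optimized implementation and says nothing about its memory layout or vectorization. The function names, the `threshold` parameter, and the choice of the ring Z_q[x]/(x^n + 1) with q = 2^13 are assumptions made for illustration only. The sketch assumes both inputs have the same length and that this length is a power of two.

```python
def schoolbook(a, b, q):
    """Plain O(n^2) product of two coefficient lists, coefficients mod q."""
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            res[i + j] = (res[i + j] + ai * bj) % q
    return res


def karatsuba(a, b, q, threshold=4):
    """Karatsuba product of two equal-length lists (length a power of two).

    Splits a = a0 + x^h * a1 and b = b0 + x^h * b1, then uses three
    half-size products instead of four.
    """
    n = len(a)
    if n <= threshold:
        return schoolbook(a, b, q)
    h = n // 2
    a0, a1 = a[:h], a[h:]
    b0, b1 = b[:h], b[h:]
    low = karatsuba(a0, b0, q, threshold)    # a0 * b0
    high = karatsuba(a1, b1, q, threshold)   # a1 * b1
    mid = karatsuba(                         # (a0 + a1) * (b0 + b1)
        [(x + y) % q for x, y in zip(a0, a1)],
        [(x + y) % q for x, y in zip(b0, b1)],
        q,
        threshold,
    )
    res = [0] * (2 * n - 1)
    for i in range(2 * h - 1):
        cross = (mid[i] - low[i] - high[i]) % q  # a0*b1 + a1*b0
        res[i] = (res[i] + low[i]) % q
        res[i + h] = (res[i + h] + cross) % q
        res[i + 2 * h] = (res[i + 2 * h] + high[i]) % q
    return res


def reduce_negacyclic(c, n, q):
    """Reduce a product of length 2n-1 modulo x^n + 1, using x^n = -1."""
    res = [0] * n
    for i, ci in enumerate(c):
        if i < n:
            res[i] = (res[i] + ci) % q
        else:
            res[i - n] = (res[i - n] - ci) % q
    return res


if __name__ == "__main__":
    q = 2 ** 13  # illustrative power-of-two modulus
    a = [(3 * i + 1) % q for i in range(16)]
    b = [(5 * i + 7) % q for i in range(16)]
    product = reduce_negacyclic(karatsuba(a, b, q), 16, q)
    print(product == reduce_negacyclic(schoolbook(a, b, q), 16, q))
```

The recursion falls back to schoolbook multiplication below `threshold`, which mirrors the usual practice of switching algorithms at small sizes; the specific cut-off used here is arbitrary.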

Note: Corrected some typos and errors in bibliography.

Category
Public-key cryptography
Publication info
Published by the IACR in TCHES 2022
Keywords
Post-Quantum Cryptography, Polynomial multiplication, IoT, Cortex-M55, Cortex-M4, M-profile Vector Extension (MVE), Helium vector extension, Number Theoretic Transform (NTT), Toom-Cook, Karatsuba
Contact author(s)
Hanno Becker @ arm com
Jose Bermudo @ esat kuleuven be
angshuman karmakar @ esat kuleuven be
joseph yiu @ arm com
2021-10-15: last of 4 revisions
2021-07-28: received
Creative Commons Attribution


      author = {Hanno Becker and Jose Maria Bermudo Mera and Angshuman Karmakar and Joseph Yiu and Ingrid Verbauwhede},
      title = {Polynomial multiplication on embedded vector architectures},
      howpublished = {Cryptology ePrint Archive, Paper 2021/998},
      year = {2021},
      note = {\url{}},
      url = {}