Paper 2021/1396

NTT software optimization using an extended Harvey butterfly

Jonathan Bradbury, Nir Drucker, and Marius Hillenbrand


Software implementations of the number-theoretic transform (NTT) method often leverage Harvey’s butterfly to gain speedups. This is the case in cryptographic libraries such as IBM’s HElib, Microsoft’s SEAL, and Intel’s HEXL, which provide optimized implementations of fully homomorphic encryption schemes or their primitives. We extend the Harvey butterfly to the radix-4 case for primes in the range [2^31, 2^52). This enables us to use the vector multiply sum logical (VMSL) instruction, which is available on recent IBM Z^(R) platforms. On an IBM z14 system, our implementation performs more than 2.5x faster than the scalar implementation of SEAL we converted to native C. In addition, we implemented a mixed-radix implementation that uses AVX512-IFMA on Intel’s Ice Lake processor, which happens to be ~1.1 times faster than the super-optimized implementation of Intel’s HEXL. Finally, we compare the performance of some of our implementation using GCC versus Clang compilers and discuss the results.

Available format(s)
Publication info
Number Theoretic TransformSoftware optimizationIBM Z^(R) systemsHomomorphic EncryptionIntel's AVX512-IFMA
Contact author(s)
drucker nir @ gmail com
jdbradbu @ us ibm com
marius hillenbrand @ ibm com
2021-10-18: received
Short URL
Creative Commons Attribution


      author = {Jonathan Bradbury and Nir Drucker and Marius Hillenbrand},
      title = {NTT software optimization using an extended Harvey butterfly},
      howpublished = {Cryptology ePrint Archive, Paper 2021/1396},
      year = {2021},
      note = {\url{}},
      url = {}
Note: In order to protect the privacy of readers, does not use cookies or embedded third party content.