Cryptology ePrint Archive: Report 2021/1396

NTT software optimization using an extended Harvey butterfly

Jonathan Bradbury and Nir Drucker and Marius Hillenbrand

Abstract: Software implementations of the number-theoretic transform (NTT) method often leverage Harvey’s butterfly to gain speedups. This is the case in cryptographic libraries such as IBM’s HElib, Microsoft’s SEAL, and Intel’s HEXL, which provide optimized implementations of fully homomorphic encryption schemes or their primitives. We extend the Harvey butterfly to the radix-4 case for primes in the range [2^31, 2^52). This enables us to use the vector multiply sum logical (VMSL) instruction, which is available on recent IBM Z^(R) platforms. On an IBM z14 system, our implementation performs more than 2.5x faster than the scalar implementation of SEAL we converted to native C. In addition, we implemented a mixed-radix implementation that uses AVX512-IFMA on Intel’s Ice Lake processor, which happens to be ~1.1 times faster than the super-optimized implementation of Intel’s HEXL. Finally, we compare the performance of some of our implementation using GCC versus Clang compilers and discuss the results.

Category / Keywords: implementation / Number Theoretic Transform, Software optimization, IBM Z^(R) systems, Homomorphic Encryption, Intel's AVX512-IFMA

Date: received 15 Oct 2021, last revised 15 Oct 2021

Contact author: drucker nir at gmail com, jdbradbu at us ibm com, marius hillenbrand at ibm com

Available format(s): PDF | BibTeX Citation

Version: 20211018:061238 (All versions of this report)

Short URL:

[ Cryptology ePrint archive ]