Cryptology ePrint Archive: Report 2018/335

Fast modular squaring with AVX512IFMA

Nir Drucker and Shay Gueron

Abstract: Modular exponentiation represents a significant workload for public key cryptosystems. Examples include not only the classical RSA, DSA, and DH algorithms, but also the partially homomorphic Paillier encryption. As a result, efficient software implementations of modular exponentiation are an important target for optimization. This paper studies methods for using Intel's forthcoming AVX512 Integer Fused Multiply Accumulate (AVX512IFMA) instructions in order to speed up modular (Montgomery) squaring, which dominates the cost of the exponentiation. We further show how a minor tweak in the architectural definition of AVX512IFMA has the potential to further speed up modular squaring.
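
Illustrative sketch (not taken from the report): the Python model below shows, for a single lane, the multiply-accumulate behavior that AVX512IFMA provides. The two 52-bit operands are multiplied into a 104-bit product, and either the low or the high 52 bits of that product are added to a 64-bit accumulator. The helper names (madd52lo, madd52hi, to_limbs, from_limbs, square_limbs) are hypothetical and chosen only for this example. The squaring routine is a plain schoolbook loop over radix-2^52 limbs; it does not include Montgomery reduction and is not the optimized method studied in the paper.

```python
# Scalar model of one lane of the AVX512IFMA multiply-accumulate pair.
# All function names here are hypothetical, for illustration only.

MASK52 = (1 << 52) - 1
MASK64 = (1 << 64) - 1


def madd52lo(acc, a, b):
    """Add the low 52 bits of the 104-bit product a*b to a 64-bit accumulator."""
    prod = (a & MASK52) * (b & MASK52)
    return (acc + (prod & MASK52)) & MASK64


def madd52hi(acc, a, b):
    """Add the high 52 bits of the 104-bit product a*b to a 64-bit accumulator."""
    prod = (a & MASK52) * (b & MASK52)
    return (acc + (prod >> 52)) & MASK64


def to_limbs(x, n):
    """Split a non-negative integer into n limbs in radix 2^52 (little endian)."""
    return [(x >> (52 * i)) & MASK52 for i in range(n)]


def from_limbs(limbs):
    """Recombine limbs; limbs may exceed 52 bits (redundant representation)."""
    return sum(limb << (52 * i) for i, limb in enumerate(limbs))


def square_limbs(a):
    """Schoolbook squaring: the low half of a[i]*a[j] goes to column i+j,
    the high half to column i+j+1. Carries are left unpropagated, which is
    safe here because each column receives at most 2*len(a) terms below 2^52."""
    n = len(a)
    out = [0] * (2 * n)
    for i in range(n):
        for j in range(n):
            out[i + j] = madd52lo(out[i + j], a[i], a[j])
            out[i + j + 1] = madd52hi(out[i + j + 1], a[i], a[j])
    return out


x = (1 << 200) - 12345
assert from_limbs(square_limbs(to_limbs(x, 4))) == x * x
```

In this model the 12 spare bits of each 64-bit accumulator are what allow several partial products to be summed before any carry handling is needed.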

Category / Keywords: implementation

Date: received 10 Apr 2018

Contact author: drucker nir at gmail com

Version: 20180411:201635

