## Cryptology ePrint Archive: Report 2015/696

Novel algorithms and hardware architectures for Montgomery Multiplication over GF(p)

Miguel Morales Sandoval and Arturo Diaz Perez

Abstract: This report describes the design and implementation results in FPGAs of a scalable hardware architecture for computing modular multiplication in prime fields GF($p$), based on the Montgomery multiplication (MM) algorithm. Starting from an existing digit-serial version of the MM algorithm, a novel {\it digit-digit} based MM algorithm is derived and two hardware architectures that compute that algorithm are described. In the proposed approach, the input operands (multiplicand, multiplier and modulus) are represented using as radix $\beta = 2^k$. Operands of arbitrary size can be multiplied with modular reduction using almost the same hardware since the multiplier's kernel module that performs the modular multiplication depends only on $k$. The novel hardware architectures proposed in this paper were verified by modeling them using VHDL and implementing them in the Xilinx FPGAs Spartan and Virtex5. Design trade-offs are analyzed considering different operand sizes commonly used in cryptography and different values for $k$. The proposed designs for MM are well suited to be implemented in modern FPGAs, making use of available dedicated multiplier and memory blocks reducing drastically the FPGA's standard logic while keeping an acceptable performance compared with other implementation approaches. From the Virtex5 implementation, the proposed MM multiplier reaches a throughput of 242Mbps using only 219 FPGA slices and achieving a 1024-bit modular multiplication in 4.21$\mu$secs.

Category / Keywords: public-key cryptography / Field multiplication, Montgomery multiplication, finite fields, hardware architecture, FPGAs

Original Publication (with major differences): IET Computers and Digital Techniques