Paper 2021/1389

DPCrypto: Acceleration of Post-quantum Cryptographic Algorithms using Dot-Product Instruction on GPUs

Wai-Kong Lee
Hwajeong Seo
Seong Oun Hwang
Angshuman Karmakar
Jose Maria Bermudo Mera
Ramachandra Achar

The dot product is a widely used operation in many machine learning and scientific computing algorithms. Recently, NVIDIA introduced dot-product instructions (DP2A and DP4A) in modern GPU architectures, with the aim of accelerating machine learning and scientific computing applications. These instructions allow multiple multiply-and-add operations to be computed in a single clock cycle, effectively achieving higher throughput than conventional 32-bit integer units. In this paper, we show that the dot-product instruction can also be used to accelerate matrix multiplication and polynomial convolution, which are commonly found in post-quantum lattice-based cryptographic schemes. In particular, we propose a highly optimized implementation of FrodoKEM, wherein the matrix multiplication is accelerated by the dot-product instruction. We also present specially designed data structures that allow an efficient implementation of the Saber key encapsulation mechanism, utilizing the dot-product instruction to speed up the polynomial convolution. The proposed FrodoKEM implementation achieves 4.37x higher throughput, in terms of key exchange operations per second, than the state-of-the-art implementation on a V100 GPU. This paper also presents the first implementation of Saber on GPU platforms, achieving 124,418, 120,463, and 31,658 key exchange operations per second on RTX3080, V100, and T4 GPUs, respectively. Since matrix multiplication and polynomial convolution are the most time-consuming operations in lattice-based cryptographic schemes, our proposed techniques are likely to benefit other similar algorithms. The proposed high-throughput implementations of KEMs on various GPU platforms allow the heavy KEM computations to be offloaded from the server. This is very useful for many emerging applications such as the Internet of Things and cloud computing.
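
To make the idea in the abstract concrete, the following is a minimal sketch in Python that emulates the semantics of a signed DP4A operation (four packed 8-bit lanes multiplied pairwise and added to a 32-bit accumulator) and uses it for a matrix-vector product and for a negacyclic polynomial convolution modulo x^n + 1. It is not the paper's implementation: the helper names (`pack4`, `dp4a`, `matvec_dp4a`, `negacyclic_conv_dp4a`) are hypothetical, all inputs are assumed to be small values that fit in signed 8 bits with a length that is a multiple of 4, and the specially designed data structures the paper uses for real FrodoKEM and Saber coefficient sizes are not reproduced here.

```python
import struct

def pack4(vals):
    """Pack four signed 8-bit integers into one 32-bit word (hypothetical helper)."""
    return struct.unpack("<I", struct.pack("<4b", *vals))[0]

def unpack4(word):
    """Split a 32-bit word back into its four signed 8-bit lanes."""
    return struct.unpack("<4b", struct.pack("<I", word))

def dp4a(a, b, c):
    """Emulate signed DP4A: c + sum of four int8 * int8 products, wrapped to int32."""
    acc = c + sum(x * y for x, y in zip(unpack4(a), unpack4(b)))
    acc &= 0xFFFFFFFF
    return acc - (1 << 32) if acc & 0x80000000 else acc

def matvec_dp4a(matrix, vec):
    """Matrix-vector product where each dp4a call consumes four elements of a row."""
    packed_vec = [pack4(vec[j:j + 4]) for j in range(0, len(vec), 4)]
    out = []
    for row in matrix:
        acc = 0
        for k, j in enumerate(range(0, len(row), 4)):
            acc = dp4a(pack4(row[j:j + 4]), packed_vec[k], acc)
        out.append(acc)
    return out

def negacyclic_conv_dp4a(a, b):
    """Compute a * b mod (x^n + 1) as a matrix-vector product.

    Row k of the matrix holds b[(k - i) mod n], negated when the index
    wraps around, because x^n = -1 in this ring.
    """
    n = len(a)
    rows = [[b[k - i] if i <= k else -b[k - i + n] for i in range(n)]
            for k in range(n)]
    return matvec_dp4a(rows, a)

def negacyclic_conv_ref(a, b):
    """Schoolbook reference used only to check the dp4a-based version."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] += a[i] * b[j]
            else:
                c[i + j - n] -= a[i] * b[j]
    return c
```

Writing the convolution as a matrix-vector product is one simple way to expose it to a dot-product primitive; on a GPU, each emulated `dp4a` call here would correspond to a single hardware instruction operating on packed registers.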

Publication info
Published elsewhere. IEEE TCAS-I
Keywords
Post-quantum Cryptography, Dot-product, Polynomial Convolution, Matrix-multiplication, Graphics Processing Unit
Contact author(s)
waikong lee @ gmail com
2022-06-13: revised
2021-10-15: received
Creative Commons Attribution


@misc{cryptoeprint:2021/1389,
      author = {Wai-Kong Lee and Hwajeong Seo and Seong Oun Hwang and Angshuman Karmakar and Jose Maria Bermudo Mera and Ramachandra Achar},
      title = {DPCrypto: Acceleration of Post-quantum Cryptographic Algorithms using Dot-Product Instruction on GPUs},
      howpublished = {Cryptology ePrint Archive, Paper 2021/1389},
      year = {2021},
      note = {\url{}},
      url = {}
}