You are looking at a specific version 20210220:173248 of this paper. See the latest version.

Paper 2021/173

TensorCrypto

Wai-Kong Lee and Hwajeong Seo and Zhenfei Zhang and Seongoun Hwang

Abstract

Tensor core is a specially designed hardware included in new NVIDIA GPU chips, aimed at accelerating deep learning applications. With the introduction of tensor core, the matrix multiplication at low precision can be computed much faster than using conventional integer and floating point units in NVIDIA GPU. In the past, applications of tensor core were mainly restricted to machine learning and mixed precision scientific computing. In this paper, we show that for the first time, tensor core can be used to accelerate state-of-the-art lattice-based cryptosystems. In particular, we employed tensor core to accelerate NTRU, one of the finalists in NIST post-quantum standardization. Towards our aim, several parallel algorithms are proposed to allow the tensor core to handle flexible matrix sizes and ephemeral key pair. Experimental results show that the polynomial convolution using tensor core is 2.79× (ntruhps2048509) and 2.72× (ntruhps2048677) faster than the version implemented with conventional integer units of NVIDIA GPU. The proposed tensor core based polynomial convolution technique was applied to NTRU public key scheme (TensorTRU). It achieved 1.94×/1.95× (encryption) and 1.97×/2.02× (decryption) better performance for the two parameter sets, compared to the conventional integer based implementations in GPU. TensorTRU is also more than 20× faster than the reference implementation in CPU and 2× faster than the AVX2 implementation, for both encryption and decryption. To demonstrate the flexibility of the proposed technique, we have extended the implementation to other lattice-based cryptosystems that have a small modulus (LAC and two variant parameter sets in FrodoKEM). Experimental results show that the tensor core based polynomial convolution is flexible and useful in accelerating lattice-based cryptosystems that cannot utilize number theoretic transform in performing polynomial multiplication.

Metadata
Available format(s)
PDF
Category
Implementation
Publication info
Preprint. MINOR revision.
Contact author(s)
waikong lee @ gmail com
History
2022-02-17: revised
2021-02-20: received
See all versions
Short URL
https://ia.cr/2021/173
License
Creative Commons Attribution
CC BY
Note: In order to protect the privacy of readers, eprint.iacr.org does not use cookies or embedded third party content.