Paper 2024/136
Secure Transformer Inference Made Non-interactive
Abstract
Secure transformer inference has emerged as a prominent research topic following the proliferation of ChatGPT. Existing solutions are typically interactive, involving substantial communication load and numerous interaction rounds between the client and the server. In this paper, we propose NEXUS, the first non-interactive protocol for secure transformer inference, where the client is only required to submit an encrypted input and await the encrypted result from the server. Central to NEXUS are two innovative techniques: SIMD ciphertext compression/decompression and SIMD slots folding. Consequently, our approach achieves a speedup of 2.8$\times$ and a remarkable bandwidth reduction of 368.6$\times$ compared to the state-of-the-art solution presented in S&P '24.
Metadata
- Category
- Cryptographic protocols
- Publication info
- Preprint.
- Keywords
- Secure Inference, LLM, Homomorphic Encryption
- Contact author(s)
-
kevinzh @ zju edu cn
jian liu @ zju edu cn
yangxinpeng @ zju edu cn
asternight @ zju edu cn
chenkejia @ zju edu cn
xiaoyanghou @ zju edu cn
kuiren @ zju edu cn
xiaoyanghou @ zju edu cn
- History
- 2024-01-31: approved
- 2024-01-31: received
- Short URL
- https://ia.cr/2024/136
- License
-
CC BY
BibTeX
@misc{cryptoeprint:2024/136,
      author = {Jiawen Zhang and Jian Liu and Xinpeng Yang and Yinghao Wang and Kejia Chen and Xiaoyang Hou and Kui Ren and Xiaohu Yang},
      title = {Secure Transformer Inference Made Non-interactive},
      howpublished = {Cryptology ePrint Archive, Paper 2024/136},
      year = {2024},
      note = {\url{https://eprint.iacr.org/2024/136}},
      url = {https://eprint.iacr.org/2024/136}
}