Paper 2023/1147

CipherGPT: Secure Two-Party GPT Inference

Xiaoyang Hou, Zhejiang University
Jian Liu, Zhejiang University
Jingyu Li, Zhejiang University
Yuhan Li, Zhejiang University
Wen-jie Lu, Ant Group
Cheng Hong, Ant Group
Kui Ren, Zhejiang University
Abstract

ChatGPT is recognized as a significant revolution in the field of artificial intelligence, but it raises serious concerns regarding user privacy, as the data submitted by users may contain sensitive information. Existing solutions for secure inference face significant challenges in supporting GPT-like models due to the enormous number of model parameters and complex activation functions. In this paper, we develop CipherGPT, the first framework for secure two-party GPT inference, building upon a series of innovative protocols. First, we propose a secure matrix multiplication protocol customized for GPT inference, achieving up to a 6.2× speedup and a 4.1× bandwidth reduction over the state of the art (SOTA). We also propose a novel protocol for securely computing GELU, surpassing SOTA by 1.8× in runtime, 2.5× in communication and 7.4× in precision. Furthermore, we present the first protocol for secure top-k sampling. We provide a full-fledged implementation and a comprehensive benchmark for CipherGPT. In particular, we measure the runtime and communication of each individual operation, along with their corresponding proportions of the total cost. We believe this can serve as a reference for future research in this area.
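
For orientation, the two non-linear operations named in the abstract, GELU and top-k sampling, can be written in plaintext as below. This is only a sketch of what the functions compute in the clear, not the paper's secure protocols, which operate on data shared between two parties. The function names `gelu` and `top_k_sample` are hypothetical, the exact erf-based form of GELU is an assumption (some GPT implementations use a tanh approximation instead), and temperature scaling is omitted.

```python
import math
import random


def gelu(x: float) -> float:
    """Plaintext GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def top_k_sample(logits, k, rng=random):
    """Sample one index among the k largest logits, using a softmax over those k."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Numerically stable softmax weights restricted to the top-k logits.
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    # random.choices normalizes the weights internally.
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k = 1` the sampler reduces to greedy decoding (it always returns the index of the largest logit); larger `k` trades determinism for diversity in the generated token.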

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Preprint.
Keywords
Secure inference, GPT, LLM, VOLE
Contact author(s)
xiaoyanghou @ zju edu cn
liujian2411 @ zju edu cn
jingyuli @ zju edu cn
yuhan2165 @ zju edu cn
juhou lwj @ antgroup com
vince hc @ antgroup com
kuiren @ zju edu cn
History
2024-05-26: last of 5 revisions
2023-07-25: received
Short URL
https://ia.cr/2023/1147
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2023/1147,
      author = {Xiaoyang Hou and Jian Liu and Jingyu Li and Yuhan Li and Wen-jie Lu and Cheng Hong and Kui Ren},
      title = {{CipherGPT}: Secure Two-Party {GPT} Inference},
      howpublished = {Cryptology {ePrint} Archive, Paper 2023/1147},
      year = {2023},
      url = {https://eprint.iacr.org/2023/1147}
}