Paper 2024/1093

Faster Lookup Table Evaluation with Application to Secure LLM Inference

Xiaoyang Hou, Zhejiang University
Jian Liu, Zhejiang University
Jingyu Li, Zhejiang University
Jiawen Zhang, Zhejiang University
Kui Ren, Zhejiang University
Abstract

As large language models (LLMs) continue to gain popularity, concerns about user privacy are amplified, given that the data submitted by users for inference may contain sensitive information. Therefore, running LLMs through secure two-party computation (a.k.a. secure LLM inference) has emerged as a prominent topic. However, many operations in LLMs, such as Softmax and GELU, cannot be computed using conventional gates in secure computation; instead, lookup tables (LUTs) have to be utilized, which makes LUT evaluation an essential primitive in secure LLM inference. In this paper, we propose $\mathsf{ROTL}$, a secure two-party protocol for LUT evaluations. Compared with FLUTE (the state-of-the-art LUT protocol presented at Oakland '23), it achieves up to 11.6$\times$ speedup in terms of overall performance and 155$\times$ speedup in terms of online performance. Furthermore, $\mathsf{ROTL}$ can support arithmetic shares (which are required by secure LLM inference), whereas FLUTE can only support boolean shares. At the heart of $\mathsf{ROTL}$ is a novel protocol for secret-shared rotation, which allows two parties to generate additive shares of the rotated table without revealing the rotation offset. We believe this protocol is of independent interest. Based on $\mathsf{ROTL}$, we design a novel secure comparison protocol; compared with the state-of-the-art, it achieves a 2.4$\times$ bandwidth reduction in terms of online performance. To support boolean shares, we further provide an optimization for FLUTE that reduces its computational complexity from $O(l\cdot n^2)$ to $O(n\log n+l\cdot n)$ and shifts $O(n\log n)$ computation to the preprocessing phase. As a result, compared with FLUTE, the optimized protocol achieves up to 10.8$\times$ speedup in terms of overall performance and 962$\times$ speedup in terms of online performance.
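
To make the role of a rotated, secret-shared table concrete, the following is a minimal Python sketch of the generic idea of evaluating a LUT through a table rotated by a hidden offset. It is not the $\mathsf{ROTL}$ protocol from the paper: the abstract does not give the construction, so this sketch assumes a trusted dealer in place of the two-party secret-shared rotation, and the names `share`, `preprocess`, and `online` are hypothetical. The dealer rotates the table by a random offset `r` and hands out additive shares of the rotated table; online, the parties open only the masked index `x + r` and read their shares at that position.

```python
import secrets

def share(value, modulus):
    """Split value into two additive shares modulo modulus."""
    s0 = secrets.randbelow(modulus)
    return s0, (value - s0) % modulus

def preprocess(table, out_mod):
    """Assumed trusted dealer (a stand-in for a two-party rotation protocol).

    Samples a random offset r, rotates the table so that
    rotated[j] = table[(j - r) mod n], and additively shares
    both r and every entry of the rotated table.
    """
    n = len(table)
    r = secrets.randbelow(n)
    rotated = [table[(j - r) % n] for j in range(n)]
    entry_shares = [share(v, out_mod) for v in rotated]
    t0 = [s[0] for s in entry_shares]
    t1 = [s[1] for s in entry_shares]
    r0, r1 = share(r, n)
    return (r0, t0), (r1, t1)

def online(x_shares, corr0, corr1, n):
    """Online phase: open d = x + r mod n, then index the shared table locally."""
    (r0, t0), (r1, t1) = corr0, corr1
    m0 = (x_shares[0] + r0) % n  # message published by party 0
    m1 = (x_shares[1] + r1) % n  # message published by party 1
    d = (m0 + m1) % n            # masked index; hides x because r is uniform
    # rotated[d] = table[(d - r) mod n] = table[x], held as additive shares
    return t0[d], t1[d]

# Usage: look up a secret-shared index in a public 8-entry table.
OUT_MOD = 2 ** 16
table = [(3 * i + 1) % OUT_MOD for i in range(8)]
x = 5
c0, c1 = preprocess(table, OUT_MOD)
y0, y1 = online(share(x, len(table)), c0, c1, len(table))
assert (y0 + y1) % OUT_MOD == table[x]
```

In this sketch the output is an arithmetic (additive) sharing of `table[x]`, which mirrors the abstract's point that arithmetic shares are what secure LLM inference needs. The online phase costs one opened value and one local table read per party; all work proportional to the table size sits in preprocessing.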

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Preprint.
Keywords
Secure Two-Party Computation, Look Up Table, Secure Inference
Contact author(s)
xiaoyanghou @ zju edu cn
liujian2411 @ zju edu cn
jingyuli @ zju edu cn
kevinzh @ zju edu cn
kuiren @ zju edu cn
History
2024-07-05: approved
2024-07-04: received
Short URL
https://ia.cr/2024/1093
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2024/1093,
      author = {Xiaoyang Hou and Jian Liu and Jingyu Li and Jiawen Zhang and Kui Ren},
      title = {Faster Lookup Table Evaluation with Application to Secure {LLM} Inference},
      howpublished = {Cryptology ePrint Archive, Paper 2024/1093},
      year = {2024},
      note = {\url{https://eprint.iacr.org/2024/1093}},
      url = {https://eprint.iacr.org/2024/1093}
}