Paper 2025/1200
Tricycle: Private Transformer Inference with Tricyclic Encodings
Abstract
The growing adoption of Large Language Models in privacy-sensitive domains necessitates secure inference mechanisms that preserve data confidentiality. Homomorphic encryption offers a promising pathway by enabling computation on encrypted inputs, yet existing approaches struggle to scale to full transformer models due to limitations in packing schemes, which must efficiently support a wide range of operations, including matrix multiplications, row-wise nonlinear operations, and self-attention. In this work, we present Tricycle, a framework for private transformer inference built on our novel packing scheme, called tricyclic encodings, which are designed to efficiently support these core operations. Tricyclic encodings generalize bicyclic encodings, enabling privacy-preserving batch matrix multiplications with optimal multiplicative depth, which facilitates parallelized multi-head self-attention. We optimize our matrix multiplications by incorporating Baby-Step Giant-Step techniques to reduce ciphertext rotations and by presenting new ciphertext-plaintext matrix multiplication methods that relax prior limitations. A further contribution of our work is a lightweight and effective approach for stabilizing the softmax function via statistical max estimation. Our end-to-end implementation on a BERT-Tiny model shows that Tricycle achieves a \(1.5 \times\) to \(3 \times\) speedup over previous approaches, marking a step toward practical and scalable private LLM inference without sacrificing model fidelity.
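To make the softmax-stabilization idea concrete, the sketch below shows a plain (unencrypted) softmax shifted by a precomputed estimate of the maximum rather than the exact row-wise maximum, which is expensive to obtain under homomorphic encryption. This is a minimal illustrative sketch only: the function name `stabilized_softmax` and the way the estimate `est_max` is obtained are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def stabilized_softmax(scores: np.ndarray, est_max: float) -> np.ndarray:
    """Plaintext sketch of softmax stabilized with an estimated max.

    Under homomorphic encryption the exact row-wise max is costly to
    compute, so a statistical estimate of the maximum is subtracted
    instead to keep the exponentiation inputs in a stable range.
    (How the estimate is calibrated is an assumption here, not the
    paper's construction.)
    """
    shifted = scores - est_max          # shift by the estimate, not the true max
    exps = np.exp(shifted)              # under HE this would be a polynomial approximation
    return exps / exps.sum(axis=-1, keepdims=True)

# Example: attention scores for one query row of one head.
scores = np.array([3.2, 5.1, 4.7, 2.9])
probs = stabilized_softmax(scores, est_max=5.0)  # hypothetical calibrated estimate
print(probs, probs.sum())
```

For the rotation savings mentioned above, Baby-Step Giant-Step decompositions are commonly used in homomorphic linear algebra to replace the roughly \(n\) distinct ciphertext rotations of a naive diagonal-based matrix-vector product with on the order of \(2\sqrt{n}\) rotations.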
Metadata
- Available format(s)
- PDF
- Category
- Applications
- Publication info
- Preprint.
- Keywords
- Private LLM Inference, Transformers, Homomorphic Encryption
- Contact author(s)
- lawrenceklim @ ucsb edu
- vikaskalagi @ ucsb edu
- divyagrawal @ ucsb edu
- elabbadi @ ucsb edu
- History
- 2025-06-30: revised
- 2025-06-27: received
- Short URL
- https://ia.cr/2025/1200
- License
- CC BY
BibTeX
@misc{cryptoeprint:2025/1200,
      author = {Lawrence Lim and Vikas Kalagi and Divyakant Agrawal and Amr El Abbadi},
      title = {Tricycle: Private Transformer Inference with Tricyclic Encodings},
      howpublished = {Cryptology {ePrint} Archive, Paper 2025/1200},
      year = {2025},
      url = {https://eprint.iacr.org/2025/1200}
}