Paper 2025/653
Fission: Distributed Privacy-Preserving Large Language Model Inference
Abstract
The increased popularity of large language models (LLMs) raises serious privacy concerns, as users' private queries are sent to untrusted servers. Many cryptographic techniques, such as secure multiparty computation (MPC), have been proposed to provide privacy by enabling the evaluation of LLMs directly on private data. However, cryptographic techniques have been deemed impractical as they introduce large communication and computation overheads. On the other hand, many obfuscation techniques have been proposed, such as split inference, where part of the model is evaluated on edge devices to hide the input data from untrusted servers, but these methods provide only limited privacy guarantees. We propose Fission, a privacy-preserving framework that improves latency while providing strong privacy guarantees. Fission utilizes an MPC network for linear computations, while nonlinearities are computed on a separate evaluator network that receives shuffled values in the clear and returns the nonlinear functions evaluated at these values to the MPC network. As a result, each evaluator only gets access to parts of the shuffled data, while the model weights remain private. We evaluate Fission on a wide set of LLMs and compare it against prior works. Fission achieves up to eight times faster inference and eight times lower bandwidth than prior works while retaining high accuracy. Finally, we construct an attack on obfuscation techniques from related works that shows significant information leakage, and we demonstrate how Fission enhances privacy.
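To make the division of labor described above concrete, the following is a minimal Python sketch of the idea, not the paper's actual protocol: an MPC network holds additive secret shares, jointly shuffles them before opening, and hands each evaluator only a slice of the permuted cleartext for nonlinear evaluation. The field modulus, fixed-point scale, 3-party / 2-evaluator topology, the local permutation standing in for a joint oblivious shuffle, and the choice of GeLU are all illustrative assumptions.

```python
# Toy sketch (assumptions throughout): MPC network does linear algebra on
# additive secret shares; evaluators compute nonlinearities on shuffled
# cleartext slices, so no single evaluator sees the whole ordered vector.
from math import erf, sqrt
import secrets
import numpy as np

PRIME = 2**61 - 1   # toy prime field modulus (assumption)
SCALE = 2**16       # fixed-point scale for encoding reals (assumption)

def encode(xs):
    return [round(x * SCALE) % PRIME for x in xs]

def decode(xs):
    return [((x - PRIME) if x > PRIME // 2 else x) / SCALE for x in xs]

def share(xs, n_parties=3):
    """Additively secret-share a field vector among n_parties."""
    rand = [[secrets.randbelow(PRIME) for _ in xs] for _ in range(n_parties - 1)]
    last = [(x - sum(col)) % PRIME for x, col in zip(xs, zip(*rand))]
    return rand + [last]

def reconstruct(shares):
    return [sum(col) % PRIME for col in zip(*shares)]

def gelu(x):
    # Nonlinearity each evaluator computes in the clear on its slice.
    return 0.5 * x * (1.0 + erf(x / sqrt(2.0)))

def shuffled_nonlinear(shares, n_evaluators=2):
    """MPC network shuffles the shared vector, opens the shuffled values,
    and gives each evaluator only a slice of the permuted cleartext."""
    m = len(shares[0])
    perm = np.random.permutation(m)      # stand-in for a joint oblivious shuffle
    opened = reconstruct(shares)         # opened only AFTER shuffling
    shuffled = [opened[i] for i in perm]
    chunks = np.array_split(np.array(shuffled, dtype=object), n_evaluators)
    results = []
    for chunk in chunks:                 # each evaluator works independently
        results.extend(encode([gelu(v) for v in decode(list(chunk))]))
    unshuffled = [0] * m
    for dst, src in enumerate(perm):     # MPC network undoes the permutation
        unshuffled[src] = results[dst]
    return share(unshuffled)             # re-shared for the next linear layer

if __name__ == "__main__":
    x = [-1.5, 0.25, 2.0, -0.75]
    out = reconstruct(shuffled_nonlinear(share(encode(x))))
    print(decode(out))   # ~ [gelu(v) for v in x]
```

In this sketch the privacy comes from the shuffle and the partitioning: each evaluator sees cleartext values, but only an unordered fraction of them; the paper's protocol additionally keeps the model weights secret-shared inside the MPC network.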
Metadata
- Available format(s)
- PDF
- Category
- Implementation
- Publication info
- Preprint.
- Keywords
- Applied Cryptography, Large Language Models, Machine Learning, Multiparty Computation
- Contact author(s)
- memo @ nillion com
  dimitris @ nillion com
  manuel santos @ nillion com
  jose cabrero @ nillion com
  miguel @ nillion com
  shubho @ gmail com
- History
- 2025-04-13: approved
- 2025-04-09: received
- Short URL
- https://ia.cr/2025/653
- License
- CC BY
BibTeX
@misc{cryptoeprint:2025/653,
      author = {Mehmet Ugurbil and Dimitris Mouris and Manuel B. Santos and José Cabrero-Holgueras and Miguel de Vega and Shubho Sengupta},
      title = {Fission: Distributed Privacy-Preserving Large Language Model Inference},
      howpublished = {Cryptology {ePrint} Archive, Paper 2025/653},
      year = {2025},
      url = {https://eprint.iacr.org/2025/653}
}