Paper 2025/653

Fission: Distributed Privacy-Preserving Large Language Model Inference

Mehmet Ugurbil, Nillion
Dimitris Mouris, Nillion
Manuel B. Santos, Nillion
José Cabrero-Holgueras, Nillion
Miguel de Vega, Nillion
Shubho Sengupta, Meta
Abstract

The increased popularity of large language models (LLMs) raises serious privacy concerns, as users' private queries are sent to untrusted servers. Many cryptographic techniques, such as secure multiparty computation (MPC), have been proposed to provide privacy by enabling the evaluation of LLMs directly on private data. However, cryptographic techniques have been deemed impractical as they introduce large communication and computation overheads. On the other hand, many obfuscation techniques have been proposed, such as split inference, where part of the model is evaluated on edge devices to hide the input data from untrusted servers; however, these methods provide limited privacy guarantees. We propose Fission, a privacy-preserving framework that improves latency while providing strong privacy guarantees. Fission utilizes an MPC network for linear computations, while nonlinearities are computed on a separate evaluator network that receives shuffled values in the clear and returns the nonlinear functions evaluated at these values back to the MPC network. As a result, each evaluator only gets access to parts of the shuffled data, while the model weights remain private. We evaluate Fission on a wide range of LLMs and compare it against prior works. Fission achieves up to eight times faster inference and eight times lower bandwidth than prior works while retaining high accuracy. Finally, we construct an attack on the obfuscation techniques of related works that shows significant information leakage, and we demonstrate how Fission enhances privacy.
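To make the split between the two networks concrete, below is a minimal, illustrative Python sketch of the general pattern the abstract describes: linear layers computed on additive secret shares, and a nonlinearity evaluated on shuffled clear values that are then re-shared. It is not the paper's implementation; the prime modulus, fixed-point encoding, three-party sharing, public toy weights, and ReLU nonlinearity are all assumptions for exposition (Fission keeps the model weights private and uses its own shuffling and evaluator protocol).

```python
# Toy sketch (not Fission's actual protocol): additive secret sharing for the
# linear step, shuffled clear-value evaluation for the nonlinear step.
import secrets
import numpy as np

P = 2**61 - 1   # prime modulus for additive sharing (illustrative choice)
SCALE = 2**16   # fixed-point scaling factor (illustrative choice)

def share(x, n_parties=3):
    """Split an integer vector x into n_parties additive shares mod P."""
    shares = [np.array([secrets.randbelow(P) for _ in x], dtype=object)
              for _ in range(n_parties - 1)]
    last = (np.array(x, dtype=object) - sum(shares)) % P
    return shares + [last]

def reconstruct(shares):
    return sum(shares) % P

def encode(v):   # real values -> fixed-point field elements
    return [int(round(x * SCALE)) % P for x in v]

def decode(v):   # field elements -> real values (centered representation)
    return [(x - P if x > P // 2 else x) / SCALE for x in v]

# Linear step: each MPC party applies the matrix to its own share locally,
# since linear maps commute with additive sharing. (Weights are public here
# only to keep the sketch short; in Fission they remain private.)
W = np.array([[2, 0], [1, 1]], dtype=object)
x = [0.5, -1.25]
shares = share(encode(x))
lin_shares = [(W @ s) % P for s in shares]

# Nonlinear step: open the values, shuffle so evaluators only see clear
# values at hidden positions, and let an evaluator compute e.g. ReLU.
vals = decode(reconstruct(lin_shares))
perm = np.random.permutation(len(vals))
shuffled = [vals[i] for i in perm]
evaluated = [max(0.0, v) for v in shuffled]   # evaluator's work, in the clear

# Undo the shuffle and re-share the outputs for the next linear layer.
unshuffled = [0.0] * len(vals)
for out_pos, in_pos in enumerate(perm):
    unshuffled[in_pos] = evaluated[out_pos]
next_shares = share(encode(unshuffled))
print(decode(reconstruct(next_shares)))  # ReLU(W @ x), up to fixed-point error
```

In this toy version a single evaluator sees every (shuffled) value; the framework described in the abstract instead distributes the shuffled values across an evaluator network so that each evaluator only sees a part of them.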

Metadata
Available format(s)
PDF
Category
Implementation
Publication info
Preprint.
Keywords
Applied Cryptography, Large Language Models, Machine Learning, Multiparty Computation
Contact author(s)
memo @ nillion com
dimitris @ nillion com
manuel santos @ nillion com
jose cabrero @ nillion com
miguel @ nillion com
shubho @ gmail com
History
2025-04-13: approved
2025-04-09: received
Short URL
https://ia.cr/2025/653
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2025/653,
      author = {Mehmet Ugurbil and Dimitris Mouris and Manuel B. Santos and José Cabrero-Holgueras and Miguel de Vega and Shubho Sengupta},
      title = {Fission: Distributed Privacy-Preserving Large Language Model Inference},
      howpublished = {Cryptology {ePrint} Archive, Paper 2025/653},
      year = {2025},
      url = {https://eprint.iacr.org/2025/653}
}