BOLT: Privacy-Preserving, Accurate and Efficient Inference for Transformers

Qi Pang; Jinhao Zhu; Helen Möllering; Wenting Zheng; Thomas Schneider

Paper 2023/1893

Qi Pang, Carnegie Mellon University

Jinhao Zhu, University of California, Berkeley

Helen Möllering, Technical University of Darmstadt

Wenting Zheng, Carnegie Mellon University

Thomas Schneider, Technical University of Darmstadt

Abstract

The advent of transformers has brought about significant advancements in traditional machine learning tasks. However, their pervasive deployment has raised concerns about the potential leakage of sensitive information during inference. Existing approaches using secure multiparty computation (MPC) face limitations when applied to transformers due to the extensive model size and resource-intensive matrix-matrix multiplications. In this paper, we present BOLT, a privacy-preserving inference framework for transformer models that supports efficient matrix multiplications and nonlinear computations. Combined with our novel machine learning optimizations, BOLT reduces the communication cost by 10.91×. Our evaluation on diverse datasets demonstrates that BOLT maintains comparable accuracy to floating-point models and achieves 4.8–9.5× faster inference across various network settings compared to the state-of-the-art system.

Metadata

Available format(s): PDF
Category: Cryptographic protocols
Publication info: Published elsewhere. Minor revision. IEEE S&P 2024
Keywords: secure multi-party computation, homomorphic encryption, secure machine learning inference, transformer
Contact author(s): qipang @ cmu edu
jinhao zhu @ berkeley edu
moellering @ encrypto cs tu-darmstadt de
wenting @ cmu edu
schneider @ encrypto cs tu-darmstadt de
History: 2024-07-06: last of 5 revisions; 2023-12-09: received
Short URL: https://ia.cr/2023/1893
License: CC BY

BibTeX

@misc{cryptoeprint:2023/1893,
      author = {Qi Pang and Jinhao Zhu and Helen Möllering and Wenting Zheng and Thomas Schneider},
      title = {{BOLT}: Privacy-Preserving, Accurate and Efficient Inference for Transformers},
      howpublished = {Cryptology {ePrint} Archive, Paper 2023/1893},
      year = {2023},
      url = {https://eprint.iacr.org/2023/1893}
}