Paper 2024/1881

THOR: Secure Transformer Inference with Homomorphic Encryption

Jungho Moon, Hanyang University
Dongwoo Yoo, Yonsei University
Xiaoqian Jiang, University of Texas, Health Science Center at Houston
Miran Kim, Hanyang University
Abstract

As language models are increasingly deployed in cloud environments, privacy concerns have become a significant issue. To address this, we design THOR, a secure inference framework for transformer models on encrypted data. Specifically, we first propose new fast matrix multiplication algorithms based on diagonal-major order encoding and extend them to parallel matrix computation through the compact ciphertext packing technique. Second, we design efficient protocols for secure computation of four non-linear functions (softmax, LayerNorm, GELU, and Tanh) by integrating advanced underlying approximation methods with tailored optimizations. Our matrix multiplication algorithms reduce the number of key-switching operations in the linear layers of the attention block in the BERT-base model by up to 14.5x, compared to the state-of-the-art HE-based secure inference protocol (Park et al., Preprint). Combined with cryptographic optimizations, our experimental results demonstrate that THOR provides secure inference for the BERT-base model with a latency of 10.43 minutes on a single GPU, while maintaining comparable inference accuracy on the MRPC dataset.
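
To give a feel for what diagonal-style encoding means, the following is a minimal plaintext sketch of the generic diagonal method for a matrix-vector product. It is not THOR's algorithm and uses no encryption: the function names (`rotate`, `diagonals`, `diag_matvec`) are hypothetical, and ordinary Python lists stand in for ciphertext slots. The sketch only illustrates why rotations matter: in an HE scheme each slot rotation typically requires a key-switching operation, so an algorithm that needs fewer rotations needs fewer key switches.

```python
def rotate(vec, k):
    """Cyclic left rotation by k slots (stand-in for an HE slot rotation)."""
    k %= len(vec)
    return vec[k:] + vec[:k]


def diagonals(matrix):
    """Return the generalized diagonals of a square matrix.

    Diagonal i holds the entries matrix[j][(j + i) % n] for j = 0..n-1.
    """
    n = len(matrix)
    return [[matrix[j][(j + i) % n] for j in range(n)] for i in range(n)]


def diag_matvec(matrix, vec):
    """Compute matrix @ vec using only slot-wise products and rotations."""
    n = len(vec)
    acc = [0] * n
    for i, diag in enumerate(diagonals(matrix)):
        rotated = rotate(vec, i)  # one rotation per diagonal (i = 0 is free)
        acc = [a + d * r for a, d, r in zip(acc, diag, rotated)]
    return acc


def plain_matvec(matrix, vec):
    """Reference row-by-row product, used only to check the result."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
v = [1, 0, 2]
result = diag_matvec(A, v)
reference = plain_matvec(A, v)
```

Here `result` and `reference` agree, and the loop performs one rotation per non-zero diagonal index. This naive version uses n - 1 rotations for an n x n matrix; it is shown only as a baseline for the kind of cost the abstract refers to.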

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Preprint.
Keywords
Homomorphic encryption, transformer
Contact author(s)
moonjungho @ hanyang ac kr
aydw0507 @ yonsei ac kr
Xiaoqian Jiang @ uth tmc edu
miran @ hanyang ac kr
History
2024-11-22: approved
2024-11-19: received
Short URL
https://ia.cr/2024/1881
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2024/1881,
      author = {Jungho Moon and Dongwoo Yoo and Xiaoqian Jiang and Miran Kim},
      title = {{THOR}: Secure Transformer Inference with Homomorphic Encryption},
      howpublished = {Cryptology {ePrint} Archive, Paper 2024/1881},
      year = {2024},
      url = {https://eprint.iacr.org/2024/1881}
}