Paper 2024/1429
Powerformer: Efficient Privacy-Preserving Transformer with Batch Rectifier-Power Max Function and Optimized Homomorphic Attention
We propose Powerformer, an efficient non-interactive privacy-preserving Transformer inference architecture. Because softmax is a non-algebraic operation, previous studies have tried to modify it to be friendly to homomorphic encryption (HE), but these methods suffer from accuracy degradation or from long execution times caused by multiple bootstrappings. We propose replacing softmax with a new ReLU-based function, the Batch Rectifier-Power max (BRPmax) function, which uses no unstable approximation methods. On the BERT-Large model it outperforms even the original BERT, and it requires fewer levels, so it can operate with only a single bootstrapping. We also present matrix multiplication algorithms specialized for the attention block that reduce the number of key-switchings by 35% to 91% compared to existing state-of-the-art methods. We design a clear end-to-end HE-based implementation of a private Transformer model. Our implementation of Powerformer on the BERT-tiny model using RNS-CKKS takes 503 seconds on a single-threaded CPU. To the best of our knowledge, this is the first end-to-end non-interactive Transformer implementation using HE.
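The abstract does not define BRPmax, so the following is only a minimal plaintext sketch of the general idea of a ReLU-power replacement for softmax, not the authors' function. The name `relu_power_max` and the parameters `shift` and `power` are hypothetical. How the paper handles normalization and batching under encryption is not stated here.

```python
def relu_power_max(scores, shift=1.0, power=2):
    """Hypothetical ReLU-power normalization (an assumption, not the paper's BRPmax).

    Each score is shifted, rectified with ReLU, raised to an integer power,
    and divided by the sum, so no exponential function is needed.
    """
    rectified = [max(x + shift, 0.0) ** power for x in scores]
    total = sum(rectified)
    if total == 0.0:
        # Every score fell below -shift: fall back to uniform weights.
        return [1.0 / len(scores)] * len(scores)
    return [r / total for r in rectified]


# Example attention scores for one query over three keys.
weights = relu_power_max([2.0, 1.0, -3.0])
```

Like softmax, the result is non-negative, sums to one, and preserves the ordering of the scores. Unlike softmax, scores far enough below zero get exactly zero weight.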
- Category
- Applications
- Publication info
- Preprint.
- Keywords
- Privacy-Preserving Machine Learning, Homomorphic Encryption, Transformer, Implementation
- Contact author(s)
thrudgelmir @ cau ac kr
eslee3209 @ sejong ac kr
jwlee2815 @ cau ac kr
- History
- 2024-09-14: approved
- 2024-09-12: received
@misc{cryptoeprint:2024/1429,
  author = {Dongjin Park and Eunsang Lee and Joon-Woo Lee},
  title = {Powerformer: Efficient Privacy-Preserving Transformer with Batch Rectifier-Power Max Function and Optimized Homomorphic Attention},
  howpublished = {Cryptology {ePrint} Archive, Paper 2024/1429},
  year = {2024},
  url = {}
}