PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization

Tianshi Xu; Shuzhang Zhong; Wenxuan Zeng; Runsheng Wang; Meng Li

Paper 2024/2021

Tianshi Xu, Peking University

Shuzhang Zhong, Peking University

Wenxuan Zeng, Peking University

Runsheng Wang, Peking University

Meng Li, Peking University

Abstract

Private deep neural network (DNN) inference based on secure two-party computation (2PC) protects the privacy of both the server and the client. However, existing 2PC frameworks suffer from high inference latency due to enormous communication. Because the communication of both linear and non-linear DNN layers decreases with the bit widths of weights and activations, in this paper we propose PrivQuant, a framework that jointly optimizes the 2PC-based quantized inference protocols and the network quantization algorithm, enabling communication-efficient private inference. PrivQuant proposes DNN architecture-aware optimizations of the 2PC protocols for communication-intensive quantized operators and conducts graph-level operator fusion to reduce communication. Moreover, PrivQuant develops a communication-aware mixed-precision quantization algorithm to improve inference efficiency while maintaining high accuracy. The network/protocol co-optimization enables PrivQuant to outperform prior-art 2PC frameworks. With extensive experiments, we demonstrate that PrivQuant reduces communication by …, which results in … latency reduction compared with SiRNN, COINN, and CoPriv, respectively.
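As a rough illustration of why bit width matters, the following minimal sketch applies symmetric uniform quantization to a small list of activations at two bit widths and reports the raw number of bits needed to represent them. It is not PrivQuant's quantization algorithm or any of its 2PC protocols: the function names (`quantize`, `dequantize`, `payload_bits`) and the sample values are hypothetical, and the bit count covers only the plain quantized tensor, not actual protocol communication.

```python
def quantize(values, bits):
    """Symmetric uniform quantization of floats to signed `bits`-bit integers."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed representation")
    qmax = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clip to [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate real values."""
    return [x * scale for x in q]


def payload_bits(num_elements, bits):
    """Raw size in bits of a tensor with `num_elements` entries of `bits` bits each."""
    return num_elements * bits


# Hypothetical activation values, chosen only for illustration.
activations = [0.5, -1.0, 0.25, 0.75]

for bits in (8, 4):
    q, scale = quantize(activations, bits)
    approx = dequantize(q, scale)
    max_err = max(abs(a - b) for a, b in zip(activations, approx))
    print(f"{bits}-bit: levels={q}, raw bits={payload_bits(len(q), bits)}, "
          f"max error={max_err:.4f}")
```

Halving the bit width halves the raw representation size while increasing the worst-case rounding error, which is the accuracy/efficiency trade-off that a mixed-precision scheme has to balance per layer.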

Metadata

Available format(s): PDF
Category: Applications
Publication info: Published elsewhere. ICCAD'24
Keywords: Privacy-Preserving Deep Learning, Multi-Party Computation, Oblivious Transfer
Contact author(s): tianshixu @ stu pku edu cn
History: 2024-12-13: approved; 2024-12-13: received
Short URL: https://ia.cr/2024/2021
License: CC BY-NC

BibTeX

@misc{cryptoeprint:2024/2021,
      author = {Tianshi Xu and Shuzhang Zhong and Wenxuan Zeng and Runsheng Wang and Meng Li},
      title = {{PrivQuant}: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization},
      howpublished = {Cryptology {ePrint} Archive, Paper 2024/2021},
      year = {2024},
      url = {https://eprint.iacr.org/2024/2021}
}