Paper 2024/1371

PIGEON: A Framework for Private Inference of Neural Networks

Christopher Harth-Kitzerow, Technical University of Munich, BMW Group
Yongqin Wang, University of Southern California
Rachit Rajat, University of Southern California
Georg Carle, Technical University of Munich
Murali Annavaram, University of Southern California
Abstract

Privacy-Preserving Machine Learning (PPML) is one of the most relevant use cases for Secure Multiparty Computation (MPC). While private training of large neural networks such as VGG-16 or ResNet-50 on state-of-the-art datasets such as ImageNet is still out of reach due to the performance overhead of MPC, GPU-based MPC frameworks are starting to achieve practical runtimes for private inference. However, we show that, in contrast to plaintext machine learning, using GPU acceleration for both linear layers (e.g., convolutions) and nonlinear layers (e.g., ReLU) is counterproductive in PPML. While GPUs effectively accelerate linear layers compared to CPU-based MPC implementations, the MPC circuits required to evaluate non-linear layers introduce memory overhead and frequent data movement between the GPU and the CPU to handle network communication. This results in slow ReLU performance and high GPU memory requirements in state-of-the-art GPU-based PPML frameworks, preventing them from reaching an inference throughput of multiple images per second or batch sizes larger than eight images on ImageNet. To overcome these limitations, we propose PIGEON, an open-source framework for Private Inference of Neural Networks. PIGEON utilizes a novel ABG programming model that switches between arithmetic vectorization and bitslicing on the CPU for non-linear layers, depending on the MPC-specific computation required, while offloading linear layers to the GPU. Compared to the state-of-the-art PPML framework Piranha, PIGEON achieves a two-order-of-magnitude improvement in ReLU throughput, reduces peak GPU memory utilization by one order of magnitude, and scales better with large batch sizes. This translates to a one-to-two-order-of-magnitude improvement in throughput for large ImageNet batch sizes (e.g., 192) and more than 70% saturation of a 25 Gbit/s network.
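The bitslicing half of the ABG model mentioned above can be illustrated with a toy sketch: instead of evaluating a Boolean MPC circuit one value at a time, the j-th bits of a whole batch of values are transposed into one machine word, so that a single bitwise instruction advances the circuit for every value in the batch simultaneously. The snippet below is a simplified, hypothetical illustration of this technique only (the function names and the 16-bit toy width are not PIGEON's actual API); it shows how the sign bits needed for ReLU/DReLU can be read off for a full batch in one operation once the data is bitsliced.

```python
# Toy illustration of bitslicing (hypothetical names, not PIGEON's API).
# Real implementations use 32/64-bit ring elements and wide SIMD registers.

K = 16  # toy bit width of each value

def bitslice(values, k=K):
    """Transpose a batch of k-bit integers into k words:
    bit i of word j is bit j of values[i]."""
    words = []
    for j in range(k):
        w = 0
        for i, v in enumerate(values):
            w |= ((v >> j) & 1) << i
        words.append(w)
    return words

def unbitslice(words, n, k=K):
    """Inverse transpose: recover the n original k-bit integers."""
    values = []
    for i in range(n):
        v = 0
        for j in range(k):
            v |= ((words[j] >> i) & 1) << j
        values.append(v)
    return values

def batch_msb(words, k=K):
    """MSB (two's-complement sign bit) of every value in the batch,
    obtained in a single operation on the bitsliced representation.
    In an MPC protocol this is where a Boolean sharing of the sign
    bit would be produced for DReLU."""
    return words[k - 1]

# Mixed "positive" and "negative" 16-bit ring elements:
vals = [3, 0xFFFB, 12, 0x8000]
sliced = bitslice(vals)
signs = batch_msb(sliced)  # bit i of `signs` is the sign of vals[i]
```

In an actual MPC protocol the bitwise operations above run on secret shares, and each AND gate additionally requires communication between parties; the payoff of bitslicing is that one word-level instruction (and one message) covers 64 or more parallel ReLU evaluations at once.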

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Preprint.
Keywords
Privacy-Preserving Machine Learning, Secure Inference, Secure Multiparty Computation, Implementation
Contact author(s)
christopher harth-kitzerow @ tum de
yongqin @ usc edu
rrajt @ usc edu
carle @ net in tum de
annavara @ usc edu
History
2024-12-01: last of 2 revisions
2024-09-01: received
See all versions
Short URL
https://ia.cr/2024/1371
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2024/1371,
      author = {Christopher Harth-Kitzerow and Yongqin Wang and Rachit Rajat and Georg Carle and Murali Annavaram},
      title = {{PIGEON}: A Framework for Private Inference of Neural Networks},
      howpublished = {Cryptology {ePrint} Archive, Paper 2024/1371},
      year = {2024},
      url = {https://eprint.iacr.org/2024/1371}
}