Paper 2023/804
Falkor: Federated Learning Secure Aggregation Powered by AES-CTR GPU Implementation
Abstract
We propose Falkor, a novel protocol for secure aggregation in Federated Learning in the multi-server scenario, based on masking local models with a stream cipher built on AES in counter mode and accelerated by GPUs running on the aggregating servers. The protocol is resilient to client dropout and reduces the client-server communication cost by a factor equal to the number of aggregating servers (compared to the naïve baseline method). It scales simultaneously along the two major complexity dimensions: 1) a large number of clients; 2) highly complex machine learning models such as CNNs, RNNs, and Transformers. The AES-CTR-based masking function in our aggregation protocol builds on the concept of counter-based cryptographically secure pseudorandom number generators (csPRNGs) described in [SMDS'11] and subsequently used by Facebook in their torchcsprng csPRNG. We improve upon torchcsprng through careful use of shared memory on the GPU device, a recent idea of Cihangir Tezcan [Tezcan'21], and obtain a 100x speedup in the masking function compared to a single CPU core. In addition, we prove the semantic security of the AES-CTR-based masking function. Finally, we demonstrate the scalability of our protocol in two real-world Federated Learning scenarios: 1) efficient training of large logistic regression models with 50 features and 50M data points distributed across 1000 clients that may drop out, securely aggregated via three servers running secure multi-party computation (SMPC); 2) training a recurrent neural network (RNN) model for sentiment analysis of Twitter feeds coming from more than 250,000 Twitter users. In case 1), our secure aggregation algorithm runs in less than a minute, compared to a pure MPC computation (on 3 parties) that takes 27 hours and requires machines with 400 GB of RAM and a 1 Gbit/s network. In case 2), the total training takes around 10 minutes using our GPU-powered secure aggregation versus 10 hours using a single CPU core.
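To make the masking idea concrete, here is a minimal sketch (not the authors' implementation) of AES-CTR additive masking in a simplified two-client, pairwise-masking setup; the paper's multi-server protocol instead distributes mask handling across the aggregating servers and runs the keystream generation on GPUs. The sketch assumes pycryptodome for AES-CTR and numpy for mod-2^32 arithmetic; all key and nonce values are toy placeholders.

```python
# Minimal sketch of AES-CTR additive masking for secure aggregation.
# Illustrative CPU version only; not the paper's GPU implementation.
import numpy as np
from Crypto.Cipher import AES  # pycryptodome


def ctr_keystream_u32(key: bytes, nonce: bytes, n: int) -> np.ndarray:
    """Expand an AES-CTR keystream into n pseudorandom 32-bit words."""
    cipher = AES.new(key, AES.MODE_CTR, nonce=nonce)
    return np.frombuffer(cipher.encrypt(b"\x00" * (4 * n)), dtype=np.uint32)


n = 8                                              # toy model size
rng = np.random.default_rng(0)
w_a = rng.integers(0, 1000, n, dtype=np.uint32)    # client A's quantized weights
w_b = rng.integers(0, 1000, n, dtype=np.uint32)    # client B's quantized weights

pair_key = bytes(16)                               # shared pairwise key (toy value)
round_nonce = b"\x00" * 8                          # fresh per aggregation round
mask = ctr_keystream_u32(pair_key, round_nonce, n)

# All arithmetic is mod 2^32; numpy uint32 wraparound implements this exactly.
masked_a = w_a + mask                              # client A adds the shared mask
masked_b = w_b - mask                              # client B subtracts the same mask

agg = masked_a + masked_b                          # aggregator's sum: masks cancel
assert np.array_equal(agg, w_a + w_b)              # equals the true aggregate
```

Because the masks cancel in the sum, the aggregator learns only the aggregate; the semantic security result stated in the abstract is what guarantees that an individual masked model is indistinguishable from random.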
Metadata
- Available format(s)
- PDF
- Category
- Cryptographic protocols
- Publication info
- Preprint.
- Keywords
- Federated Learning, Secure Aggregation, MPC, GPU Optimizations
- Contact author(s)
- maria.georgievabs@gmail.com
- sofiadandjee11@gmail.com
- nicolas.gama@gmail.com
- dimitar@inpher.io
- dmitry.mikushin@unil.ch
- History
- 2023-06-06: approved
- 2023-06-01: received
- Short URL
- https://ia.cr/2023/804
- License
- CC BY
BibTeX
@misc{cryptoeprint:2023/804,
      author = {Mariya Georgieva Belorgey and Sofia Dandjee and Nicolas Gama and Dimitar Jetchev and Dmitry Mikushin},
      title = {Falkor: Federated Learning Secure Aggregation Powered by {AES}-{CTR} {GPU} Implementation},
      howpublished = {Cryptology {ePrint} Archive, Paper 2023/804},
      year = {2023},
      url = {https://eprint.iacr.org/2023/804}
}