Cryptology ePrint Archive: Report 2021/1636

Does Fully Homomorphic Encryption Need Compute Acceleration?

Leo de Castro and Rashmi Agrawal and Rabia Yazicigil and Anantha Chandrakasan and Vinod Vaikuntanathan and Chiraag Juvekar and Ajay Joshi

Abstract: The emergence of cloud-computing has raised important privacy questions about the data that users share with remote servers. While data in transit is protected using standard techniques like Transport Layer Security (TLS), most cloud providers have unrestricted plaintext access to user data at the endpoint. Fully Homomorphic Encryption (FHE) offers one solution to this problem by allowing for arbitrarily complex computations on encrypted data without ever needing to decrypt it. Unfortunately, all known implementations of FHE require the addition of noise during encryption which grows during computation. As a result, sustaining deep computations requires a periodic noise reduction step known as bootstrapping. The cost of the bootstrapping operation is one of the primary barriers to the wide-spread adoption of FHE.In this paper, we present an in-depth architectural analysis of the bootstrapping step in FHE. First, we observe that se-cure implementations of bootstrapping exhibit a low arithmetic intensity (<1Op/byte), require large caches (>100MB) and as such, are heavily bound by the main memory bandwidth.Consequently, we demonstrate that existing workloads observe marginal performance gains from the design of bespoke high-throughput arithmetic units tailored to FHE. Secondly, we propose several cache-friendly algorithmic optimizations that improve the throughput in FHE bootstrapping by enabling upto3.2×higher arithmetic intensity and4.6×lower memory bandwidth. Our optimizations apply to a wide range of structurally similar computations such as private evaluation and training of machine learning models. Finally, we incorporate these optimizations into an architectural tool which, given a cache size, memory subsystem, the number of functional units and a desired security level, selects optimal cryptosystem parameters to maximize the bootstrapping throughput.Our optimized bootstrapping implementation represents a best-case scenario for compute acceleration of FHE. We show that despite these optimizations, bootstrapping (as well as other applications with similar computational structure) continue to remain bottlenecked by main memory bandwidth. We thus conclude that secure FHE implementations need to look beyond accelerated compute for further performance improvements and to that end, we propose new research directions to address the underlying memory bottleneck. In summary, our answer to the titular question is: yes, but only after addressing the memory bottleneck!

Category / Keywords: FHE, Bootstrapping, Hardware Acceleration

Date: received 14 Dec 2021, last revised 17 Dec 2021

Contact author: ldec at mit edu

Available format(s): PDF | BibTeX Citation

Version: 20211217:150622 (All versions of this report)

Short URL: ia.cr/2021/1636


[ Cryptology ePrint archive ]