Hash-Prune-Invert: Improved Differentially Private Heavy-Hitter Detection in the Two-Server Model

Borja Balle; James Bell; Albert Cheu; Adria Gascon; Jonathan Katz; Mariana Raykova; Phillipp Schoppmann; Thomas Steinke

Paper 2024/2024

Hash-Prune-Invert: Improved Differentially Private Heavy-Hitter Detection in the Two-Server Model

Borja Balle, Google DeepMind

James Bell, Google Research

Albert Cheu, Google Research

Adria Gascon, Google Research

Jonathan Katz, Google (United States)

Mariana Raykova, Google (United States)

Phillipp Schoppmann, Google (United States)

Thomas Steinke, Google DeepMind

Abstract

Differentially private (DP) heavy-hitter detection is an important primitive for data analysis. Given a threshold $t$ and a dataset of $n$ items from a domain of size $d$, such detection algorithms ignore items occurring fewer than $t$ times while identifying items occurring more than $t+\Delta$ times; we call $\Delta$ the error margin. In the central model where a curator holds the entire dataset, $(\varepsilon,\delta)$-DP algorithms can achieve error margin $\Theta(\frac 1 \varepsilon \log \frac 1 \delta)$, which is optimal when $d \gg 1/\delta$. Several works, e.g., Poplar (S&P 2021), have proposed protocols in which two or more non-colluding servers jointly compute the heavy hitters from inputs held by $n$ clients. Unfortunately, existing protocols suffer from an undesirable dependence on $\log d$ in terms of both server efficiency (computation, communication, and round complexity) and accuracy (i.e., error margin), making them unsuitable for large domains (e.g., when items are kB-long strings, $\log d \approx 10^4$). We present hash-prune-invert (HPI), a technique for compiling any heavy-hitter protocol with the $\log d$ dependencies mentioned above into a new protocol with improvements across the board: computation, communication, and round complexity depend (roughly) on $\log n$ rather than $\log d$, and the error margin is independent of $d$. Our transformation preserves privacy against an active adversary corrupting at most one of the servers and any number of clients. We apply HPI to an improved version of Poplar, also introduced in this work, that improves Poplar's error margin by roughly a factor of $\sqrt{n}$ (regardless of $d$). Our experiments confirm that the resulting protocol improves efficiency and accuracy for large $d$.

Metadata

Available format(s): PDF
Category: Cryptographic protocols
Publication info: Preprint.
Keywords: Differential Privacy Heavy hitters Private heavy hitters Distributed Point Functions
Contact author(s): bballe @ google com
jhbell @ google com
cheu @ google com
adriag @ google com
jkcrypto @ google com
marianar @ google com
schoppmann @ google com
steinke @ google com
History: 2024-12-13: approved; 2024-12-13: received; See all versions
Short URL: https://ia.cr/2024/2024
License: CC BY

BibTeX

@misc{cryptoeprint:2024/2024,
      author = {Borja Balle and James Bell and Albert Cheu and Adria Gascon and Jonathan Katz and Mariana Raykova and Phillipp Schoppmann and Thomas Steinke},
      title = {Hash-Prune-Invert: Improved Differentially Private Heavy-Hitter Detection in the Two-Server Model},
      howpublished = {Cryptology {ePrint} Archive, Paper 2024/2024},
      year = {2024},
      url = {https://eprint.iacr.org/2024/2024}
}