In this paper, we study basic counting problems in the shuffled model and establish separations between the error that can be achieved in the single-message shuffled model and in the shuffled model with multiple messages per user.
For the problem of frequency estimation for $n$ users and a domain of size $B$, we obtain:
- A nearly tight lower bound of $\tilde{\Omega}( \min(\sqrt[4]{n}, \sqrt{B}))$ on the error in the single-message shuffled model. This implies that the protocols obtained from the amplification via shuffling work of Erlingsson et al.~(SODA 2019) and Balle et al.~(Crypto 2019) are essentially optimal for single-message protocols. A key ingredient in the proof is a lower bound on the error of locally-private frequency estimation in the low-privacy (aka high $\epsilon$) regime. For this we develop new techniques to extend the results of Duchi et al.~(FOCS 2013; JASA 2018) and Bassily \& Smith~(STOC 2015), whose techniques were restricted to the high-privacy case.
- Protocols in the multi-message shuffled model with $\poly(\log{B}, \log{n})$ bits of communication per user and $\poly\log{B}$ error, which provide an exponential improvement on the error compared to what is possible with single-message algorithms. This implies protocols with similar error and communication guarantees for several well-studied problems such as heavy hitters, $d$-dimensional range counting, M-estimation of the median and quantiles, and more generally sparse non-adaptive statistical query algorithms.
For the related selection problem on a domain of size $B$, we prove: - A nearly tight lower bound of $\Omega(B)$ on the number of users in the single-message shuffled model. This significantly improves on the $\Omega(B^{1/17})$ lower bound obtained by Cheu et al.~(Eurocrypt 2019), and when combined with their $\tilde{O}(\sqrt{B})$-error multi-message protocol, implies the first separation between single-message and multi-message protocols for this problem.
Category / Keywords: foundations / Differential Privacy, Shuffled Model, Anonymous Channel, Frequency Estimation, Range Counting Date: received 1 Dec 2019, last revised 12 Jun 2020 Contact author: badihghazi at gmail com, badihghazi at google com, nzg at mit edu, ravi k53 at gmail com, pagh at itu dk, ameyav at google com Available format(s): PDF | BibTeX Citation Version: 20200613:000610 (All versions of this report) Short URL: ia.cr/2019/1382