Cryptology ePrint Archive: Report 2019/359

SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search

Hao Chen and Ilaria Chillotti and Yihe Dong and Oxana Poburinnaya and Ilya Razenshteyn and M. Sadegh Riazi

Abstract: We present new secure protocols for approximate $k$-nearest neighbor search ($k$-NNS) over the Euclidean distance in the semi-honest model, which scale to massive datasets. One of the new ingredients is a circuit for the approximate top-$k$ selection from $n$ numbers that is built from $O(n + \mathrm{poly}(k))$ comparators. Using this circuit as a subroutine, we design new $k$-NNS algorithms and two corresponding secure protocols: 1) optimized linear scan; 2) clustering-based sublinear time algorithm.

The new secure protocols utilize a combination of additively-homomorphic encryption, garbled circuits and oblivious RAM. Along the way, we introduce various optimizations to these primitives, which drastically improve concrete efficiency.

We evaluate the new protocols empirically and show that they are able to handle datasets that are significantly larger than in the prior work. For instance, running on two standard Azure instances within the same availability zone, for a dataset of $96$-dimensional descriptors of $10\,000\,000$ images, we can find $10$ nearest neighbors with average accuracy $0.9$ in under $10$ seconds improving upon prior work by at least two orders of magnitude.

Category / Keywords: applications / nearest neighbor search, privacy-preserving data analysis, sublinear algorithms

Date: received 3 Apr 2019, last revised 3 Jul 2019

Contact author: ilya razenshteyn at gmail com

Available format(s): PDF | BibTeX Citation

Note: Significantly improved presentation.

Version: 20190704:004952 (All versions of this report)

Short URL:

[ Cryptology ePrint archive ]