## Cryptology ePrint Archive: Report 2022/294

A Plug-n-Play Framework for Scaling Private Set Intersection to Billion-sized Sets

Saikrishna Badrinarayanan and Ranjit Kumaresan and Mihai Christodorescu and Vinjith Nagaraja and Karan Patel and Srinivasan Raghuraman and Peter Rindal and Wei Sun and Minghua Xu

Abstract: Motivated by the recent advances in practical secure computation, we design and implement a framework for scaling solutions for the problem of private set intersection (PSI) into the realm of big data. A protocol for PSI enables two parties each holding a set of elements to jointly compute the intersection of these sets without revealing the elements that are not in the intersection. Following a long line of research, recent protocols for PSI only have $\approx 5\times$ computation and communication overhead over an insecure set intersection. However, this performance is typically demonstrated for set sizes in the order of ten million. In this work, we aim to scale these protocols to efficiently handle set sizes of one billion elements or more.

We achieve this via a careful application of a binning approach that enables parallelizing any arbitrary PSI protocol. Building on this idea, we designed and implemented a framework that takes a pair of PSI executables (i.e., for each of the two parties) that typically works for million-sized sets, and then scales it to billion-sized sets (and beyond). For example, our framework can perform a join of billion-sized sets in 83 minutes compared to 2000 minutes of Pinkas et al. (ACM TPS 2018), an improvement of $25\times$. Furthermore, we present an end-to-end Spark application where two enterprises, each possessing private databases, can perform a restricted class of database join operations (specifically, join operations with only an on clause which is a conjunction of equality checks involving attributes from both parties, followed by a where clause which can be split into conjunctive clauses where each conjunction is a function of a single table) without revealing any data that is not part of the output.

Category / Keywords: cryptographic protocols / Private Set Intersection, Join