## Cryptology ePrint Archive: Report 2020/623

PSI-Stats: Private Set Intersection Protocols Supporting Secure Statistical Functions

Jason H. M. Ying and Shuwei Cao and Geong Sen Poh and Jia Xu and Hoon Wei Lim

Abstract: Private Set Intersection (PSI) enables two parties, each holding a private set, to securely compute their intersection without revealing any other information. This paper considers secure statistical computations over PSI, where both parties hold sets of identifiers and one party additionally holds a positive integer value associated with each identifier in her set. The main objective is to securely compute desired statistics of the associated values whose identifiers occur in the intersection of the two sets, without revealing the identifiers in the intersection. This has many useful business applications, for example in measuring the effectiveness of advertising campaigns. In many cases, the parties wish to learn various statistical information about the set intersection and the associated integer values, such as the arithmetic mean, geometric mean, harmonic mean, standard deviation, minimum, maximum, range, or an approximate distribution of the sum composition. A potential use case is a credit card company providing a shopping mall with the percentage of high spending based on their common customers. In this paper we therefore introduce various mechanisms to enable secure computation of statistical functions, which we collectively term PSI-Stats. The proposed protocols maintain a strong privacy guarantee: computations are performed without revealing the identifiers of the set intersection to either party. We also implement our constructions and evaluate them on a simulated dataset as well as on actual datasets from the business use cases that we define, in order to demonstrate the practicality of our solution.
To the best of our knowledge, our work is the first non-circuit-based approach that enables parties to learn more about the set intersection through secure computation of a wide variety of statistical functions, without requiring the machinery of fully homomorphic encryption.
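
To make the target output concrete, the following is a minimal plaintext sketch of the statistics listed in the abstract, computed over the values whose identifiers fall in the intersection. It is not one of the PSI-Stats protocols and provides no privacy: it only illustrates what a secure protocol would be expected to output. The function name `intersection_stats`, the input layout, and the use of the population standard deviation are assumptions made for illustration and are not taken from the paper.

```python
import math
from statistics import fmean, pstdev


def intersection_stats(ids_a, valued_b):
    """Plaintext reference for the intended output; NOT a secure protocol.

    ids_a:    set of identifiers held by one party (hypothetical layout).
    valued_b: dict mapping the other party's identifiers to positive integers.
    """
    # Collect the values whose identifiers occur in both sets.
    values = [v for k, v in valued_b.items() if k in ids_a]
    if not values:
        return None  # empty intersection: no statistics to report

    n = len(values)
    return {
        "count": n,
        "arithmetic_mean": fmean(values),
        # Geometric mean via logarithms; valid because all values are positive.
        "geometric_mean": math.exp(sum(math.log(v) for v in values) / n),
        "harmonic_mean": n / sum(1 / v for v in values),
        # Population standard deviation (an assumption; the abstract does not
        # say whether the population or sample form is intended).
        "std_dev": pstdev(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


# Example inputs (made up): only "bob" and "carol" are common to both parties.
stats = intersection_stats(
    {"alice", "bob", "carol"},
    {"bob": 2, "carol": 8, "dave": 100},
)
```

In a secure realization, neither party would see the list `values` or learn which identifiers matched; only the agreed statistics would be revealed.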

Category / Keywords: private set intersection, public-key cryptography, homomorphic encryption, statistical functions