Paper 2024/1091
MatcHEd: Privacy-Preserving Set Similarity based on MinHash
Abstract
Fully homomorphic encryption (FHE) enables arbitrary computation on encrypted data, but certain applications remain prohibitively expensive in the encrypted domain. As a case in point, comparing two encrypted sets of data is extremely computationally expensive due to the large number of comparison operators required. In this work, we propose a novel methodology for encrypted set similarity inspired by the MinHash algorithm and the CGGI FHE scheme. Doing comparisons in FHE requires comparators and multiplexers or an expensive approximation, which further increases the latency, especially when the goal is to compare two sets of data. The MinHash algorithm can significantly reduce the number of comparisons required by employing a special Carter-Wegman (CW) hash function as a key building block. However, the modulus operation in the CW hash becomes another key bottleneck because the encrypted sub-circuits required to perform the modular reduction are very large and inefficient in an FHE setting. Towards that end, we introduce an efficient bitwise FHE-friendly digest function (FFD) to employ as the cornerstone of our proposed encrypted set-similarity. In a Boolean FHE scheme like CGGI, the bitwise operations can be implemented efficiently with Boolean gates, which allows for faster evaluation times relative to standard Carter-Wegman constructions. Overall, our approach drastically reduces the number of comparisons required relative to the baseline approach of directly computing the Jaccard similarity coefficients, and is inherently parallelizable, allowing for efficient encrypted computation on multi-CPU and GPU-based cloud servers. We validate our approach by performing a privacy-preserving plagiarism detection across encrypted documents.
Metadata
- Available format(s)
- Category
- Applications
- Publication info
- Preprint.
- Keywords
- Secure computationFully homomorphic encryptionMinHashPrivacy-preserving set similarity
- Contact author(s)
-
rostinsh @ udel edu
cgouert @ udel edu
tsoutsos @ udel edu - History
- 2024-07-05: approved
- 2024-07-04: received
- See all versions
- Short URL
- https://ia.cr/2024/1091
- License
-
CC BY
BibTeX
@misc{cryptoeprint:2024/1091, author = {Rostin Shokri and Charles Gouert and Nektarios Georgios Tsoutsos}, title = {{MatcHEd}: Privacy-Preserving Set Similarity based on {MinHash}}, howpublished = {Cryptology {ePrint} Archive, Paper 2024/1091}, year = {2024}, url = {https://eprint.iacr.org/2024/1091} }