You are looking at a specific version 20190512:152429 of this paper. See the latest version.

Paper 2019/465

Towards a Practical Clustering Analysis over Encrypted Data

Jung Hee Cheon and Duhyeong Kim and Jai Hyun Park

Abstract

Clustering analysis is one of the most significant unsupervised machine learning tasks, and it is utilized in various fields associated with privacy issue including bioinformatics, finance and image processing. In this paper, we propose a practical solution for privacy-preserving clustering analysis based on homomorphic encryption~(HE). Our work is the first HE solution for the mean-shift clustering algorithm. To reduce the super-linear complexity of the original mean-shift algorithm, we adopt a novel random sampling method called dust sampling which perfectly fits in HE and achieve the linear complexity. We also substitute non-polynomial kernels by a new polynomial kernel so that it can be efficiently computed in HE. The quality of clustering analysis with the new HE-friendly kernel is fairly fine in practice. The performance of our modified mean-shift clustering algorithm based on the approximate HE scheme HEAAN is quite remarkable in terms of speed and accuracy. It takes about $30$ minutes with $99\%$ accuracy over several public datasets with hundreds of data, but even for two hundred thousands of data it takes only $82$ minutes with SIMD operations in HEAAN. Our results outperform the previously best known result (SAC 2018) over $400$ times.

Metadata
Available format(s)
PDF
Category
Applications
Publication info
Preprint. MINOR revision.
Keywords
clusteringmean-shifthomomorphic encryptionprivacy
Contact author(s)
doodoo1204 @ snu ac kr
History
2019-10-20: last of 5 revisions
2019-05-10: received
See all versions
Short URL
https://ia.cr/2019/465
License
Creative Commons Attribution
CC BY
Note: In order to protect the privacy of readers, eprint.iacr.org does not use cookies or embedded third party content.