Paper 2019/465

Towards a Practical Cluster Analysis over Encrypted Data

Jung Hee Cheon, Duhyeong Kim, and Jai Hyun Park


Cluster analysis is one of the most significant unsupervised machine learning tasks, and it is utilized in various fields associated with privacy issues including bioinformatics, finance and image processing. In this paper, we propose a practical solution for privacy-preserving cluster analysis based on homomorphic encryption~(HE). Our work is the first HE solution for the mean-shift clustering algorithm. To reduce the super-linear complexity of the original mean-shift algorithm, we adopt a novel random sampling method called dust sampling which perfectly fits in HE and achieves the linear complexity. We also substitute non-polynomial kernels by a new polynomial kernel so that it can be efficiently computed in HE. The HE implementation of our modified mean-shift clustering algorithm based on the approximate HE scheme HEAAN shows prominent performance in terms of speed and accuracy. It takes about $30$ minutes with $99\%$ accuracy over several public datasets with hundreds of data, and even for the dataset with $262,144$ data it takes only $82$ minutes applying SIMD operations in HEAAN. Our results outperform the previously best known result (SAC 2018) over $400$ times.

Available format(s)
Publication info
Published elsewhere. Minor revision. Selected Areas in Cryptography (SAC) 2019
clusteringmean-shifthomomorphic encryptionprivacy
Contact author(s)
doodoo1204 @ snu ac kr
jhyunp @ snu ac kr
2019-10-20: last of 5 revisions
2019-05-10: received
See all versions
Short URL
Creative Commons Attribution


      author = {Jung Hee Cheon and Duhyeong Kim and Jai Hyun Park},
      title = {Towards a Practical Cluster Analysis over Encrypted Data},
      howpublished = {Cryptology ePrint Archive, Paper 2019/465},
      year = {2019},
      note = {\url{}},
      url = {}
Note: In order to protect the privacy of readers, does not use cookies or embedded third party content.