Paper 2019/466
Privacy-Preserving K-means Clustering with Multiple Data Owners
Jung Hee Cheon, Jinhyuck Jeong, Dohyeong Ki, Jiseung Kim, Joohee Lee, and Seok Won Lee
Abstract
With recent advances in technology, large amounts of data are stored and mined on cloud servers. Since most of these data may contain private information, preserving privacy in data mining has become necessary. In this paper, we propose a protocol for collaboratively performing the K-means clustering algorithm on data distributed among multiple data owners while protecting the sensitive private data. Our scenario employs two service providers: a main service provider and a key manager. Under the assumption that the cryptosystems used in our protocol are secure and that the two service providers do not collude, we provide perfect secrecy in the sense that the cluster centroids and data are not leaked to any party, including the two service providers. We also implement the scenario using HEAAN, a recently proposed leveled homomorphic encryption scheme. With our construction, privacy-preserving K-means clustering can be done in less than one minute while maintaining 80-bit security for 10,000 data points, 8 features, and 4 clusters.
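For orientation, the computation that the protocol protects can be sketched in the clear. The following is a minimal plaintext sketch of standard K-means (Lloyd's algorithm), not the protocol from the paper: it involves no encryption, no service providers, and no HEAAN calls. The function names (`squared_distance`, `assign`, `update`, `kmeans`), the fixed iteration count, and the random initialization are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of the points assigned to it."""
    dim = len(points[0])
    k = len(centroids)
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, label in zip(points, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += p[d]
    # An empty cluster keeps its previous centroid (an illustrative choice).
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(k)
    ]


def kmeans(points, k, iterations=10, seed=0):
    """Plaintext K-means with a fixed number of iterations."""
    rng = random.Random(seed)
    # Initialize centroids as k distinct data points (an assumption).
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    # Final labels are computed against the final centroids.
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    # Synthetic data with the dimensions quoted in the abstract:
    # 10,000 points, 8 features, 4 clusters.
    rng = random.Random(1)
    data = [[rng.random() for _ in range(8)] for _ in range(10_000)]
    centroids, labels = kmeans(data, k=4)
    print(len(centroids), len(labels))
```

In the setting the abstract describes, the data and the centroids would stay hidden from every party, so the distance comparisons and means above would have to be evaluated on encrypted values; how the paper achieves this is not specified in the abstract and is not modeled here.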
Note: This work was conducted in the fall of 2017.
Metadata
- Available format(s)
- -- withdrawn --
- Category
- Applications
- Publication info
- Preprint. MINOR revision.
- Keywords
- K-means Clustering, Clustering, Machine Learning, Privacy-Preserving, Fully Homomorphic Encryption, HEAAN
- Contact author(s)
- wooki7098 @ snu ac kr
- History
- 2019-05-10: withdrawn
- 2019-05-10: received
- Short URL
- https://ia.cr/2019/466
- License
- CC BY