Cryptology ePrint Archive: Report 2019/466

Privacy-Preserving K-means Clustering with Multiple Data Owners

Jung Hee Cheon and Jinhyuck Jeong and Dohyeong Ki and Jiseung Kim and Joohee Lee and Seok Won Lee

Abstract: As large volumes of data are now stored and mined on cloud servers, and much of that data contains potentially private information, preserving privacy in data mining has become necessary. In this paper, we propose a protocol for collaboratively performing the K-means clustering algorithm on data distributed among multiple data owners while protecting their sensitive private data. Our scenario employs two service providers: a main service provider and a key manager. Under the assumptions that the cryptosystems used in our protocol are secure and that the two service providers do not collude, we provide perfect secrecy in the sense that the cluster centroids and the data are not leaked to any party, including the two service providers. We also implement the scenario using a recently proposed leveled homomorphic encryption scheme called HEAAN. With our construction, privacy-preserving K-means clustering completes in less than one minute at 80-bit security for 10,000 data points, 8 features, and 4 clusters.
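
For orientation, the computation that the protocol protects can be sketched in the clear. The block below is a minimal plaintext version of the standard Lloyd iteration for K-means, run on synthetic data with the sizes quoted in the abstract (10,000 points, 8 features, 4 clusters). It is not the authors' protocol: there is no encryption, no HEAAN, and no split between a main service provider and a key manager. The function name `kmeans`, its parameters, the iteration count, and the random test data are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iters=10, seed=0):
    """Plain (unencrypted) Lloyd's K-means; returns (centroids, labels).

    Illustrative only: the report's protocol performs the analogous steps
    on encrypted data, which this sketch does not attempt to model.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points (an assumption;
    # the abstract does not say how centroids are initialised).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid
        # under squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic data with the sizes quoted in the abstract.
    data = [[rng.random() for _ in range(8)] for _ in range(10_000)]
    centroids, labels = kmeans(data, k=4)
    print(len(centroids), len(centroids[0]), len(labels))
```

In a privacy-preserving setting, the assignment step is typically the hard part, since choosing the nearest centroid requires comparisons on values that no single party is allowed to see in the clear.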

Category / Keywords: applications / K-means Clustering, Clustering, Machine Learning, Privacy-Preserving, Fully Homomorphic Encryption, HEAAN

Date: received 7 May 2019, withdrawn 10 May 2019

Contact author: wooki7098 at snu ac kr

Available format(s): (-- withdrawn --)

Note: This work was conducted in the fall of 2017.

Version: 20190510:131415

Short URL: ia.cr/2019/466

