In this paper we describe a Two-Party k-Means Clustering Protocol that guarantees privacy, and is more efficient than utilizing a general multiparty ``compiler'' to achieve the same task. In particular, a main contribution of our result is a way to compute efficiently multiple iterations of k-means clustering without revealing the intermediate values. To achieve this, we use novel techniques to perform two-party division and sample uniformly at random from an unknown domain size.
Our techniques are quite general and can be realized based on the existence of any semantically secure homomorphic encryption scheme. For concreteness, we describe our protocol based on Paillier Homomorphic Encryption scheme (see [Pa]). We will also demonstrate that our protocol is efficient in terms of communication, remaining competitive with existing protocols (such as [JW]) that fail to protect privacy.
Category / Keywords: public-key cryptography / Multiparty Computation, k-means clustering Date: received 12 Jun 2007 Contact author: paulbunn at math ucla edu Available formats: Postscript (PS) | Compressed Postscript (PS.GZ) | PDF | BibTeX Citation Version: 20070619:195100 (All versions of this report) Discussion forum: Show discussion | Start new discussion