Secure Two-Party k-Means Clustering

Paul Bunn; Rafail Ostrovsky

Paper 2007/231

Secure Two-Party k-Means Clustering

Paul Bunn and Rafail Ostrovsky

Abstract

The k-Means Clustering problem is one of the most-explored problems in data mining to date. With the advent of protocols that have proven to be successful in performing single database clustering, the focus has changed in recent years to the question of how to extend the single database protocols to a multiple database setting. To date there have been numerous attempts to create specific multiparty k-means clustering protocols that protect the privacy of each database, but according to the standard cryptographic definitions of ``privacy-protection,'' so far all such attempts have fallen short of providing adequate privacy. In this paper we describe a Two-Party k-Means Clustering Protocol that guarantees privacy, and is more efficient than utilizing a general multiparty ``compiler'' to achieve the same task. In particular, a main contribution of our result is a way to compute efficiently multiple iterations of k-means clustering without revealing the intermediate values. To achieve this, we use novel techniques to perform two-party division and sample uniformly at random from an unknown domain size. Our techniques are quite general and can be realized based on the existence of any semantically secure homomorphic encryption scheme. For concreteness, we describe our protocol based on Paillier Homomorphic Encryption scheme (see [Pa]). We will also demonstrate that our protocol is efficient in terms of communication, remaining competitive with existing protocols (such as [JW]) that fail to protect privacy.

Metadata

Available format(s): PDF PS
Category: Public-key cryptography
Publication info: Published elsewhere. Unknown where it was published
Keywords: Multiparty Computation k-means clustering
Contact author(s): paulbunn @ math ucla edu
History: 2007-06-19: received
Short URL: https://ia.cr/2007/231
License: CC BY

BibTeX

@misc{cryptoeprint:2007/231,
      author = {Paul Bunn and Rafail Ostrovsky},
      title = {Secure Two-Party k-Means Clustering},
      howpublished = {Cryptology {ePrint} Archive, Paper 2007/231},
      year = {2007},
      url = {https://eprint.iacr.org/2007/231}
}