### Privacy-preserving Density-based Clustering

Beyza Bozdemir, Sébastien Canard, Orhan Ermis, Helen Möllering, Melek Önen, and Thomas Schneider

##### Abstract

Clustering is an unsupervised machine learning technique that outputs clusters containing similar data items. In this work, we investigate privacy-preserving density-based clustering, which is used, for example, in financial analytics and medical diagnosis. When (multiple) data owners collaborate or outsource the computation, privacy concerns arise. To address this problem, we design, implement, and evaluate the first practical and fully private density-based clustering scheme based on secure two-party computation. Our protocol privately executes the DBSCAN algorithm without disclosing any information (including the number and size of clusters). It can be used for private clustering between two parties as well as for private outsourcing from an arbitrary number of data owners to two non-colluding servers. Our implementation of the DBSCAN algorithm privately clusters data sets with 400 elements in 7 minutes on commodity hardware. It flexibly determines the number of required clusters and is insensitive to outliers, while being only a factor of 19 slower than today's fastest private K-means protocol (Mohassel et al., PETS'20), which can only be used for specific data sets. We then show how to transfer our newly designed protocol to related clustering algorithms by introducing a private approximation of the TRACLUS algorithm for trajectory clustering, which has interesting real-world applications such as financial time series forecasts and the investigation of the spread of a disease like COVID-19.
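For readers unfamiliar with DBSCAN, the following is a minimal plaintext sketch of the standard algorithm that the abstract refers to. It is not the private protocol from the paper and makes no claim about how the authors implement it. The function name `dbscan` and the parameter names `eps` and `min_pts` are illustrative choices, and the brute-force neighbor search is an assumption made for brevity. Note that the branches and loop lengths below depend on the data, which is the kind of information (cluster count, cluster sizes, noise points) that the abstract says a fully private protocol must not disclose.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1
UNVISITED = None


def dbscan(points, eps, min_pts):
    """Plaintext DBSCAN sketch: returns one label per point (-1 means noise)."""
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force search: indices within eps of point i (including i).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            # Not a core point; may later be relabelled as a border point.
            labels[i] = NOISE
            continue
        # Point i is a core point: start a new cluster and expand it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is also a core point, keep expanding
    return labels


# Two dense groups and one isolated point (made-up data for illustration).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

In this example the number of clusters is never specified up front, and the isolated point is labelled as noise rather than forced into a cluster, which illustrates the two properties the abstract contrasts with K-means.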

Category: Applications
Publication info: Published elsewhere. ACM ASIACCS 2021
DOI: 10.1145/3433210.3453104
Keywords: Private Machine Learning, Clustering, Secure Computation
Contact author(s): moellering @ encrypto cs tu-darmstadt de
Short URL: https://ia.cr/2021/612

CC BY

BibTeX

@misc{cryptoeprint:2021/612,
author = {Beyza Bozdemir and Sébastien Canard and Orhan Ermis and Helen Möllering and Melek Önen and Thomas Schneider},
title = {Privacy-preserving Density-based Clustering},
howpublished = {Cryptology ePrint Archive, Paper 2021/612},
year = {2021},
doi = {10.1145/3433210.3453104},
note = {\url{https://eprint.iacr.org/2021/612}},
url = {https://eprint.iacr.org/2021/612}
}
