Paper 2023/1354

Privacy Preserving Feature Selection for Sparse Linear Regression

Adi Akavia, University of Haifa
Ben Galili, Technion – Israel Institute of Technology
Hayim Shaul, IBM Research - Haifa
Mor Weiss, Bar-Ilan University
Zohar Yakhini, Reichman University & Technion
Abstract

Privacy-Preserving Machine Learning (PPML) provides protocols for learning and statistical analysis of data that may be distributed amongst multiple data owners (e.g., hospitals that own proprietary healthcare data), while preserving data privacy. The PPML literature includes protocols for various learning methods, including ridge regression. Ridge regression controls the $L_2$ norm of the model, but does not aim to strictly reduce the number of non-zero coefficients, namely the $L_0$ norm of the model. Reducing the number of non-zero coefficients (a form of feature selection) is important for avoiding overfitting, and for reducing the cost of using learnt models in practice. In this work, we develop a first privacy-preserving protocol for sparse linear regression under $L_0$ constraints. The protocol addresses data contributed by several data owners (e.g., hospitals). Our protocol outsources the bulk of the computation to two non-colluding servers, using homomorphic encryption as a central tool. We provide a rigorous security proof for our protocol, where security is against semi-honest adversaries controlling any number of data owners and at most one server. We implemented our protocol, and evaluated performance with nearly a million samples and up to 40 features.
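To make the contrast between ridge regression and $L_0$-constrained regression concrete, here is a minimal plaintext sketch. It is not the protocol from the paper: nothing is encrypted, there are no servers, and the exhaustive subset search is a textbook stand-in chosen only because it is easy to read. The function names (`solve`, `ridge`, `best_subset`), the toy data, and the regularization values are all assumptions made for illustration.

```python
from itertools import combinations


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge(X, y, lam, cols):
    """Ridge fit restricted to the features in `cols`: (X^T X + lam I) w = X^T y."""
    k = len(cols)
    A = [[sum(row[cols[i]] * row[cols[j]] for row in X) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[cols[i]] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(A, b)


def sse(X, y, cols, w):
    """Sum of squared errors of the model that uses only the features in `cols`."""
    return sum((sum(wj * row[c] for wj, c in zip(w, cols)) - yi) ** 2
               for row, yi in zip(X, y))


def best_subset(X, y, s, lam=1e-9):
    """Try every support of size s and keep the one with the lowest error."""
    best = None
    for cols in combinations(range(len(X[0])), s):
        w = ridge(X, y, lam, list(cols))
        err = sse(X, y, cols, w)
        if best is None or err < best[0]:
            best = (err, cols, w)
    return best[1], best[2]


# Toy data (hypothetical): 4 features, but y depends only on features 0 and 2.
X = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 0],
     [1, 1, 0, 0], [1, 0, 1, 1], [0, 1, 1, 0]]
y = [2 * r[0] - 3 * r[2] for r in X]

w_ridge = ridge(X, y, lam=1.0, cols=[0, 1, 2, 3])  # shrinks weights, keeps all four
support, w_sparse = best_subset(X, y, s=2)         # keeps exactly two features

print("ridge weights:", w_ridge)
print("sparse support:", support, "weights:", w_sparse)
```

On this toy input the ridge fit assigns a non-zero weight to every feature, while the size-2 search recovers the two features that actually generate `y`. Exhaustive search is exponential in the number of features, so it only serves to illustrate the objective; it says nothing about how the paper's protocol selects features under encryption.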

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Published elsewhere. Major revision. Proceedings on Privacy Enhancing Technologies
Keywords
Privacy preserving machine learning, sparse linear regression, feature selection, fully homomorphic encryption
Contact author(s)
adi akavia @ gmail com
benga9 @ gmail com
hayim shaul @ gmail com
mor weiss @ biu ac il
zohar yakhini @ gmail com
History
2023-09-11: approved
2023-09-11: received
Short URL
https://ia.cr/2023/1354
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2023/1354,
      author = {Adi Akavia and Ben Galili and Hayim Shaul and Mor Weiss and Zohar Yakhini},
      title = {Privacy Preserving Feature Selection for Sparse Linear Regression},
      howpublished = {Cryptology {ePrint} Archive, Paper 2023/1354},
      year = {2023},
      url = {https://eprint.iacr.org/2023/1354}
}