Paper 2017/707

Privacy-Preserving Ridge Regression on Distributed Data

Irene Giacomelli, Somesh Jha, C. David Page, and Kyonghwan Yoon

Abstract

Linear regression is an important statistical tool that models the relationship between some explanatory values and an outcome value using a linear function. In many current applications (e.g., predictive modelling in personalized healthcare), these values represent sensitive data owned by several different parties that are unwilling to share them. In this setting, training a linear regression model becomes challenging and requires specific cryptographic solutions. In this work, we propose a new system that can train a linear regression model with 2-norm regularization (i.e., ridge regression) on a dataset obtained by merging a finite number of private datasets. Our system is composed of two phases. The first is based on a simple homomorphic encryption scheme and takes care of securely merging the private datasets. The second is a new ad-hoc two-party protocol that computes a ridge regression model by solving a linear system in which all coefficients are encrypted. The efficiency of our system is evaluated on both synthetically generated and real-world datasets.
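
To make the "merge, then solve a linear system" structure concrete, here is a minimal plaintext sketch. It is not the paper's protocol: it contains no encryption, and it assumes the private datasets are horizontally partitioned (each party holds different rows over the same features), which the abstract does not state. It relies on the standard closed form for ridge regression, (X^T X + lam*I) w = X^T y, and on the fact that X^T X and X^T y are sums of per-party terms. The function names `gram`, `merge`, `solve`, and `ridge` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def gram(X, y):
    """Local aggregates for one party: A = X^T X and b = X^T y."""
    d = len(X[0])
    A = [[sum(Fraction(row[i]) * row[j] for row in X) for j in range(d)]
         for i in range(d)]
    b = [sum(Fraction(row[i]) * yi for row, yi in zip(X, y)) for i in range(d)]
    return A, b

def merge(parts):
    """Sum the per-party aggregates (in plaintext here; the paper's first
    phase performs its merging step under homomorphic encryption)."""
    d = len(parts[0][1])
    A = [[sum(p[0][i][j] for p in parts) for j in range(d)] for i in range(d)]
    b = [sum(p[1][i] for p in parts) for i in range(d)]
    return A, b

def solve(A, b):
    """Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        # Find a row with a non-zero entry in column c and swap it into place.
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]

def ridge(A, b, lam):
    """Solve (A + lam*I) w = b for the ridge regression weights w."""
    d = len(b)
    A_reg = [[A[i][j] + (lam if i == j else 0) for j in range(d)]
             for i in range(d)]
    return solve(A_reg, b)

# Two parties holding different rows of the same two-feature dataset.
X1, y1 = [[1, 0], [0, 1]], [1, 2]
X2, y2 = [[1, 1]], [3]

A, b = merge([gram(X1, y1), gram(X2, y2)])
w = ridge(A, b, lam=1)
```

In this toy example, merging the per-party aggregates gives the same A and b as computing them on the concatenated dataset, so the parties never need to exchange raw rows. In the system described above, the corresponding quantities would remain encrypted throughout.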

Metadata
Available format(s)
PDF
Category
Applications
Publication info
Preprint. MINOR revision.
Keywords
linear regression, distributed data, privacy-preserving system, multiparty computation
Contact author(s)
irene giacomelli29 @ gmail com
History
2017-07-25: received
Short URL
https://ia.cr/2017/707
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2017/707,
      author = {Irene Giacomelli and Somesh Jha and C.  David Page and Kyonghwan Yoon},
      title = {Privacy-Preserving Ridge Regression on Distributed Data},
      howpublished = {Cryptology ePrint Archive, Paper 2017/707},
      year = {2017},
      note = {\url{https://eprint.iacr.org/2017/707}},
      url = {https://eprint.iacr.org/2017/707}
}