## Cryptology ePrint Archive: Report 2018/462

Logistic regression over encrypted data from fully homomorphic encryption

Hao Chen and Ran Gilad-Bachrach and Kyoohyung Han and Zhicong Huang and Amir Jalali and Kim Laine and Kristin Lauter

Abstract: One of the tasks in the $2017$ iDASH secure genome analysis competition was to enable training of logistic regression models over encrypted genomic data. More precisely, given a list of approximately $1500$ patient records, each with $18$ binary features containing information on specific mutations, the idea was for the data holder to encrypt the records using homomorphic encryption and send them to an untrusted cloud for storage. The cloud could then apply a training algorithm on the encrypted data to obtain an encrypted logistic regression model, which could then be sent to the data holder for decryption. In this way, the data holder could successfully outsource the training process without revealing either her sensitive data or the trained model to the cloud. Our solution to this problem has several novelties: we use a multi-bit plaintext space in fully homomorphic encryption together with fixed-point number encoding; we combine bootstrapping in fully homomorphic encryption with a scaling operation in fixed-point arithmetic; and we use a minimax polynomial approximation to the sigmoid function and the $1$-bit gradient descent method to reduce the plaintext growth in the training process. As a result, our training over encrypted data takes $0.4$ to $3.2$ hours per iteration of gradient descent.
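To make the last two ingredients concrete, the following is a minimal plaintext sketch in Python, not the authors' implementation. It involves no encryption at all and works on ordinary floats. Several choices are assumptions made for illustration only: the function names (`chebyshev_coeffs`, `poly_sigmoid`, `train_one_bit`), the degree $3$, the interval $[-8, 8]$, the learning rate, and the toy data. Chebyshev interpolation is used as a simple near-minimax stand-in; a true minimax polynomial would typically be computed with the Remez algorithm. "$1$-bit gradient descent" is interpreted here as updating each weight using only the sign of its gradient coordinate, which may differ in detail from the method in the report.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def chebyshev_coeffs(f, degree, bound):
    """Chebyshev interpolation coefficients of f on [-bound, bound].

    Assumption: used as a near-minimax stand-in, not a true minimax fit.
    """
    n = degree + 1
    nodes = [math.cos(math.pi * (j + 0.5) / n) for j in range(n)]
    values = [f(bound * t) for t in nodes]
    coeffs = []
    for k in range(n):
        s = sum(v * math.cos(k * math.pi * (j + 0.5) / n)
                for j, v in enumerate(values))
        coeffs.append(2.0 * s / n)
    coeffs[0] /= 2.0
    return coeffs

def poly_sigmoid(z, coeffs, bound):
    """Evaluate the polynomial approximation of the sigmoid at z."""
    # Clamping is a plaintext convenience; it is not an operation that
    # would be available on encrypted values.
    t = max(-1.0, min(1.0, z / bound))
    t_prev, t_cur = 1.0, t          # T_0(t), T_1(t)
    total = coeffs[0]
    for c in coeffs[1:]:
        total += c * t_cur
        t_prev, t_cur = t_cur, 2.0 * t * t_cur - t_prev
    return total

def sign(v):
    return (v > 0) - (v < 0)

def train_one_bit(X, y, coeffs, bound, lr=0.1, iters=30):
    """Gradient descent that keeps only the sign of each gradient coordinate."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            err = poly_sigmoid(z, coeffs, bound) - yi
            for j in range(d):
                grad[j] += err * xi[j]
        w = [wj - lr * sign(g) for wj, g in zip(w, grad)]
    return w

# Hypothetical toy data: a bias column plus one binary feature that
# determines the label exactly.
BOUND = 8.0
coeffs = chebyshev_coeffs(sigmoid, 3, BOUND)
X = [[1.0, 1.0]] * 10 + [[1.0, 0.0]] * 10
y = [1.0] * 10 + [0.0] * 10
w = train_one_bit(X, y, coeffs, BOUND)
preds = [1.0 if poly_sigmoid(sum(a * b for a, b in zip(w, xi)), coeffs, BOUND) > 0.5
         else 0.0 for xi in X]
```

Because each update moves every weight by at most `lr`, the magnitude of the weights after a given number of iterations is bounded in advance. This is one plausible reading of why a sign-based update helps limit plaintext growth, although the report itself should be consulted for the actual argument.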

Category / Keywords: applications / homomorphic encryption, logistic regression