Cryptology ePrint Archive: Report 2019/1113

Towards a Homomorphic Machine Learning Big Data Pipeline for the Financial Services Sector

Oliver Masters and Hamish Hunt and Enrico Steffinlongo and Jack Crawford and Flavio Bergamaschi

Abstract: Machine Learning (ML) is today commonly employed in the Financial Services Sector (FSS) to build models that predict a variety of conditions, ranging from financial transaction fraud to investment outcomes, and to drive targeted upselling and cross-selling marketing campaigns. The common ML technique used for this modeling is supervised learning with regression algorithms, which usually involves large amounts of data that need to be shared and prepared before the actual learning phase. Compliance with recent privacy laws and confidentiality regulations requires that most, if not all, of the data and the computation be kept in a secure environment, usually in-house, and not outsourced to cloud or multi-tenant shared environments. Our work focuses on how to apply advanced cryptographic schemes such as Homomorphic Encryption (HE) to protect the privacy and confidentiality of both the data during the training of ML models and the models themselves; as a consequence, the prediction task can also be protected. We deconstructed a typical ML pipeline and applied HE to two of the important ML tasks, namely the variable selection phase of supervised learning and the prediction task. Quality metrics and performance results demonstrate that HE technology has reached the inflection point to be useful in a financial business setting for a full ML pipeline.

Category / Keywords: applications / homomorphic encryption; variable reduction; variable selection; feature selection; prediction

Original Publication (with minor differences): RWC 2020

Date: received 29 Sep 2019, last revised 11 Nov 2019

Contact author: flavio at uk ibm com, hamishun@uk ibm com, oliver masters@ibm com, enrico steffinlongo@ibm com, jack crawford@ibm com

Version: 20191111:170018
