Cryptology ePrint Archive: Report 2016/012

Cryptography for Big Data Security

Ariel Hamlin and Nabil Schear and Emily Shen and Mayank Varia and Sophia Yakoubov and Arkady Yerukhimovich

Abstract: As big data collection and analysis becomes prevalent in today’s computing environments there is a growing need for techniques to ensure security of the collected data. To make matters worse, due to its large volume and velocity, big data is commonly stored on distributed or shared computing resources not fully controlled by the data owner. Thus, tools are needed to ensure both the confidentiality of the stored data and the integrity of the analytics results even in untrusted environments. In this chapter, we present several cryptographic approaches for securing big data and discuss the appropriate use scenarios for each.

We begin with the problem of securing big data storage. We first address the problem of secure block storage for big data allowing data owners to store and retrieve their data from an untrusted server. We present techniques that allow a data owner to both control access to their data and ensure that none of their data is modified or lost while in storage. However, in most big data applications, it is not sufficient to simply store and retrieve one’s data and a search functionality is necessary to allow one to select only the relevant data. Thus, we present several techniques for searchable encryption allowing database- style queries over encrypted data. We review the performance, functionality, and security provided by each of these schemes and describe appropriate use-cases.

However, the volume of big data often makes it infeasible for an analyst to retrieve all relevant data. Instead, it is desirable to be able to perform analytics directly on the stored data without compromising the confidentiality of the data or the integrity of the computation results. We describe several recent cryptographic breakthroughs that make such processing possible for varying classes of analytics. We review the performance and security characteristics of each of these schemes and summarize how they can be used to protect big data analytics especially when deployed in a cloud setting.

We hope that the exposition in this chapter will raise awareness of the latest types of tools and protections available for securing big data. We believe better understanding and closer collaboration between the data science and cryptography communities will be critical to enabling the future of big data processing.

Category / Keywords: cryptographic protocols / Secure Block Storage, Access Control, Secure Search, Homomorphic Encryption, Searchable Encryption, Verifiable Computation, Multi-party Computation, Functional Encryption

Original Publication (with minor differences):

Date: received 6 Jan 2016

Contact author: ariel hamlin at ll mit edu

Available format(s): PDF | BibTeX Citation

Version: 20160106:210558 (All versions of this report)

Short URL:

Discussion forum: Show discussion | Start new discussion

[ Cryptology ePrint archive ]