Paper 2016/012

Cryptography for Big Data Security

Ariel Hamlin, Nabil Schear, Emily Shen, Mayank Varia, Sophia Yakoubov, and Arkady Yerukhimovich


As big data collection and analysis becomes prevalent in today’s computing environments there is a growing need for techniques to ensure security of the collected data. To make matters worse, due to its large volume and velocity, big data is commonly stored on distributed or shared computing resources not fully controlled by the data owner. Thus, tools are needed to ensure both the confidentiality of the stored data and the integrity of the analytics results even in untrusted environments. In this chapter, we present several cryptographic approaches for securing big data and discuss the appropriate use scenarios for each. We begin with the problem of securing big data storage. We first address the problem of secure block storage for big data allowing data owners to store and retrieve their data from an untrusted server. We present techniques that allow a data owner to both control access to their data and ensure that none of their data is modified or lost while in storage. However, in most big data applications, it is not sufficient to simply store and retrieve one’s data and a search functionality is necessary to allow one to select only the relevant data. Thus, we present several techniques for searchable encryption allowing database- style queries over encrypted data. We review the performance, functionality, and security provided by each of these schemes and describe appropriate use-cases. However, the volume of big data often makes it infeasible for an analyst to retrieve all relevant data. Instead, it is desirable to be able to perform analytics directly on the stored data without compromising the confidentiality of the data or the integrity of the computation results. We describe several recent cryptographic breakthroughs that make such processing possible for varying classes of analytics. We review the performance and security characteristics of each of these schemes and summarize how they can be used to protect big data analytics especially when deployed in a cloud setting. We hope that the exposition in this chapter will raise awareness of the latest types of tools and protections available for securing big data. We believe better understanding and closer collaboration between the data science and cryptography communities will be critical to enabling the future of big data processing.

Available format(s)
Cryptographic protocols
Publication info
Published elsewhere. Minor revision.
Secure Block StorageAccess ControlSecure SearchHomomorphic EncryptionSearchable EncryptionVerifiable ComputationMulti-party ComputationFunctional Encryption
Contact author(s)
ariel hamlin @ ll mit edu
2016-01-06: received
Short URL
Creative Commons Attribution


      author = {Ariel Hamlin and Nabil Schear and Emily Shen and Mayank Varia and Sophia Yakoubov and Arkady Yerukhimovich},
      title = {Cryptography for Big Data Security},
      howpublished = {Cryptology ePrint Archive, Paper 2016/012},
      year = {2016},
      note = {\url{}},
      url = {}
Note: In order to protect the privacy of readers, does not use cookies or embedded third party content.