## Cryptology ePrint Archive: Report 2011/538

Leakage-Resilient Client-side Deduplication of Encrypted Data in Cloud Storage

Jia Xu and Ee-Chien Chang and Jianying Zhou

Abstract: Cloud storage services have been gaining popularity in recent years. Client-side deduplication is widely adopted by cloud storage services such as Dropbox, MozyHome and Wuala to save bandwidth and storage. Security flaws in the current client-side deduplication mechanism, which may lead to private data leakage, were recently found by Harnik~\emph{et al.}~(S\&P Magazine, '10) and Halevi~\emph{et al.}~(CCS '11). Halevi~\emph{et al.} identified an important security issue in client-side deduplication that leads to leakage of users' private files to outside attackers, and addressed this issue by constructing schemes which they called \emph{proofs of ownership} (PoW). In a proof of ownership scheme, any owner of the same file $F$ can prove to the cloud storage that he/she owns file $F$ in a robust and efficient way, even if a certain amount of arbitrary information about file $F$ is leaked.

In this paper, we make two main contributions:

\begin{itemize}

\item

We construct an efficient hash function $\mathsf{H}_k: \{ 0,1 \}^{M} \rightarrow \{ 0,1 \}^{L}$ with complexity in $\mathcal{O}(M + L)$, which is provably pairwise-independent in the random oracle model. We apply the constructed hash function to obtain a proof of ownership scheme, which is provably secure w.r.t. \emph{any} distribution of input file with sufficient min-entropy, in the random oracle model. In contrast, the PoW scheme (the last and the most practical construction) in Halevi~\emph{et al.} is provably secure w.r.t. only \emph{a particular type} of distribution (they call it a generalization of ``block-fixing'' distribution) of input file with sufficient min-entropy, in the random oracle model.

\item

We propose the first (to the best of our knowledge) solution to support cross-user client-side deduplication over encrypted data in the \emph{leakage-resilient} model, where a certain amount of arbitrary information about users' files is leaked. In particular, we address another important security issue in client-side deduplication, namely the confidentiality of users' sensitive files against the honest-but-curious cloud storage server, by proposing a method to distribute a randomly chosen per-file encryption key to all owners of the same file in an efficient and secure way. This key distribution method is seamlessly incorporated into the process of client-side deduplication. We emphasize that ``convergent encryption'', which encrypts a file $F$ using hash value $h(F)$ as encryption key, is not leakage-resilient and is thus insecure in the setting of PoW. Therefore, the direct combination of a PoW scheme and convergent encryption is not a solution for client-side deduplication over encrypted data.

\end{itemize}
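
The remark about convergent encryption can be illustrated with a minimal sketch. It is not the scheme proposed in the paper. It assumes SHA-256 as the hash $h$ and substitutes a toy XOR keystream built from SHAKE-256 for a real cipher, and all function names are hypothetical. It shows that the key is fully determined by the file, so a short leak of $h(F)$ is enough to decrypt the stored ciphertext.

```python
import hashlib


def convergent_key(data: bytes) -> bytes:
    """Derive the key from the file itself: key = h(F). SHA-256 is an assumption."""
    return hashlib.sha256(data).digest()


def toy_stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (illustration only, NOT a vetted cipher).

    The keystream is SHAKE-256(key) truncated to len(data); applying the
    function twice with the same key returns the original data.
    """
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def convergent_encrypt(data: bytes) -> bytes:
    """Encrypt F under h(F); deterministic, so equal files give equal ciphertexts."""
    return toy_stream_xor(convergent_key(data), data)


# Two independent owners of the same file produce the same ciphertext,
# which is what makes deduplication over ciphertexts possible.
file_f = b"quarterly report, confidential"
ct_alice = convergent_encrypt(file_f)
ct_bob = convergent_encrypt(file_f)
same_ciphertext = ct_alice == ct_bob

# Leakage: an attacker who learns only the 32-byte value h(F)
# can decrypt the stored ciphertext without ever holding F.
leaked_hash = hashlib.sha256(file_f).digest()
recovered = toy_stream_xor(leaked_hash, ct_alice)
attacker_recovers_file = recovered == file_f
```

In this sketch the decryption key is only as secret as the 32-byte hash, which is the sense in which convergent encryption fails once a bounded amount of arbitrary information about $F$ may leak. A randomly chosen per-file key, as the abstract describes, is not derivable from the file in this way.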

Category / Keywords: Cloud Storage, Client-side Deduplication, Zero Knowledge Proofs of Ownership, Privacy, Pairwise Independent Hash

Date: received 1 Oct 2011, last revised 12 Sep 2012

Contact author: jiaxu2001 at gmail com

Available format(s): PDF | BibTeX Citation

Note: A major revision in presentation since 25 May 2012.

Short URL: ia.cr/2011/538
