**Improved Reconstruction Attacks on Encrypted Data Using Range Query Leakage**

*Marie-Sarah Lacharité and Brice Minaud and Kenneth G. Paterson*

**Abstract: **We analyse the security of database encryption schemes supporting range queries against persistent adversaries. Security against such an adversary captures, among other things, the privacy of the client's data with respect to the server hosting the encrypted database. The bulk of our work applies to a generic setting, where the view of the adversary is limited to the set of records or documents matched by each query (known as access pattern leakage). We also consider a more specific setting where certain rank information is also leaked. The latter is inherent to multiple encryption schemes supporting range queries, such as Kerschbaum's FH-OPE scheme (CCS 2015) and the recently proposed Arx scheme of Poddar et al. (IACR eprint 2016/568, 2016/591). We provide three attacks.

- We first consider full reconstruction, which asks to recover the value of every record, fully negating encryption. We show that full reconstruction is possible within an expected number of queries $N \log N + O(N)$, where $N$ is the number of distinct plaintext values. This attack assumes that the dataset is dense, in the sense that every plaintext value occurs in some record; but it does not assume any a priori knowledge of the distribution of the values among records. This bound improves on an $O(N^2 \log N)$ bound in the same setting by Kellaris et al. (CCS 2016). We also provide efficient algorithms that succeed with the minimum possible number of queries (in a strong, information theoretical sense), prove a matching data lower bound for the number of queries required, and study in more detail the setting where rank information leakage is available in addition to the access pattern. - We show another efficient attack able to recover all plaintext values within a constant ratio of error (such as a 1% error), requiring only the access pattern leakage of $O(N)$ queries. More precisely, recovering all plaintext values within an additive margin of error $\epsilon N$ for any arbitrary $\epsilon$ requires an expected number of $\frac{5}{4} N\log(1/\epsilon) + O(N)$ queries. As before, this result comes with a matching lower bound. - Finally, we consider the common situation where the adversary has access to an auxiliary distribution for the targeted values. This enables us to convert rank leakage into approximate range information, leading to an accelerated attack. This attack does not require a dense dataset. Since it is not amenable to a rigorous analysis, we report the results of experiments using this third attack against age data from real-world medical data sets. We show that the attack is highly effective at reconstructing the association between values and records, even with imperfect auxiliary information. In our experiments, observing only 50 queries was sufficient to reconstruct 55% of records to within 5 years, and 35% of records to within 3 years.

In combination, our attacks suggest that the practical impact of the leakage suffered by all schemes supporting range queries is more severe than previously thought, particularly so for schemes like Arx and FH-OPE which also leak rank. Our attacks cast doubt on the practical viability of current approaches to enabling range queries when the threat model goes beyond snapshot attacks to include a persistent server-side adversary.

**Category / Keywords: **cryptographic protocols / Encrypted database, leakage

**Date: **received 14 Jul 2017, last revised 1 Aug 2017

**Contact author: **brice minaud at gmail com

**Available format(s): **PDF | BibTeX Citation

**Version: **20170801:220235 (All versions of this report)

**Short URL: **ia.cr/2017/701

**Discussion forum: **Show discussion | Start new discussion

[ Cryptology ePrint archive ]