**Tight Tradeoffs in Searchable Symmetric Encryption**

*Gilad Asharov and Gil Segev and Ido Shahaf*

**Abstract: **A searchable symmetric encryption (SSE) scheme enables a client to store data on an untrusted server while supporting keyword searches in a secure manner. Recent experiments have indicated that the practical relevance of such schemes heavily relies on the tradeoff between their space overhead, locality (the number of non-contiguous memory locations that the server accesses with each query), and read efficiency (the ratio between the number of bits the server reads with each query and the actual size of the answer). These experiments motivated Cash and Tessaro (EUROCRYPT '14) and Asharov et al. (STOC '16) to construct SSE schemes offering various such tradeoffs, and to prove lower bounds for natural SSE frameworks. Unfortunately, the best-possible tradeoff has not been identified, and there are substantial gaps between the existing schemes and lower bounds, indicating that a better understanding of SSE is needed.

We establish tight bounds on the tradeoff between the space overhead, locality and read efficiency of SSE schemes within two general frameworks that capture the memory access pattern underlying all existing schemes. First, we introduce the ``pad-and-split'' framework, refining that of Cash and Tessaro while still capturing the same existing schemes. Within our framework we significantly strengthen their lower bound, proving that any scheme with locality $L$ must use space $\Omega ( N \log N / \log L )$ for databases of size $N$. This is a tight lower bound, matching the tradeoff provided by the scheme of Demertzis and Papamanthou (SIGMOD '17) which is captured by our pad-and-split framework.

Then, within the ``statistical-independence'' framework of Asharov et al. we show that their lower bound is essentially tight: We construct a scheme whose tradeoff matches their lower bound within an additive $O(\log \log \log N)$ factor in its read efficiency, once again improving upon the existing schemes. Our scheme offers optimal space and locality, and nearly-optimal read efficiency that depends on the frequency of the queried keywords: For a keyword that is associated with $n = N^{1 - \epsilon(n)}$ document identifiers, the read efficiency is $\omega(1) \cdot \epsilon(n)^{-1}+ O(\log\log\log N)$ when retrieving its identifiers (where the $\omega(1)$ term may be arbitrarily small, and $\omega(1) \cdot \epsilon(n)^{-1}$ is the lower bound proved by Asharov et al.). In particular, for any keyword that is associated with at most $N^{1 - 1/o(\log \log \log N)}$ document identifiers (i.e., for any keyword that is not exceptionally common), we provide read efficiency $O(\log \log \log N)$ when retrieving its identifiers.

**Category / Keywords: **

**Original Publication**** (with major differences): **CRYPTO 2018

**Date: **received 24 May 2018, last revised 18 Sep 2020

**Contact author: **ido shahaf at cs huji ac il

**Available format(s): **PDF | BibTeX Citation

**Version: **20200918:092959 (All versions of this report)

**Short URL: **ia.cr/2018/507

[ Cryptology ePrint archive ]