Cryptology ePrint Archive: Report 2017/148

Pattern Matching on Encrypted Streams

Nicolas Desmoulins and Pierre-Alain Fouque and Cristina Onete and Olivier Sanders

Abstract: Pattern matching is essential in applications such as deep-packet inspection (DPI), searching on genomic data, or analyzing medical data. A simple task to do on plaintext data, pattern matching is much harder to do when the privacy of the data must be preserved. Existent solutions involve searchable encryption mechanisms with at least one of these three drawbacks: requiring an exhaustive (and static) list of keywords to be prepared before the data is encrypted (like in symmetric searchable encryption); requiring tokenization, i.e., breaking up the data to search into substrings and encrypting them separately (e.g., like BlindBox); relying on symmetric-key cryptography, thus implying a token-regeneration step for each encrypted-data source (e.g., user). Such approaches are ill-suited for pattern-matching with evolving patterns (e.g., updating virus signatures), variable searchword lengths, or when a single entity must filter ciphertexts from multiple parties.

In this work, we introduce Searchable Encryption with Shiftable Trapdoors (SEST): a new primitive that allows for pattern matching with universal tokens (usable by all entities), in which keywords of arbitrary lengths can be matched to arbitrary ciphertexts. Our solution uses public-key encryption and bilinear pairings. It consists of projecting keywords on polynomials of degree equal to the length of the keyword and using a sliding-window-like technique to match the trapdoor to successive fragments of the encrypted data. In addition, very minor modifications to our solution enable it to take into account regular expressions, such as fully- or partly-unknown characters in a keyword (wildcards and interval/subset searches). Our trapdoor size is at most linear in the keyword length (and independent of the plaintext size), and we prove that the leakage to the searcher is only the trivial one: since the searcher learns whether the pattern occurs and where, it can distinguish based on different search results of a single trapdoor on two different plaintexts.

To better show the usability of our scheme, we implemented it to run DPI on all the SNORT rules. We show that even for very large plaintexts, our encryption algorithm scales well. The pattern-matching algorithm is slightly slower, but extremely parallelizable, and it can thus be run even on very large data. Although our proofs use a (marginally) interactive assumption, we argue that this is a relatively small price to pay for the flexibility and privacy that we are able to attain.

Category / Keywords: deep packet inspection, pattern matching

Date: received 16 Feb 2017, last revised 16 Feb 2018

Contact author: olivier sanders at orange com

Available format(s): PDF | BibTeX Citation

Note: Compared to the first version, this new one describes a much more efficent algorithm to perform pattern matching on encrypted string. It also includes a complexity section which provides timings for different parameters.

Version: 20180216:094031 (All versions of this report)

Short URL:

Discussion forum: Show discussion | Start new discussion

[ Cryptology ePrint archive ]