Paper 2017/148
Pattern Matching on Encrypted Streams
Nicolas Desmoulins, Pierre-Alain Fouque, Cristina Onete, and Olivier Sanders
Abstract
Pattern matching is essential in applications such as deep-packet inspection (DPI), searching on genomic data, or analyzing medical data. A simple task to do on plaintext data, pattern matching is much harder to do when the privacy of the data must be preserved. Existent solutions involve searchable encryption mechanisms with at least one of these three drawbacks: requiring an exhaustive (and static) list of keywords to be prepared before the data is encrypted (like in symmetric searchable encryption); requiring tokenization, i.e., breaking up the data to search into substrings and encrypting them separately (e.g., like BlindBox); relying on symmetric-key cryptography, thus implying a token-regeneration step for each encrypted-data source (e.g., user). Such approaches are ill-suited for pattern-matching with evolving patterns (e.g., updating virus signatures), variable searchword lengths, or when a single entity must filter ciphertexts from multiple parties. In this work, we introduce Searchable Encryption with Shiftable Trapdoors (SEST): a new primitive that allows for pattern matching with universal tokens (usable by all entities), in which keywords of arbitrary lengths can be matched to arbitrary ciphertexts. Our solution uses public-key encryption and bilinear pairings. It consists of projecting keywords on polynomials of degree equal to the length of the keyword and using a sliding-window-like technique to match the trapdoor to successive fragments of the encrypted data. In addition, very minor modifications to our solution enable it to take into account regular expressions, such as fully- or partly-unknown characters in a keyword (wildcards and interval/subset searches). Our trapdoor size is at most linear in the keyword length (and independent of the plaintext size), and we prove that the leakage to the searcher is only the trivial one: since the searcher learns whether the pattern occurs and where, it can distinguish based on different search results of a single trapdoor on two different plaintexts. To better show the usability of our scheme, we implemented it to run DPI on all the SNORT rules. We show that even for very large plaintexts, our encryption algorithm scales well. The pattern-matching algorithm is slightly slower, but extremely parallelizable, and it can thus be run even on very large data. Although our proofs use a (marginally) interactive assumption, we argue that this is a relatively small price to pay for the flexibility and privacy that we are able to attain.
Note: Compared to the first version, this new one describes a much more efficent algorithm to perform pattern matching on encrypted string. It also includes a complexity section which provides timings for different parameters.
Metadata
- Available format(s)
- Publication info
- A major revision of an IACR publication in ASIACRYPT 2018
- Keywords
- deep packet inspectionpattern matching
- Contact author(s)
- olivier sanders @ orange com
- History
- 2018-08-27: last of 3 revisions
- 2017-02-22: received
- See all versions
- Short URL
- https://ia.cr/2017/148
- License
-
CC BY
BibTeX
@misc{cryptoeprint:2017/148, author = {Nicolas Desmoulins and Pierre-Alain Fouque and Cristina Onete and Olivier Sanders}, title = {Pattern Matching on Encrypted Streams}, howpublished = {Cryptology {ePrint} Archive, Paper 2017/148}, year = {2017}, url = {https://eprint.iacr.org/2017/148} }