Secure-computation techniques offer a way out of this dilemma, but the high cost of computing edit distance privately poses a great challenge. Wang et al.~proposed recently [ACM CCS '15] an efficient solution, for situations where the genome sequences are so close that edit distance between two genomes can be well approximated just by looking at the indexes in which they differ from the reference genome. However, this solution does not extend well to cases with high divergence among individual genomes, and different techniques are needed there.
In this work we put forward a new approach for highly efficient secure computation for computing an approximation of the edit-distance, that works well even in settings with much higher divergence. We present contributions on two fronts. First, the design of an approximation method that would yield an efficient private computation. Second, further optimizations of the two-party protocol. Our tests indicate that the approximation method works well even in regions of the genome where the distance between individuals is 5\% or more with many insertions and deletions (compared to 99.5\% similarly with mostly substitutions, as considered by Wang et al.). As for speed, our protocol implementation takes just a few seconds to run on databases with thousands of records, each of length thousands of alleles, and it scales almost linearly with both the database size and the length of the sequences in it. As an example, in the datasets of the recent iDASH competition, it takes less than two seconds to find the nearest five records to a query, in a size-500 dataset of length-3500 sequences. This is 2-3 orders of magnitude faster than using state-of-the-art secure protocols for exact computation.Category / Keywords: cryptographic protocols / genomic privacy Date: received 15 Feb 2017 Contact author: asharov at cornell edu Available format(s): PDF | BibTeX Citation Version: 20170220:150113 (All versions of this report) Short URL: ia.cr/2017/144 Discussion forum: Show discussion | Start new discussion