Paper 2024/1088

HElix: Genome Similarity Detection in the Encrypted Domain

Rostin Shokri, University of Delaware
Charles Gouert, University of Delaware
Nektarios Georgios Tsoutsos, University of Delaware
Abstract

As the field of genomics continues to expand and more sequencing data is gathered, genome analysis becomes increasingly relevant for many users. For example, a common scenario entails users trying to determine whether their DNA samples are similar to DNA sequences hosted in a larger remote repository. Nevertheless, end users may be reluctant to upload their DNA sequences, while the owners of remote genomics repositories are unwilling to openly share their databases. To address this challenge, we propose two distinct approaches based on fully homomorphic encryption to preserve the privacy of the genomic data and enable queries directly on ciphertexts. The first is based on the ubiquitous MinHash algorithm and can determine if similar matches exist in the database, while the second involves a bespoke Bloom filter construction for determining exact matches. We validate both approaches across various database sizes using both GPU- and CPU-based cloud servers.
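
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal plaintext sketch of MinHash similarity estimation and a standard Bloom filter over DNA k-mers. It is not the paper's encrypted construction: nothing here is homomorphically encrypted, and the k-mer length, the seeded BLAKE2b hashing, the filter size, and all function and class names are assumptions chosen for illustration. Note also that a standard Bloom filter can report false positives (never false negatives); how the paper's bespoke construction treats this is not described in the abstract.

```python
import hashlib


def kmers(seq, k=4):
    """Return the set of overlapping k-mers of a DNA string (k is an assumed value)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def seeded_hash(item, seed):
    """Deterministic 64-bit hash of a string under a seed (hypothetical helper)."""
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash over a non-empty set."""
    return [min(seeded_hash(x, s) for x in items) for s in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


class BloomFilter:
    """Standard Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        return [seeded_hash(item, s) % self.num_bits for s in range(self.num_hashes)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = True

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))


if __name__ == "__main__":
    query = kmers("ACGTACGTTAGCCGATAGCTAGGCTA")
    stored = kmers("ACGTACGTTAGCCGATAGCTAGGCTT")
    score = estimate_jaccard(minhash_signature(query), minhash_signature(stored))
    print(f"estimated Jaccard similarity: {score:.2f}")

    bf = BloomFilter()
    for kmer in stored:
        bf.add(kmer)
    print("ACGT" in bf)  # True: every inserted k-mer is always found
```

In an encrypted setting, comparisons such as the signature equality test and the bit lookups would have to be evaluated on ciphertexts; the sketch above only shows the plaintext logic that such a protocol would emulate.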

Metadata
Available format(s)
PDF
Category
Applications
Publication info
Preprint.
Keywords
Homomorphic encryption, Private genome association, MinHash, Bloom filters
Contact author(s)
rostinsh @ udel edu
cgouert @ udel edu
tsoutsos @ udel edu
History
2024-07-05: approved
2024-07-04: received
Short URL
https://ia.cr/2024/1088
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2024/1088,
      author = {Rostin Shokri and Charles Gouert and Nektarios Georgios Tsoutsos},
      title = {{HElix}: Genome Similarity Detection in the Encrypted Domain},
      howpublished = {Cryptology ePrint Archive, Paper 2024/1088},
      year = {2024},
      note = {\url{https://eprint.iacr.org/2024/1088}},
      url = {https://eprint.iacr.org/2024/1088}
}