Squint Hard Enough: Evaluating Perceptual Hashing with Machine Learning

You are looking at a specific version 20211122:112948 of this paper. See the latest version.

Paper 2021/1531

Squint Hard Enough: Evaluating Perceptual Hashing with Machine Learning

Jonathan Prokos and Tushar M. Jois and Neil Fendley and Roei Schuster and Matthew Green and Eran Tromer and Yinzhi Cao

Abstract

Many online communications systems use perceptual hash matching systems to detect illicit files in user content. These systems employ specialized perceptual hash functions such as Microsoft's PhotoDNA or Facebook's PDQ to produce a compact digest of an image file that can be approximately compared to a database of known illicit-content digests. Recently, several proposals have suggested that hash-based matching systems be incorporated into client-side and end-to-end encrypted (E2EE) systems: in these designs, files that register as illicit content will be reported to the provider, while the remaining content will be sent confidentially. By using perceptual hashing to determine confidentiality guarantees, this new setting significantly changes the function of existing perceptual hashing -- thus motivating the need to evaluate these functions from an adversarial perspective, using their perceptual capabilities against them. For example, an attacker may attempt to trigger a match on innocuous, but politically-charged, content in an attempt to stifle speech. In this work we develop threat models for perceptual hashing algorithms in an adversarial setting, and present attacks against the two most widely deployed algorithms: PhotoDNA and PDQ. Our results show that it is possible to efficiently generate targeted second-preimage attacks in which an attacker creates a variant of some source image that matches some target digest. As a complement to this main result, we also further investigate the production of images that facilitate detection avoidance attacks, continuing a recent investigation of Jain et al. Our work shows that existing perceptual hash functions are likely insufficiently robust to survive attacks on this new setting.

Metadata

Available format(s): PDF
Category: Applications
Publication info: Preprint. MINOR revision.
Keywords: perceptual hashing adversarial attacks
Contact author(s): jois @ cs jhu edu
jprokos4 @ gmail com
History: 2021-11-22: received
Short URL: https://ia.cr/2021/1531
License: CC BY