Paper 2025/532
Chunking Attacks on File Backup Services using Content-Defined Chunking
Abstract
Systems such as file backup services often use content-defined chunking (CDC) algorithms, especially those based on rolling hash techniques, to split files into chunks in a way that allows for data deduplication. These chunking algorithms often depend on per-user parameters in an attempt to avoid leaking information about the data being stored. We present attacks to extract these chunking parameters and discuss protocol-agnostic attacks and loss of security once the parameters are breached (including when these parameters are not set up at all, which is often available as an option). Our parameter-extraction attacks themselves are protocol-specific but their ideas are generalizable to many potential CDC schemes.
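For readers unfamiliar with CDC, the following is a minimal illustrative sketch of a rolling-hash chunker with a per-user parameter. It is not taken from the paper and does not model any particular backup service. The names `chunk`, `demo_data`, `WINDOW`, `MASK`, `BASE`, and `MOD`, the polynomial hash, and the way `secret` is XORed into the hash are all assumptions chosen for brevity.

```python
import hashlib

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16          # number of bytes covered by the rolling hash
MASK = (1 << 8) - 1  # cut when the low 8 bits are zero (roughly 1 in 256 positions)
BASE = 257           # polynomial base
MOD = 1 << 32        # hash is kept modulo 2^32


def chunk(data: bytes, secret: int = 0) -> list[bytes]:
    """Split data where the (hypothetically keyed) rolling hash matches MASK."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte about to leave the window
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        if i >= WINDOW:
            # Remove the oldest byte so h always covers the last WINDOW bytes.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        # Enforce a minimum chunk length of WINDOW bytes, then test the boundary rule.
        if i + 1 - start >= WINDOW and ((h ^ secret) & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def demo_data(n_blocks: int = 1000) -> bytes:
    """Deterministic pseudo-random test input built from SHA-256 digests."""
    return b"".join(
        hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(n_blocks)
    )
```

Because boundaries depend only on the bytes inside the window, calling `chunk(b"XYZ" + data)` should reproduce most of the chunks of `chunk(data)`, which is what makes deduplication possible. Passing a different `secret` moves the boundaries, which is the kind of per-user dependence that the abstract describes as a target for extraction.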
Note: Typo fixes.
Metadata
- Available format(s)
- PDF
- Category
- Attacks and cryptanalysis
- Publication info
- Preprint.
- Keywords
- Content-defined chunking, CDC, Rolling hash, Backup
- Contact author(s)
- cperciva @ tarsnap com
- History
- 2025-03-24: revised
- 2025-03-21: received
- Short URL
- https://ia.cr/2025/532
- License
- CC BY
BibTeX
@misc{cryptoeprint:2025/532,
      author = {Boris Alexeev and Colin Percival and Yan X Zhang},
      title = {Chunking Attacks on File Backup Services using Content-Defined Chunking},
      howpublished = {Cryptology {ePrint} Archive, Paper 2025/532},
      year = {2025},
      url = {https://eprint.iacr.org/2025/532}
}