Paper 2014/631
Zipf’s Law in Passwords
Ding Wang, Gaopeng Jian, Xinyi Huang, and Ping Wang
Abstract
Despite more than thirty years of research efforts, textual passwords are still enveloped in mysterious veils. In this work, we make a substantial step forward in understanding the distributions of passwords and measuring the strength of password datasets by using a statistical approach. We first show that Zipf's law perfectly exists in real-life passwords by conducting linear regressions on a corpus of 56 million passwords. As one specific application of this observation, we propose the number of unique passwords used in regression and the slope of the regression line together as a metric for assessing the strength of password datasets, and prove it in a mathematically rigorous manner. Furthermore, extensive experiments (including optimal attacks, simulated optimal attacks and state-of-the-art cracking sessions) are performed to demonstrate the practical effectiveness of our metric. To the best of our knowledge, our new metric is the first one that is both easy to approximate and accurate to facilitate comparisons, providing a useful tool for system administrators to gain a precise grasp of the strength of their password datasets and to adjust their password policies more reasonably.
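As a rough illustration of the regression the abstract describes, the sketch below fits a straight line to log10(frequency) against log10(rank) and returns the two quantities the abstract names as the metric: the number of unique passwords used in the regression and the slope of the fitted line. The function name `fit_zipf`, the `min_freq` cut-off, and the use of plain ordinary least squares are assumptions made for this example; the paper's exact preprocessing is not given on this page.

```python
import math
from collections import Counter


def fit_zipf(passwords, min_freq=1):
    """Fit log10(frequency) = intercept + slope * log10(rank) by least squares.

    `min_freq` is a hypothetical cut-off for excluding rare passwords from the
    regression; it is an assumption of this sketch, not a value from the paper.
    Returns (n_unique_used, slope, intercept).
    """
    counts = Counter(passwords)
    # Rank 1 is the most frequent password, rank 2 the next, and so on.
    freqs = sorted((c for c in counts.values() if c >= min_freq), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct passwords to fit a line")

    xs = [math.log10(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log10(freq) for freq in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Ordinary least squares for a single predictor.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return n, slope, intercept


if __name__ == "__main__":
    # Toy data whose frequencies are exactly 12 / rank (12, 6, 4, 3).
    toy = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
    print(fit_zipf(toy))
```

Under a Zipf model of the form f_r = C / r^s, the fitted slope estimates -s and the intercept estimates log10(C), so a dataset whose frequencies are exactly proportional to 1/rank yields a slope of -1.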
Metadata
- Category
- Foundations
- Publication info
- Published elsewhere. Major revision. IEEE Transactions on Information Forensics and Security
- DOI
- 10.1109/TIFS.2017.2721359
- Keywords
- User authentication, Password cracking, Statistics
- Contact author(s)
- pwang @ pku edu cn
- History
- 2018-01-03: last of 22 revisions
- 2014-08-21: received
- Short URL
- https://ia.cr/2014/631
- License
- CC BY
BibTeX
@misc{cryptoeprint:2014/631,
  author = {Ding Wang and Gaopeng Jian and Xinyi Huang and Ping Wang},
  title = {Zipf’s Law in Passwords},
  howpublished = {Cryptology {ePrint} Archive, Paper 2014/631},
  year = {2014},
  doi = {10.1109/TIFS.2017.2721359},
  url = {https://eprint.iacr.org/2014/631}
}