Paper 2014/631

Zipf’s Law in Passwords

Ding Wang, Gaopeng Jian, Xinyi Huang, and Ping Wang

Abstract

Despite more than thirty years of research, textual passwords remain poorly understood. In this work, we take a substantial step forward in understanding the distribution of passwords and in measuring the strength of password datasets by using a statistical approach. We first show that Zipf's law holds in real-life passwords by conducting linear regressions on a corpus of 56 million passwords. As one specific application of this observation, we propose the number of unique passwords used in the regression and the slope of the regression line together as a metric for assessing the strength of password datasets, and prove it in a mathematically rigorous manner. Furthermore, extensive experiments (including optimal attacks, simulated optimal attacks and state-of-the-art cracking sessions) are performed to demonstrate the practical effectiveness of our metric. To the best of our knowledge, our new metric is the first one that is both easy to approximate and accurate enough to facilitate comparisons, providing a useful tool for system administrators to gain a precise grasp of the strength of their password datasets and to adjust their password policies more reasonably.
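The regression mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual Zipf form, f_r = C / r^s, fitted by ordinary least squares on log10(frequency) against log10(rank). The function name `fit_zipf` and the `min_freq` cutoff are hypothetical, introduced here for illustration only; the paper's actual procedure for selecting which passwords enter the regression may differ.

```python
import math
from collections import Counter


def fit_zipf(passwords, min_freq=1):
    """Fit log10(freq) = log10(C) - s * log10(rank) by least squares.

    Returns (n, s, C): the number of unique passwords used in the
    regression, the Zipf exponent (negated slope), and the constant C.
    `min_freq` is an illustrative cutoff, not taken from the paper.
    """
    counts = Counter(passwords)
    # Rank 1 is the most frequent password.
    freqs = sorted((f for f in counts.values() if f >= min_freq), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct passwords to fit a line")

    xs = [math.log10(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log10(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    return n, -slope, 10 ** intercept


if __name__ == "__main__":
    # Toy data only; a real dataset would contain millions of entries.
    toy = ["123456"] * 8 + ["password"] * 4 + ["qwerty"] * 2 + ["dragon"]
    n, s, c = fit_zipf(toy)
    print(f"unique passwords used: {n}, slope s: {s:.3f}, C: {c:.3f}")
```

Under these assumptions, the pair (n, s) returned by the sketch corresponds to the two quantities the abstract proposes as a dataset-strength metric.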

Metadata
Available format(s)
PDF
Category
Foundations
Publication info
Published elsewhere. Major revision. IEEE Transactions on Information Forensics and Security
DOI
10.1109/TIFS.2017.2721359
Keywords
User authentication, Password cracking, Statistics
Contact author(s)
pwang @ pku edu cn
History
2018-01-03: last of 22 revisions
2014-08-21: received
Short URL
https://ia.cr/2014/631
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2014/631,
      author = {Ding Wang and Gaopeng Jian and Xinyi Huang and Ping Wang},
      title = {Zipf’s Law in Passwords},
      howpublished = {Cryptology {ePrint} Archive, Paper 2014/631},
      year = {2014},
      doi = {10.1109/TIFS.2017.2721359},
      url = {https://eprint.iacr.org/2014/631}
}