Cryptology ePrint Archive: Report 2016/246
LINGUISTIC CRACKING OF PASSPHRASES USING MARKOV CHAINS
Peder Sparell and Mikael Simovits
Abstract: To help users remember long passwords, they are often advised to create a sentence and assemble it into a long password, a passphrase. In theory, however, a language is very limited and predictable, so a linguistically correct passphrase should, according to Shannon's information theory, be relatively easy to crack compared to a brute-force search.
This work focuses on cracking linguistically correct passphrases, partly to determine to what extent it is advisable to base a password policy on such phrases for the protection of data, and partly because effective methods for cracking phrase-based passwords are not widely available today.
In this work, phrases were generated for further processing by available cracking applications, and the language of the phrases was modeled using a Markov process. In this process, phrases are built up using the number of observed occurrences of consecutive characters or words in a source text, known as n-grams, to determine the possible or probable next character or word in a phrase.
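As an illustration of the n-gram idea described above, the following is a minimal character-level sketch, not the authors' implementation. The function names `build_ngram_model` and `generate_phrases`, the `min_count` threshold, and the enumeration order are all assumptions made for this example; the abstract states only that observed n-gram counts determine the possible or probable next character or word.

```python
from collections import Counter, defaultdict


def build_ngram_model(text, order):
    """Count which character follows each `order`-character context in `text`."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context][text[i + order]] += 1
    return model


def generate_phrases(model, order, length, min_count=1):
    """Yield every string of `length` characters in which each transition
    was observed at least `min_count` times in the source text.

    `min_count` is a hypothetical pruning knob for this sketch.
    """
    if length < order:
        raise ValueError("length must be at least the model order")

    def extend(prefix):
        if len(prefix) == length:
            yield prefix
            return
        successors = model.get(prefix[-order:])
        if successors is None:
            return  # context never seen in the source text
        # Try the most frequently observed successors first.
        for ch, count in successors.most_common():
            if count >= min_count:
                yield from extend(prefix + ch)

    # Every observed context is a candidate start of a phrase.
    for start in sorted(model):
        yield from extend(start)
```

Trained on a toy source such as `"the cat the hat"` with `order=2`, the generator emits not only substrings of the source but also recombinations such as `"hat t"`, which is the point of the model: it proposes candidates that look like the language without having been seen verbatim. The resulting candidate list could then be fed to a cracking application as a wordlist.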
The work shows that by creating models of language, linguistically correct passphrases can be broken in a practical way compared to an exhaustive search. In the tests, passphrases consisting of up to 20 characters were broken.
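To make the comparison with exhaustive search concrete, the sketch below contrasts the size of a brute-force keyspace with an empirical per-character entropy measured from a source text. It is a rough illustration under stated assumptions, not a reproduction of the paper's measurements: the helper names `brute_force_bits` and `conditional_entropy_bits` are hypothetical, and an entropy estimated from a small sample will understate the entropy of real language.

```python
import math
from collections import Counter


def brute_force_bits(alphabet_size, length):
    """Size of the exhaustive search space, in bits: log2(alphabet_size ** length)."""
    return length * math.log2(alphabet_size)


def conditional_entropy_bits(text, order):
    """Empirical entropy, in bits, of the next character given the
    previous `order` characters, estimated from `text`."""
    context_counts = Counter()
    pair_counts = Counter()
    for i in range(len(text) - order):
        context = text[i:i + order]
        pair_counts[(context, text[i + order])] += 1
        context_counts[context] += 1
    total = sum(pair_counts.values())
    entropy = 0.0
    for (context, _), n in pair_counts.items():
        # Joint probability of (context, next char) times the
        # log of the conditional probability of next char given context.
        entropy -= (n / total) * math.log2(n / context_counts[context])
    return entropy
```

For a 20-character phrase over the 26 lowercase letters, `brute_force_bits(26, 20)` is about 94 bits. Multiplying the phrase length by `conditional_entropy_bits` for a large, representative source text gives a crude estimate of how much smaller the linguistically plausible search space is.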
Category / Keywords: Passphrases, Cracking, Markov chains
Date: received 4 Mar 2016, last revised 4 Mar 2016
Contact author: mikael at simovits com
Note: Has been presented as a tutorial at PasswordCon 2015 UK and as a presentation at RSA Conference 2016 USA.
Version: 20160306:164616
Short URL: ia.cr/2016/246