Paper 2024/689
Automated Creation of Source Code Variants of a Cryptographic Hash Function Implementation Using Generative Pre-Trained Transformer Models
Abstract
Generative pre-trained transformers (GPTs) are a type of large language model (LLM) that is unusually adept at producing novel and coherent natural language. Notably, these models have also been extended to computer programming languages with great success. However, GPT model outputs are in general stochastic and not always correct. For programming languages, the exact syntactic and algorithmic specification of the code is strictly required to ensure the security of computing systems and applications. Therefore, using GPT models to generate computer code poses an important security risk, while at the same time allowing for potential innovation in how computer code is generated. In this study, the ability of GPT models to generate novel and correct versions, and notably very insecure versions, of implementations of the cryptographic hash function SHA-1 is examined. The GPT models Llama-2-70b-chat-hf, Mistral-7B-Instruct-v0.1, and zephyr-7b-alpha are used. The GPT models are prompted to re-write each function using a modified version of the localGPT framework and langchain to provide word-embedding context of the full source code and header files to the model, resulting in over
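The retrieval-augmented prompting pipeline described in the abstract can be illustrated with a minimal sketch. This is a hypothetical reconstruction assuming a standard langchain setup, not the authors' actual code: the file names, chunking parameters, target function, and prompt wording are all illustrative assumptions.

```python
# Hypothetical sketch of embedding-backed function re-writing (assumed setup,
# not the paper's exact pipeline): index the C source and header files in a
# vector store, retrieve context relevant to one function, and build a
# re-write prompt for a locally hosted GPT model.
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

# Load the full source code and header files so the model can be given
# whole-program context, as the abstract describes.
docs = []
for path in ["sha1.c", "sha1.h"]:  # assumed file names
    docs.extend(TextLoader(path).load())

# Chunk the files and embed them into a vector store for similarity search.
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100  # illustrative parameters
).split_documents(docs)
store = Chroma.from_documents(chunks, HuggingFaceEmbeddings())

# Retrieve the chunks most relevant to the target function.
function_name = "SHA1Transform"  # example target function, assumed name
context = "\n\n".join(
    d.page_content for d in store.similarity_search(function_name, k=4)
)

# Assemble a re-write prompt; the completion would then be requested from a
# model such as Mistral-7B-Instruct-v0.1 via the (modified) localGPT framework.
prompt = (
    f"Given the following C source code as context:\n\n{context}\n\n"
    f"Re-write the function {function_name} so that it is syntactically "
    f"different but computes the same result."
)
```

In such a setup, each GPT completion would still need to be extracted, parsed as C, compiled, and tested for correctness, since model outputs are stochastic and frequently invalid.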
Metadata
- Available format(s)
- PDF
- Category
- Implementation
- Publication info
- Preprint.
- Keywords
- GPT, SHA-1, Cryptographic Hash Function, C Implementation, Generative Pre-Trained Transformer, Machine Learning, LLM
- Contact author(s)
- elijah pelofske @ protonmail com
- History
- 2024-07-10: revised
- 2024-05-06: received
- Short URL
- https://ia.cr/2024/689
- License
- CC BY
BibTeX
@misc{cryptoeprint:2024/689,
      author = {Elijah Pelofske and Vincent Urias and Lorie M. Liebrock},
      title = {Automated Creation of Source Code Variants of a Cryptographic Hash Function Implementation Using Generative Pre-Trained Transformer Models},
      howpublished = {Cryptology {ePrint} Archive, Paper 2024/689},
      year = {2024},
      url = {https://eprint.iacr.org/2024/689}
}