Paper 2024/586
Encryption Based Covert Channel for Large Language Models
Abstract
Transformer neural networks have gained significant traction since their introduction and have become pivotal across diverse domains. The architecture has proven especially effective in large language models such as Claude and ChatGPT. This paper gives a concise overview of transformer neural networks and examines their security, focusing on covert channel attacks and their implications for the safety of large language models. We present a covert channel based on encryption and demonstrate that it circumvents Claude.ai's security measures. In our experiment, Claude.ai appears to log our queries: it blocked our attack within two days of our first successful breach. This raises two concerns for the community: (1) extensive logging of user inputs by large language models could pose privacy risks for users; (2) the resulting lack of experiment repeatability may deter academic research on the security of such models.
Metadata
- Category: Applications
- Publication info: Preprint.
- Keywords: security of large language models, encryption based covert channels
- Contact author(s): yonwang @ charlotte edu
- History: 2024-06-26: last of 3 revisions; 2024-04-16: received
- Short URL: https://ia.cr/2024/586
- License: CC BY
BibTeX
@misc{cryptoeprint:2024/586,
  author = {Yongge Wang},
  title = {Encryption Based Covert Channel for Large Language Models},
  howpublished = {Cryptology {ePrint} Archive, Paper 2024/586},
  year = {2024},
  url = {https://eprint.iacr.org/2024/586}
}