Paper 2024/586

Encryption Based Covert Channel for Large Language Models

Yongge Wang, University of North Carolina at Charlotte
Abstract

Transformer neural networks have gained significant traction since their introduction, becoming pivotal across diverse domains. In large language models such as Claude and ChatGPT in particular, the transformer architecture has demonstrated remarkable efficacy. This paper provides a concise overview of transformer neural networks and delves into their security considerations, focusing on covert channel attacks and their implications for the safety of large language models. We present an encryption-based covert channel and demonstrate its efficacy in circumventing Claude.ai's security measures. Our experiment reveals that Claude.ai appears to log our queries and blocked our attack within two days of our initial successful breach. This raises two concerns for the community: (1) the extensive logging of user inputs by large language models could pose privacy risks for users, and (2) it may deter academic research on the security of such models due to the lack of experimental repeatability.
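The abstract does not reproduce the paper's actual cipher or prompts, but the general idea of an encryption-based covert channel can be sketched as follows: a query is enciphered before being sent so that plaintext-based content filters never see it, and the model is separately instructed in-context how to decipher it and encipher its reply. The sketch below is illustrative only, using a simple Caesar shift as a stand-in cipher; it is not the construction used in the paper.

```python
# Illustrative sketch of an encryption-based covert channel.
# NOTE: the Caesar shift here is a hypothetical stand-in cipher, not the
# scheme described in the paper; it merely shows the encode/decode pattern.

def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, preserving case."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

def encode_query(query: str, shift: int = 13) -> str:
    # The enciphered query is what would be sent to the model.
    return caesar(query, shift)

def decode_reply(reply: str, shift: int = 13) -> str:
    # The user deciphers the model's enciphered reply locally.
    return caesar(reply, -shift)

# Round trip: the plaintext never appears on the wire.
ciphertext = encode_query("how does the covert channel work")
assert decode_reply(ciphertext) == "how does the covert channel work"
```

Because a simple shift cipher is trivially breakable, its role here is purely to hide keywords from a filter that matches on plaintext, not to provide cryptographic secrecy.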

Metadata
Available format(s)
PDF
Category
Applications
Publication info
Preprint.
Keywords
security of large language models, encryption based covert channels
Contact author(s)
yonwang @ charlotte edu
History
2024-04-24: revised
2024-04-16: received
Short URL
https://ia.cr/2024/586
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2024/586,
      author = {Yongge Wang},
      title = {Encryption Based Covert Channel for Large Language Models},
      howpublished = {Cryptology ePrint Archive, Paper 2024/586},
      year = {2024},
      note = {\url{https://eprint.iacr.org/2024/586}},
      url = {https://eprint.iacr.org/2024/586}
}