Paper 2024/586

Encryption Based Covert Channel for Large Language Models

Yongge Wang, University of North Carolina at Charlotte
Abstract

Transformer neural networks have gained significant traction since their introduction, becoming pivotal across diverse domains. In large language models such as Claude and ChatGPT in particular, the transformer architecture has demonstrated remarkable efficacy. This paper provides a concise overview of transformer neural networks and delves into their security considerations, focusing on covert channel attacks and their implications for the safety of large language models. We present an encryption-based covert channel and demonstrate its efficacy in circumventing Claude.ai's security measures. Our experiment reveals that Claude.ai appears to log our queries and blocked our attack within two days of our initial successful breach. This raises two concerns for the community: (1) the extensive logging of user inputs by large language models could pose privacy risks for users, and (2) it may deter academic research on the security of such models due to the lack of experimental repeatability.
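The abstract does not reproduce the paper's actual cipher or prompts, but the general idea of an encryption-based covert channel can be sketched as follows: a query is enciphered before being sent so that plaintext-based content filters never see it, and the model is separately instructed in-context how to decipher it and encipher its reply. The sketch below is illustrative only, using a simple Caesar shift as a stand-in cipher; it is not the construction used in the paper.

```python
# Illustrative sketch of an encryption-based covert channel.
# NOTE: the Caesar shift here is a hypothetical stand-in cipher, not the
# scheme described in the paper; it merely shows the encode/decode pattern.

def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, preserving case."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

def encode_query(query: str, shift: int = 13) -> str:
    # The enciphered query is what would be sent to the model.
    return caesar(query, shift)

def decode_reply(reply: str, shift: int = 13) -> str:
    # The user deciphers the model's enciphered reply locally.
    return caesar(reply, -shift)

# Round trip: the plaintext never appears on the wire.
ciphertext = encode_query("how does the covert channel work")
assert decode_reply(ciphertext) == "how does the covert channel work"
```

Because a simple shift cipher is trivially breakable, its role here is purely to hide keywords from a filter that matches on plaintext, not to provide cryptographic secrecy.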

Metadata
Available format(s)
PDF
Category
Applications
Publication info
Preprint.
Keywords
security of large language models, encryption based covert channels
Contact author(s)
yonwang @ charlotte edu
History
2024-04-24: revised
2024-04-16: received
Short URL
https://ia.cr/2024/586
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2024/586,
      author = {Yongge Wang},
      title = {Encryption Based Covert Channel for Large Language Models},
      howpublished = {Cryptology ePrint Archive, Paper 2024/586},
      year = {2024},
      note = {\url{https://eprint.iacr.org/2024/586}},
      url = {https://eprint.iacr.org/2024/586}
}