Paper 2025/643

Obfuscation for Deep Neural Networks against Model Extraction: Attack Taxonomy and Defense Optimization

Yulian Sun, Ruhr University Bochum
Vedant Bonde, Huawei Technologies (Germany)
Li Duan, Paderborn University
Yong Li, Huawei Technologies (Germany)
Abstract

Well-trained deep neural networks (DNNs), including large language models (LLMs), are valuable intellectual property assets. To defend against model extraction attacks, one of the major ideas proposed in a large body of previous research is obfuscation: splitting the original DNN and storing the components separately. However, systematically analyzing the security of these methods against various attacks and optimizing the efficiency of defenses remain challenging. In this paper, we propose a taxonomy of model-based extraction attacks, which enables us to identify vulnerabilities in several existing obfuscation methods. We also propose an extremely efficient model obfuscation method called O2Splitter, which uses a trusted execution environment (TEE). The secrets we store in the TEE have O(1) size, i.e., they are independent of the model size. Although O2Splitter relies on a pseudo-random function to provide a quantifiable guarantee for protection and noise compression, it does not require any complicated training or filtering of the weights. Our comprehensive experiments show that O2Splitter can mitigate norm-clipping and fine-tuning attacks. Even for small noise (ϵ = 50), the accuracy of the obfuscated model is close to a random guess, and the tested attacks cannot extract a model with comparable accuracy. In addition, the empirical results shed light on the relation between DP parameters in obfuscation and the risks of concrete extraction attacks.
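To illustrate the flavor of PRF-based obfuscation with an O(1)-size secret, the sketch below derives deterministic per-weight noise from a single key that a TEE would hold. This is a hypothetical illustration, not the paper's O2Splitter construction: the function names, the HMAC-SHA256 PRF, and the uniform noise in [−ϵ, ϵ] are all assumptions made for the example.

```python
import hashlib
import hmac
import struct

def prf_noise(key: bytes, index: int, eps: float) -> float:
    """Derive deterministic pseudo-random noise in [-eps, eps] for one
    weight, keyed by the (hypothetical) TEE-held secret key."""
    digest = hmac.new(key, struct.pack(">Q", index), hashlib.sha256).digest()
    u = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return (2.0 * u - 1.0) * eps

def obfuscate(weights, key: bytes, eps: float):
    """Publish noisy weights; only the O(1)-size key is kept secret."""
    return [w + prf_noise(key, i, eps) for i, w in enumerate(weights)]

def deobfuscate(obf_weights, key: bytes, eps: float):
    """Inside the TEE, the same PRF regenerates and removes the noise."""
    return [w - prf_noise(key, i, eps) for i, w in enumerate(obf_weights)]
```

Because the noise is regenerated from the key and the weight index, nothing proportional to the model size needs to be stored secretly, which is the property the abstract highlights.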

Metadata
Available format(s)
PDF
Category
Applications
Publication info
Published elsewhere. Minor revision. ACNS 2025
Keywords
machine learning model security, model obfuscation, trusted execution environment, intellectual property protection
Contact author(s)
yulian sun @ edu ruhr-uni-bochum de
vedant bonde1 @ huawei com
liduan @ mail upb de
yong li1 @ huawei com
History
2025-04-12: approved
2025-04-08: received
Short URL
https://ia.cr/2025/643
License
Creative Commons Attribution-NonCommercial
CC BY-NC

BibTeX

@misc{cryptoeprint:2025/643,
      author = {Yulian Sun and Vedant Bonde and Li Duan and Yong Li},
      title = {Obfuscation for Deep Neural Networks against Model Extraction: Attack Taxonomy and Defense Optimization},
      howpublished = {Cryptology {ePrint} Archive, Paper 2025/643},
      year = {2025},
      url = {https://eprint.iacr.org/2025/643}
}