Paper 2022/1039
Theoretical Limits of Provable Security Against Model Extraction by Efficient Observational Defenses
Abstract
Can we hope to provide provable security against model extraction attacks? As a step towards a theoretical study of this question, we unify and abstract a wide range of "observational" model extraction defenses (OMEDs) --- roughly, those that attempt to detect model extraction by analyzing the distribution over the adversary's queries. To accompany the abstract OMED, we define the notion of complete OMEDs --- when benign clients can freely interact with the model --- and sound OMEDs --- when adversarial clients are caught and prevented from reverse engineering the model. Our formalism facilitates a simple argument for obtaining provable security against model extraction by complete and sound OMEDs, using (average-case) hardness assumptions for PAC-learning, in a way that abstracts current techniques in the prior literature. The main result of this work establishes a partial computational incompleteness theorem for the OMED: any efficient OMED for a machine learning model computable by a polynomial size decision tree that satisfies a basic form of completeness cannot satisfy soundness, unless the subexponential Learning Parity with Noise (LPN) assumption does not hold. To prove the incompleteness theorem, we introduce a class of model extraction attacks called natural Covert Learning attacks based on a connection to the Covert Learning model of Canetti and Karchmer (TCC '21), and show that such attacks circumvent any defense within our abstract mechanism in a black-box, nonadaptive way. As a further technical contribution, we extend the Covert Learning algorithm of Canetti and Karchmer to work over any "concise" product distribution (albeit for juntas of a logarithmic number of variables rather than polynomial size decision trees), by showing that the technique of learning with a distributional inverter of Binnendyk et al. (ALT '22) remains viable in the Covert Learning setting.
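The incompleteness theorem above is conditioned on the subexponential hardness of Learning Parity with Noise (LPN). As a rough illustration of the distribution that assumption concerns, here is a minimal sketch of an LPN sample generator; the helper name `lpn_samples` and its parameters are hypothetical and not taken from the paper.

```python
import random

def lpn_samples(n, m, noise_rate, seed=0):
    """Illustrative LPN sampler (not from the paper).

    Draws a hidden secret s in {0,1}^n, then returns m samples (a, b)
    where a is a uniform n-bit vector and b = <a, s> + e (mod 2),
    with e ~ Bernoulli(noise_rate). The LPN assumption states that
    recovering s from such samples is hard when noise_rate > 0.
    """
    rng = random.Random(seed)
    s = [rng.randrange(2) for _ in range(n)]  # hidden secret
    samples = []
    for _ in range(m):
        a = [rng.randrange(2) for _ in range(n)]
        noiseless = sum(ai * si for ai, si in zip(a, s)) % 2
        e = 1 if rng.random() < noise_rate else 0  # Bernoulli noise bit
        samples.append((a, noiseless ^ e))
    return s, samples
```

With `noise_rate = 0` the labels are exact inner products and Gaussian elimination recovers the secret; the conjectured hardness, which the paper's attack construction leverages, only kicks in once the noise is nonzero.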
Note: This is an updated version. The paper has been modified to improve readability and argumentation, and some new results have been added in Section 5. The previous version of the paper appeared under the title "The Limits of Provable Security Against Model Extraction." The current version is the same as the version published in IEEE SATML '23, except for minor differences in formatting.
Metadata
- Category
- Attacks and cryptanalysis
- Publication info
- Published elsewhere. IEEE Secure and Trustworthy Machine Learning (SATML)
- Keywords
- Covert Learning, Model Extraction, Provable Security
- Contact author(s)
- arika @ bu edu
- History
- 2023-01-03: revised
- 2022-08-11: received
- Short URL
- https://ia.cr/2022/1039
- License
- CC BY
BibTeX
@misc{cryptoeprint:2022/1039,
      author = {Ari Karchmer},
      title = {Theoretical Limits of Provable Security Against Model Extraction by Efficient Observational Defenses},
      howpublished = {Cryptology {ePrint} Archive, Paper 2022/1039},
      year = {2022},
      url = {https://eprint.iacr.org/2022/1039}
}