Paper 2022/793

LLAMA: A Low Latency Math Library for Secure Inference

Kanav Gupta, Microsoft Research (India)
Deepak Kumaraswamy, Microsoft Research (India)
Nishanth Chandran, Microsoft Research (India)
Divya Gupta, Microsoft Research (India)
Abstract

Secure machine learning (ML) inference can provide meaningful privacy guarantees to both the client (holding sensitive input) and the server (holding sensitive weights of the ML model) while realizing inference-as-a-service. Although many specialized protocols exist for this task, including those in the preprocessing model (where a majority of the overheads are moved to an input-independent offline phase), they all still suffer from large online complexity. Specifically, the protocol phase that executes once the parties know their inputs has high communication, round complexity, and latency. Function Secret Sharing (FSS) based techniques offer an attractive solution to this in the trusted dealer model (where a dealer provides input-independent correlated randomness to both parties), and 2PC protocols built on these techniques have a very lightweight online phase. Unfortunately, current FSS-based 2PC works (AriaNN, PoPETS 2022; Boyle et al. Eurocrypt 2021; Boyle et al. TCC 2019) fall short of providing a complete solution to secure inference. First, they lack support for math functions (e.g., sigmoid and reciprocal square root) and hence are insufficient for a large class of inference algorithms (e.g., recurrent neural networks). Second, they restrict all values in the computation to be of the same bitwidth, which prevents them from benefiting from efficient float-to-fixed converters such as TensorFlow Lite that crucially use low bitwidth representations and mixed bitwidth arithmetic. In this work, we present LLAMA, an end-to-end, FSS-based secure inference library supporting precise low bitwidth computations (required by converters) as well as provably precise math functions, thus overcoming all the drawbacks listed above.
We perform an extensive evaluation of LLAMA and show that when compared with non-FSS-based libraries supporting mixed bitwidth arithmetic and math functions (SIRNN, IEEE S&P 2021), it has at least an order of magnitude lower communication, rounds, and runtimes. We integrate LLAMA with the EzPC framework (IEEE EuroS&P 2019) and demonstrate its robustness by evaluating it on large benchmarks (such as ResNet-50 on the ImageNet dataset) as well as on benchmarks considered in AriaNN; here too, LLAMA outperforms prior work.
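
To make the trusted dealer model described in the abstract concrete, the following is a minimal toy sketch, not LLAMA's actual construction. It assumes an 8-bit ring, a ReLU as the example function, and a naive key that secret-shares the full truth table of the offset function g(y) = f(y - r); practical FSS schemes use far more compact keys. The names `dealer_keygen`, `online_eval`, and `N_BITS` are hypothetical and chosen only for illustration.

```python
import secrets

N_BITS = 8                # toy bitwidth (assumption for illustration)
MOD = 1 << N_BITS         # all arithmetic is modulo 2^N_BITS


def relu(x):
    """ReLU on N_BITS-bit two's-complement values, returned as a ring element."""
    return x if x < MOD // 2 else 0


def dealer_keygen(f):
    """Offline phase: the dealer produces input-independent correlated randomness.

    Each party receives an additive share of a random input mask r and an
    additive share of the truth table of g(y) = f(y - r).
    """
    r = secrets.randbelow(MOD)
    r0 = secrets.randbelow(MOD)
    r1 = (r - r0) % MOD
    k0 = [secrets.randbelow(MOD) for _ in range(MOD)]
    k1 = [(f((y - r) % MOD) - k0[y]) % MOD for y in range(MOD)]
    return (r0, k0), (r1, k1)


def online_eval(x0, x1, party0, party1):
    """Online phase: one round to open the masked input, then local lookups."""
    (r0, k0), (r1, k1) = party0, party1
    # Each party sends its share of x + r; the sum reveals only the masked value.
    masked = ((x0 + r0) + (x1 + r1)) % MOD
    # Each party evaluates its own key locally at the public masked input.
    return k0[masked], k1[masked]


# Usage: the parties hold additive shares of x and end with shares of relu(x).
x = 37
x0 = secrets.randbelow(MOD)
x1 = (x - x0) % MOD
p0, p1 = dealer_keygen(relu)
y0, y1 = online_eval(x0, x1, p0, p1)
assert (y0 + y1) % MOD == relu(x)
```

In this sketch the online phase costs a single exchange of one ring element per party, which illustrates why a dealer-assisted online phase can be lightweight. The truth-table key grows exponentially with the bitwidth, so it is only workable for toy parameters.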

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Published elsewhere. 22nd Privacy Enhancing Technologies Symposium (PETS 2022)
Keywords
Function Secret Sharing, Secure Inference, Secure Two-Party Computation
Contact author(s)
t-kanavgupta @ microsoft com
deepakkumaraswamy99 @ gmail com
nichandr @ microsoft com
divya gupta @ microsoft com
History
2022-06-20: approved
2022-06-20: received
Short URL
https://ia.cr/2022/793
License
Creative Commons Attribution-NonCommercial-NoDerivs
CC BY-NC-ND

BibTeX

@misc{cryptoeprint:2022/793,
      author = {Kanav Gupta and Deepak Kumaraswamy and Nishanth Chandran and Divya Gupta},
      title = {{LLAMA}: A Low Latency Math Library for Secure Inference},
      howpublished = {Cryptology {ePrint} Archive, Paper 2022/793},
      year = {2022},
      url = {https://eprint.iacr.org/2022/793}
}