## Cryptology ePrint Archive: Report 2021/1418

Autoencoder Assist: An Efficient Profiling Attack on High-dimensional Datasets

Qi Lei and Zijia Yang and Qin Wang and Yaoling Ding and Zhe Ma and An Wang

Abstract: Deep learning (DL)-based profiled attack has been proved to be a powerful tool in side-channel analysis. A variety of multi-layer perception (MLP) networks and convolutional neural networks (CNN) are thereby applied to cryptographic algorithm implementations for exploiting correct keys with a smaller number of traces and a shorter time. However, these attacks merely focus on small datasets, in which their points of interest are well-trimmed for attacks. Countermeasures applied in embedded systems always result in high-dimensional side-channel traces, i.e., the high-dimension of each input trace. Time jittering and random delay techniques introduce desynchronization but increase SCA complexity as well. These traces inevitably require complicated designs of neural networks and large sizes of trainable parameters for exploiting the correct keys. Therefore, performing profiled attacks (directly) on high-dimensional datasets is difficult.

To bridge this gap, we propose a dimension reduction tool for high-dimensional traces by combining signal-to-noise ratio (SNR) analysis and autoencoder. With the designed asymmetric undercomplete autoencoder (UAE) architecture, we extract a small group of critical features from numerous time samples. The compression rate by using our UAE method reaches 40x on synchronized datasets and 30x on desynchronized datasets. This preprocessing step facilitates the profiled attacks by extracting potential leakage features. To demonstrate its effectiveness, we evaluate our proposed method on the raw ASCAD dataset with 100,000 samples in each trace. We also derive desynchronized datasets from the raw ASCAD dataset and validate our method under random delay effect. As current MLP and CNN structures cannot exploit the S-box leakage either before or after autoencoder preprocessed traces, here, we further propose a $2^n$-structure MLP network as the attack model. By applying UAE and $2^n$-structure MLP network on these traces, experimental results show that all correct subkeys on synchronized datasets (16 S-boxes) and desynchronized datasets are successfully revealed within hundreds of seconds. This shows that our autoencoder can significantly facilitate DL-based profiled attacks on high-dimensional datasets.

Category / Keywords: Side-channel Analysis, Deep Learning, Autoencoder, Multi-layer Perceptron, Convolutional Neural Networks