Paper 2024/195

PQC-AMX: Accelerating Saber and FrodoKEM on the Apple M1 and M3 SoCs

Décio Luiz Gazzoni Filho, Instituto de Computação, Universidade Estadual de Campinas (UNICAMP), Campinas, Brazil, Department of Electrical Engineering, State University of Londrina, Londrina, Brazil
Guilherme Brandão, Independent Researcher, Londrina, Brazil
Gora Adj, Cryptography Research Centre, Technology Innovation Institute, Abu Dhabi, UAE
Arwa Alblooshi, Cryptography Research Centre, Technology Innovation Institute, Abu Dhabi, UAE
Isaac A. Canales-Martínez, Cryptography Research Centre, Technology Innovation Institute, Abu Dhabi, UAE
Jorge Chávez-Saab, Cryptography Research Centre, Technology Innovation Institute, Abu Dhabi, UAE
Julio López, Instituto de Computação, Universidade Estadual de Campinas (UNICAMP), Campinas, Brazil
Abstract

As CPU performance gains can no longer keep pace with the dramatic growth of the past few decades, CPU architects are turning to domain-specific architectures to accelerate certain tasks. A recent trend is the introduction of matrix-multiplication accelerators to CPUs by manufacturers such as IBM, Intel and ARM, some of which have not launched commercially yet. Apple's systems-on-chip (SoCs) for its mobile phones, tablets and personal computers include a proprietary, undocumented CPU-coupled matrix-multiplication coprocessor called AMX. In this paper, we leverage AMX to accelerate the post-quantum lattice-based cryptosystems Saber and FrodoKEM, and benchmark their performance on Apple M1 and M3 SoCs. We propose a variant of the Toeplitz Matrix-Vector Product algorithm for polynomial multiplication, which sets new speed records for Saber using AMX (up to 13% for the main KEM operations, and 151% for matrix-vector multiplication of polynomials). For FrodoKEM, we set new speed records with our AMX implementation (up to 21% for the main KEM operations, and 124% for matrix multiplication, with even greater improvements for $4\times$ batching). These speedups are relative to our optimized NEON implementation, also presented here, which improves upon the state-of-the-art implementation for ARMv8 CPUs.
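To illustrate the idea behind a Toeplitz Matrix-Vector Product (TMVP), the sketch below casts multiplication in $\mathbb{Z}_q[x]/(x^n + 1)$ as a product of a Toeplitz matrix and a coefficient vector, which is the shape of computation a matrix-multiplication unit handles well. This is the textbook formulation only, not the variant proposed in the paper, and it does not use AMX. The function names and the toy parameters ($n = 4$, $q = 2^{13}$) are assumptions chosen for illustration.

```python
def negacyclic_toeplitz(a, q):
    """Build the n x n Toeplitz matrix T with T @ b = a * b mod (x^n + 1, q).

    Entry (i, j) depends only on i - j: it is a[i - j] on and below the
    diagonal, and -a[n + i - j] above it (the sign flip comes from x^n = -1).
    """
    n = len(a)
    return [
        [a[i - j] % q if i >= j else (-a[n + i - j]) % q for j in range(n)]
        for i in range(n)
    ]


def tmvp_multiply(a, b, q):
    """Multiply polynomials a and b in Z_q[x]/(x^n + 1) as a matrix-vector product."""
    n = len(a)
    T = negacyclic_toeplitz(a, q)
    return [sum(T[i][j] * b[j] for j in range(n)) % q for i in range(n)]


def schoolbook_multiply(a, b, q):
    """Reference implementation: schoolbook product with reduction by x^n = -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                # x^k = -x^(k - n) in the quotient ring
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c


if __name__ == "__main__":
    q = 2**13  # toy modulus chosen for illustration
    a = [1, 2, 3, 4]
    b = [5, 6, 7, 8]
    print(tmvp_multiply(a, b, q))
    print(schoolbook_multiply(a, b, q))
```

Both functions should return the same coefficient vector. The matrix form costs the same number of multiplications as schoolbook here; the appeal is that it exposes the work as dense matrix arithmetic.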

Metadata
Available format(s)
PDF
Category
Implementation
Publication info
Preprint.
Keywords
Post-quantum cryptography, AMX, ARM, NEON, FrodoKEM, Saber
Contact author(s)
decio gazzoni @ ic unicamp br
brandaogbs @ gmail com
gora adj @ tii ae
arwa alblooshi @ tii ae
isaac canales @ tii ae
jorge saab @ tii ae
jlopez @ ic unicamp br
History
2024-02-09: approved
2024-02-09: received
Short URL
https://ia.cr/2024/195
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2024/195,
      author = {Décio Luiz Gazzoni Filho and Guilherme Brandão and Gora Adj and Arwa Alblooshi and Isaac A. Canales-Martínez and Jorge Chávez-Saab and Julio López},
      title = {{PQC}-{AMX}: Accelerating Saber and {FrodoKEM} on the Apple M1 and M3 {SoCs}},
      howpublished = {Cryptology {ePrint} Archive, Paper 2024/195},
      year = {2024},
      url = {https://eprint.iacr.org/2024/195}
}