## Cryptology ePrint Archive: Report 2016/264

How Fast Can Higher-Order Masking Be in Software?

Dahmun Goudarzi and Matthieu Rivain

Abstract: It is widely accepted that higher-order masking is a sound countermeasure to protect implementations of block ciphers against side-channel attacks. The main issue while designing such a countermeasure is to deal with the nonlinear parts of the cipher \textit{i.e.} the so-called s-boxes. The prevailing approach to tackle this issue consists in applying the Ishai-Sahai-Wagner (ISW) scheme from CRYPTO 2003 to some polynomial representation of the s-box. Several efficient constructions have been proposed that follow this approach, but higher-order masking is still considered as a costly (impractical) countermeasure. In this paper, we investigate efficient higher-order masking techniques by conducting a case study on ARM architectures (the most widespread architecture in embedded systems). We follow a bottom-up approach by first investigating the implementation of the base field multiplication at the assembly level. Then we describe optimized low-level implementations of the ISW scheme and its variant (CPRR) due to Coron \textit{et al.} (FSE 2013). Finally we present improved state-of-the-art methods with custom parameters and various implementation-level optimizations. We also investigate an alternative to polynomials methods which is based on bitslicing at the s-box level. We describe new masked bitslice implementations of the AES and PRESENT ciphers. These implementations happen to be significantly faster than (optimized) state-of-the-art polynomial methods. In particular, our bitslice AES masked at order 10 runs in $0.48$ megacycles, which makes $8$ milliseconds in presence of a $60$ MHz clock frequency.

Category / Keywords: implementation / Side-Channel Countermeasures, Higher-Order Masking, Bitslice, ARM