Paper 2016/047
Comb to Pipeline: Fast Software Encryption Revisited
Andrey Bogdanov, Martin M. Lauridsen, and Elmar Tischhauser
Abstract
AES-NI, or Advanced Encryption Standard New Instructions, is an extension of the x86 architecture proposed by Intel in 2008. With a pipelined implementation utilizing AES-NI, parallelizable modes such as AES-CTR become extremely efficient. However, out of the four non-trivial NIST-recommended encryption modes, three are inherently sequential: CBC, CFB, and OFB. This significantly limits the advantage of using AES-NI. Similar observations apply to CMAC, CCM, and a great many other modes. We address this issue by proposing the comb scheduler -- a fast, low-overhead scheduling algorithm based on an efficient look-ahead strategy -- with which sequential modes profit from the AES-NI pipeline in real-world settings by filling it with multiple, independent messages. We apply the comb scheduler to implementations of a wide range of modes on our main target platform, Haswell, a recent Intel microarchitecture. We observe a drastic speed-up by a factor of 5 for NIST's CBC, CFB, OFB, and CMAC, which perform at around 0.88 cycles per byte (cpb). Surprisingly, and contrary to the entire body of previous performance analysis, the throughput of the authenticated encryption (AE) mode CCM gets very close to that of GCM and OCB3, at about 1.64 cpb (vs. 1.63 cpb and 1.51 cpb, respectively), when message lengths are sampled according to a realistic distribution for Internet packets, despite Haswell's heavily improved binary field multiplication. This suggests CCM as an AE mode of choice, as it is NIST-recommended, does not have any weak-key issues like GCM, and is royalty-free as opposed to OCB3. Among the CAESAR contestants, the comb scheduler significantly speeds up CLOC/SILC, JAMBU, and POET; the mostly sequential, nonce-misuse resistant design of POET performs at 2.14 cpb and thereby becomes faster than the well-parallelizable COPA.
Despite Haswell being the target platform, we also include performance figures for the more recent Skylake microarchitecture, which provides further optimizations to AES-NI instructions. Finally, this paper provides the first optimized AES-NI implementations for the novel AE modes OTR, CLOC/SILC, COBRA, POET, McOE-G, and Julius.
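The core idea in the abstract, keeping a pipeline busy by advancing several independent messages through a sequential mode at once, can be illustrated with a minimal Python sketch. This is not the comb scheduler itself: the abstract does not specify its look-ahead strategy, so the sketch uses a naive per-block length check instead. The names `toy_block`, `cbc_single`, and `cbc_interleaved` are hypothetical, and `toy_block` is a hash-based stand-in for an AES call, not a real cipher.

```python
import hashlib

BLOCK = 16  # bytes per block, matching the AES block size


def toy_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for one AES call (assumption: NOT a real cipher)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_single(key, iv, blocks):
    """CBC chaining over one message: each call needs the previous output."""
    out, prev = [], iv
    for p in blocks:
        prev = toy_block(key, xor(prev, p))
        out.append(prev)
    return out


def cbc_interleaved(key, ivs, messages):
    """Advance every unfinished message by one block per round.

    Within a round the block calls do not depend on each other, which is
    the property a pipelined implementation could exploit.
    """
    chains = list(ivs)
    outs = [[] for _ in messages]
    rounds = max((len(m) for m in messages), default=0)
    for r in range(rounds):
        for i, m in enumerate(messages):
            if r < len(m):  # naive length check, not the paper's look-ahead
                chains[i] = toy_block(key, xor(chains[i], m[r]))
                outs[i].append(chains[i])
    return outs


# Three messages of different lengths (in blocks), each with its own IV.
key = b"k" * BLOCK
ivs = [bytes([i]) * BLOCK for i in range(3)]
messages = [
    [bytes([i, j]) * (BLOCK // 2) for j in range(n)]
    for i, n in enumerate([4, 1, 7])
]
interleaved = cbc_interleaved(key, ivs, messages)
sequential = [cbc_single(key, iv, m) for iv, m in zip(ivs, messages)]
```

Interleaving changes only the order in which block calls are issued, so `interleaved` equals `sequential`. The uneven message lengths show why scheduling matters: as short messages finish, fewer independent calls remain per round.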
Metadata
- Category
- Implementation
- Publication info
- A minor revision of an IACR publication in FSE 2015
- DOI
- 10.1007/978-3-662-48116-5_8
- Keywords
- AES-NI, pclmulqdq, Haswell, Skylake, authenticated encryption, CAESAR, CBC, OFB, CFB, CMAC, CCM, GCM, OCB3, OTR, CLOC, COBRA, JAMBU, SILC, McOE-G, COPA, POET, Julius
- Contact author(s)
- mmeh @ dtu dk
- History
- 2016-01-19: received
- Short URL
- https://ia.cr/2016/047
- License
- CC BY
BibTeX
@misc{cryptoeprint:2016/047,
  author = {Andrey Bogdanov and Martin M. Lauridsen and Elmar Tischhauser},
  title = {Comb to Pipeline: Fast Software Encryption Revisited},
  howpublished = {Cryptology {ePrint} Archive, Paper 2016/047},
  year = {2016},
  doi = {10.1007/978-3-662-48116-5_8},
  url = {https://eprint.iacr.org/2016/047}
}