Cryptology ePrint Archive: Report 2021/1236

Architecture Support for Bitslicing

Pantea Kiaei and Tom Conroy and Patrick Schaumont

Abstract: The bitsliced programming model has been shown to boost the throughput of software programs. However, on a standard architecture, it exerts high pressure on register access, causing memory spills and restraining the full potential of bitslicing. In this work, we present architecture support for bitslicing in a System-on-Chip. Our hardware extensions are of two types: internal to the processor core, in the form of custom instructions, and external to the processor, in the form of a direct memory access module with support for data transposition. We present a comprehensive performance evaluation of the proposed enhancements in the context of several RISC-V ISA definitions (RV32I, RV64I, RV32B, RV64B). The proposed 14 new custom instructions use 1.5x fewer registers compared to the equivalent functionality expressed using RISC-V instructions. The integration of those custom instructions in a 5-stage pipelined RISC-V RV32I core requires 4.96% overhead. The proposed bitslice transposition unit with DMA provides a further speedup, changing the quadratic increase in execution time of data transposition to linear. Finally, we demonstrate a comprehensive performance evaluation using a set of benchmarks of lightweight and masked ciphers.
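
For readers unfamiliar with the term, the following is a minimal software sketch of what bitslicing and data transposition mean. It is not the paper's custom instructions or its DMA transposition unit. The function names (to_bitsliced, from_bitsliced, f_bitsliced) and the toy Boolean function are hypothetical and chosen only for illustration. The nested loop in to_bitsliced touches every bit of every block, which is the kind of software transposition cost that dedicated hardware support is meant to take off the processor.

```python
def to_bitsliced(blocks, width):
    """Transpose a list of `width`-bit blocks into `width` slices.

    Bit j of slice i holds bit i of blocks[j].
    """
    slices = [0] * width
    for j, block in enumerate(blocks):   # visit every block ...
        for i in range(width):           # ... and every bit of that block
            slices[i] |= ((block >> i) & 1) << j
    return slices


def from_bitsliced(slices, count):
    """Inverse of to_bitsliced: rebuild `count` blocks from the slices."""
    blocks = [0] * count
    for i, s in enumerate(slices):
        for j in range(count):
            blocks[j] |= ((s >> j) & 1) << i
    return blocks


def f_bitsliced(slices):
    """Toy Boolean function x0 XOR (x1 AND x2), evaluated on all blocks at once.

    Each bitwise operation acts on one bit position of every block in parallel.
    """
    x0, x1, x2 = slices
    return x0 ^ (x1 & x2)


if __name__ == "__main__":
    blocks = [0b110, 0b011, 0b101, 0b111]   # four 3-bit inputs
    slices = to_bitsliced(blocks, 3)
    result = f_bitsliced(slices)            # bit j is the output for blocks[j]
    print([bin(s) for s in slices], bin(result))
```

In this representation each slice occupies one machine word, so a cipher with many state bits needs many live registers at once, which is the register pressure the abstract refers to.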

Category / Keywords: implementation / Bitslicing, instruction set extension, direct memory access, system-on-chip, hardware extension, computer architecture

Date: received 17 Sep 2021

Contact author: pkiaei at wpi edu, tconroy at vt edu, pschaumont at wpi edu


Version: 20210920:114601
