This paper presents a parallel shared-memory implementation of the GaussSieve algorithm, a well known SVP-solver. Our implementation achieves almost linear and linear speedups with up to 64 cores, depending on the tested scenario, and delivers better sequential performance than any other disclosed GaussSieve implementation. In this paper, we show that it is possible to implement a highly scalable version of GaussSieve on multi-core CPU-chips. The key features of our implementation are a lock-free singly linked list, and hand-tuned, vectorized code. Additionally, we propose an algorithmic optimization that leads to faster convergence.
Category / Keywords: Original Publication (with minor differences): SBAC-PAD'14 - 26th International Symposium on Computer Architecture and High Performance Computing Date: received 30 Sep 2014 Contact author: artur mariano at sc tu-darmstadt de Available format(s): PDF | BibTeX Citation Note: Final (full) version of the paper. Version: 20141001:060758 (All versions of this report) Short URL: ia.cr/2014/775 Discussion forum: Show discussion | Start new discussion