Performance Optimization for Sparse AtAx in Parallel on Multicore CPU
Yuan TAO Yangdong DENG Shuai MU Zhenzhong ZHANG Mingfa ZHU Limin XIAO Li RUAN
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2014/02/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Fundamentals of Information Systems
Keywords: sparse AtAx, compressed sparse block, compressed sparse rows, multicore platform
The sparse matrix operation y ← y + AtAx, where A is a sparse matrix and x and y are dense vectors, is a widely used computing pattern in High Performance Computing (HPC) applications. The pattern is challenging to implement efficiently because both the matrix and its transpose are involved. An efficient sparse matrix format, Compressed Sparse Blocks (CSB), has been proposed to provide nearly the same performance for both Ax and Atx. We develop a multithreaded implementation for the CSB format and apply it to compute y ← y + AtAx. Experiments show that our technique outperforms the Compressed Sparse Row (CSR) based solution in POSKI by up to 2.5-fold on over 70% of the benchmark matrices.
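
To make the operation concrete, the following is a minimal serial sketch of y ← y + AtAx over a CSR matrix. It is not the authors' multithreaded CSB implementation; it only illustrates why the kernel touches both A and its transpose. The function name `csr_atax` and the array names `indptr`, `indices`, and `data` are assumptions chosen for illustration. The sketch relies on the identity AtAx = Σ_i (a_i · x) a_i, where a_i is row i of A, so each row is read once and the transpose is never formed.

```python
def csr_atax(n_rows, indptr, indices, data, x, y):
    """Update y in place: y <- y + A^T (A x), with A stored in CSR.

    indptr, indices, data are the usual CSR arrays (hypothetical names):
    row i occupies positions indptr[i] .. indptr[i + 1] - 1.
    """
    for i in range(n_rows):
        start, end = indptr[i], indptr[i + 1]
        # t = (row i of A) . x, i.e. the i-th entry of A x
        t = 0.0
        for k in range(start, end):
            t += data[k] * x[indices[k]]
        # y += t * (row i of A)^T, the contribution of row i to A^T (A x)
        for k in range(start, end):
            y[indices[k]] += t * data[k]
    return y


# Example: A = [[1, 0, 2],
#               [0, 3, 0]]
indptr = [0, 2, 3]
indices = [0, 2, 1]
data = [1.0, 2.0, 3.0]
x = [1.0, 1.0, 1.0]
y = csr_atax(2, indptr, indices, data, x, [0.0, 0.0, 0.0])
print(y)  # [3.0, 9.0, 6.0]
```

In this row-wise form, the scattered updates to y are what make a straightforward parallelization over rows difficult, since different rows may write to the same entry of y.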