Symmetric block sparse matrix-vector multiplication
Abstract:
Embodiments of the present invention are directed to methods and systems for performing block sparse matrix-vector multiplication with improved efficiency. The matrix data is re-ordered in a specific way so that matrix symmetry can be exploited without atomic memory operations or other inefficient memory operations. One disclosed method reorders the matrix data such that, for any column of non-transpose data and any row of transpose data processed simultaneously within a single thread-block on a GPU, all matrix elements update independent elements of the output vector. Using the method, the amount of data required to represent the sparse matrix can be reduced by as much as 50%, thereby doubling the effective performance on the GPU and doubling the size of the matrix that the GPU can accelerate.
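The idea of exploiting symmetry can be illustrated with a small CPU-side sketch in plain Python. It is not the disclosed GPU method: the function names, the dictionary-based block storage, and the greedy batching heuristic are all assumptions made for illustration. The first function stores only the upper-triangular blocks and applies each off-diagonal block twice, once as non-transpose data and once as transpose data. The second groups off-diagonal blocks so that no two blocks in a batch update the same block row of the output, which is the property that would let a batch be processed concurrently without atomics.

```python
def sym_block_spmv(n_block_rows, bs, upper_blocks, x):
    """Compute y = A @ x for a symmetric block sparse matrix A.

    Hypothetical storage: upper_blocks maps (i, j) with i <= j to a
    bs-by-bs block given as a list of rows. Diagonal blocks are assumed
    to be symmetric themselves; the lower triangle is never stored.
    """
    y = [0.0] * (n_block_rows * bs)
    for (i, j), blk in upper_blocks.items():
        for r in range(bs):
            for c in range(bs):
                a = blk[r][c]
                # Non-transpose use: block (i, j) reads x_j and updates y_i.
                y[i * bs + r] += a * x[j * bs + c]
                # Transpose use: the implied block (j, i) reads x_i and updates y_j.
                if i != j:
                    y[j * bs + c] += a * x[i * bs + r]
    return y


def conflict_free_batches(block_coords):
    """Greedily group off-diagonal blocks into conflict-free batches.

    Block (i, j) updates output block rows i and j, so two blocks may
    share a batch only if their {i, j} sets are disjoint. This greedy
    heuristic is an illustrative assumption, not the patented ordering.
    """
    batches = []  # list of (rows_touched, blocks_in_batch)
    for (i, j) in block_coords:
        touched = {i, j}
        for rows, members in batches:
            if not (touched & rows):
                members.append((i, j))
                rows |= touched  # in-place update of the batch's row set
                break
        else:
            batches.append((set(touched), [(i, j)]))
    return [members for _, members in batches]


# Example: 3 block rows of 2x2 blocks, upper triangle only.
blocks = {
    (0, 0): [[4.0, 1.0], [1.0, 3.0]],
    (0, 2): [[1.0, 2.0], [0.0, 1.0]],
    (1, 1): [[2.0, 0.0], [0.0, 2.0]],
    (1, 2): [[0.0, 1.0], [1.0, 0.0]],
    (2, 2): [[5.0, 1.0], [1.0, 5.0]],
}
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = sym_block_spmv(3, 2, blocks, x)
batches = conflict_free_batches([(0, 1), (2, 3), (0, 2), (1, 3)])
```

In this example only two of the four off-diagonal blocks of the full matrix are stored, which is where the storage saving of up to 50% comes from. On a GPU, each batch would correspond to work whose output updates do not collide.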