RSH-SpMM: A Row-Structured Hybrid Kernel for Sparse Matrix-Matrix Multiplication on GPUs
The paper presents RSH-SpMM, a fine-grained, row-structured hybrid kernel for Sparse Matrix-Matrix Multiplication (SpMM) on GPUs. It combines adaptive row partitioning, the RS-Tile representation, and load-balanced reordering to handle extreme sparsity irregularity, achieving 1.27x to 6.13x speedups over state-of-the-art methods.
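
For readers unfamiliar with the operation itself, the following is a minimal reference sketch of SpMM, C = A × B, with the sparse matrix A stored in the standard CSR format and B dense. It is not the RSH-SpMM kernel and does not model RS-Tile or the paper's partitioning; the function names `csr_spmm` and `row_lengths` and the example matrix are illustrative assumptions. The per-row nonzero counts show the kind of row-length irregularity that a GPU kernel has to balance across threads.

```python
def csr_spmm(indptr, indices, data, B):
    """Reference SpMM: C = A @ B, with A in CSR form and B a dense list of rows."""
    n_rows = len(indptr) - 1
    n_cols = len(B[0]) if B else 0
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Nonzeros of row i occupy positions indptr[i] .. indptr[i + 1] - 1.
        for k in range(indptr[i], indptr[i + 1]):
            j, a = indices[k], data[k]
            row_b = B[j]
            for c in range(n_cols):
                C[i][c] += a * row_b[c]
    return C


def row_lengths(indptr):
    """Number of nonzeros in each row of a CSR matrix."""
    return [indptr[i + 1] - indptr[i] for i in range(len(indptr) - 1)]


# Hypothetical 3x4 sparse matrix A: row 0 has three nonzeros,
# row 1 is empty, and row 2 has one nonzero.
indptr = [0, 3, 3, 4]
indices = [0, 1, 3, 2]
data = [1.0, 2.0, 3.0, 4.0]

# Dense 4x2 matrix B.
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]

C = csr_spmm(indptr, indices, data, B)
lengths = row_lengths(indptr)
print(C)        # [[7.0, 8.0], [0.0, 0.0], [4.0, 4.0]]
print(lengths)  # [3, 0, 1]
```

In this toy case, a scheme that assigns one thread per row would leave the thread for row 1 idle while the thread for row 0 does three times the work of the thread for row 2; real matrices with extreme irregularity amplify this imbalance considerably.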