Report ID
1996-18
Report Authors
Cong Fu and Tao Yang
Report Date
Abstract
Sparse LU factorization with partial pivoting is important to many scientificapplications, but the effective parallelization of this algorithm is still anopen problem. The main difficulty is that partial pivoting operations makestructures of $L$ and $U$ factors unpredictable beforehand. This paperpresents a novel approach called $S^*$ for parallelizing this problem ondistributed memory machines. $S^*$ incorporates static symbolic factorizationto avoid run-time control overhead and uses nonsymmetric L/U supernodepartitioning and amalgamation strategies to maximize the use of BLAS-3routines. The irregular task parallelism embedded in sparse LU is exploitedusing the RAPID run-time system which optimizes asynchronous communication,overlaps computation with communication and balances processor loads. Theexperimental results on the Cray-T3D with a set of Harwell-Boeing nonsymmetricmatrices are very encouraging and good scalability has been achieved. Evencompared to a highly optimized sequential code, the parallel speedups are stillimpressive considering the current status of sparse LU research.
Document
1996-18.ps384.95 KB