Report ID
1997-11
Report Authors
Cong Fu, Xiangmin Jiao, and Tao Yang
Report Date
Abstract
Gaussian elimination based sparse LU factorization with partial pivoting isimportant to many scientific applications, but it is still an open problem todevelop a high performance sparse LU code on distributed memory machines. Themain difficulty is that partial pivoting operations make structures of $L$ and$U$ factors unpredictable beforehand. This paper presents an approach called$S^*$ for parallelizing this problem on distributed memory machines. The $S^*$approach adopts static symbolic factorization to avoid run-time controloverhead, incorporates 2-D L/U supernode partitioning and amalgamationstrategies to improve caching performance, and exploits irregular taskparallelism embedded in sparse LU using asynchronous computation scheduling.The paper discusses and compares the algorithms using 1-D and 2-D data mappingschemes, and presents experimental studies on Cray-T3D and T3E. The megaflopperformance results for a set of nonsymmetric benchmark matrices are veryencouraging.
Document
1997-11.ps409.9 KB