in ``Parallel, Neural, Systolic Algorithms and Applications,'' (Michael P. Bekakos, Ed.), Advances in High Performance Computing, Volume 5, Computational Mechanics Publications (to appear).

Ömer Egecioglu, Peter Cappello and Chris Scheiman

Introduction to Processor-Time-Optimal Systolic Arrays

Abstract. We consider computations suitable for systolic arrays, often called regular array computations or systems of uniform recurrence relations. In such computations, the tasks to be computed are viewed as the nodes of a directed acyclic graph (dag), whose arcs represent the data dependencies. A processor-time-minimal schedule is a time-minimal schedule that uses as few processors as possible; it thus measures the minimum number of processors needed to extract the maximum parallelism from the dag. We present a technique for finding a lower bound on the number of processors needed to achieve a given schedule of an algorithm represented as a dag. The application of this technique is illustrated with a tensor product computation. We then consider the free schedule of algorithms for matrix product, Gaussian elimination, and transitive closure. For each problem, we provide a time-minimal processor schedule that meets the computed processor lower bound, including the one for tensor product.
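As a concrete illustration of the dag model and the free schedule (a sketch, not from the paper: the dag, node names, and the recursive level computation are illustrative assumptions), the free schedule fires each task one time step after all of its predecessors have completed, and the number of processors a schedule needs is the largest number of tasks assigned to any single time step:

```python
from collections import defaultdict

def free_schedule(nodes, arcs):
    """Free (earliest-possible) schedule of a dag:
    each task fires one step after all its predecessors."""
    preds = defaultdict(list)
    for u, v in arcs:
        preds[v].append(u)
    time = {}

    def t(v):
        # Time step of v = 1 + longest chain of dependencies ending at v.
        if v not in time:
            time[v] = 1 + max((t(u) for u in preds[v]), default=0)
        return time[v]

    for v in nodes:
        t(v)
    return time

# Illustrative diamond-shaped dag: a -> b, a -> c, b -> d, c -> d.
nodes = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
sched = free_schedule(nodes, arcs)  # {"a": 1, "b": 2, "c": 2, "d": 3}

# Processor requirement of this schedule: the maximum number of
# tasks sharing a time step (here b and c run concurrently, so 2).
width = max(list(sched.values()).count(s) for s in set(sched.values()))
```

The lower-bound technique in the paper asks whether this width is actually forced: whether any time-minimal schedule for the dag can get by with fewer processors than the free schedule happens to use.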

omer@cs.ucsb.edu