J. of Parallel Algorithms and Applications, Special Issue on Advanced Regular Array Design (to appear).

Ömer Egecioglu, Peter Cappello and Chris Scheiman

Processor-Time-Optimal Systolic Arrays

Abstract. Minimizing the amount of time and number of processors needed to perform an application reduces the application's fabrication cost and operation costs. A directed acyclic graph (dag) model of algorithms is used to define a time-minimal schedule and a processor-time-minimal schedule. We present a technique for finding a lower bound on the number of processors needed to achieve a given schedule of an algorithm. The application of this technique is illustrated with a tensor product computation. We then apply the technique to the free schedule of algorithms for matrix product, Gaussian elimination, and transitive cloure. For each, we provide a time-minimal processor schedule that meets these processor lower bounds, including the one for tensor product.

omer@cs.ucsb.edu