Report ID
1996-14
Report Authors
Mihai Florin Ionescu
Report Date
Abstract
Sorting is an important component of many applications and parallel sortingalgorithms have been studied extensively in the last three decades. One of theearliest parallel sorting algorithms is Bitonic Sort which is represented by asorting network consisting of multiple butterfly stages. Not surprisingly, thealgorithm has been studied on different parallel network topologies whichprovide an easy embedding of butterflies, such as the hypercube or theshuffle-exchange.This thesis studies bitonic sort on modern parallel machines. These machinesare relatively coarse grained and consist of only a modest number of nodes.Thus many data elements have to be mapped onto each processor. Under such asetting optimizing the bitonic sort algorithm becomes a question of mapping thedata elements to processing nodes (data layout) to minimize communication andoptimizing the local computation on each node. We developed a bitonic sortalgorithm which minimizes the number of communication steps and optimizes thelocal computation. We have implemented the new algorithm on the Meiko CS-2,analyzed and evaluated its performance under the LogP and LogGP models ofparallel computation. The resulting algorithm is faster than the previousimplementations, as experimental results collected on a 64 node Meiko CS-2show.
Document
1996-14.ps2.01 MB