Int. J. of Foundations of Computer Sci. 10 (1), 1999, pp. 81-102.

Ömer Egecioglu, Amr El Abbadi, and Kamil Sarac

DFT Techniques for Size Estimation of Database Join Operations

Abstract. Novel algorithms based on the Discrete Fourier Transform (DFT) are proposed to estimate the size of relations resulting from join operations. We start with an approach in which the frequency distribution values are transformed using the DFT and the Fourier coefficients are used to construct histograms. Our main contribution is a direct approach which uses the amplitudes of the DFT coefficients iteratively. The proposed algorithm gives the exact join size using logarithmic space for the special case of self join. A generalization to compute the join of arbitrary relations is then used to develop two tree-based techniques that provide a spectrum of algorithms which interpolate storage requirements versus accuracy of the estimation obtained. Finally, we present experimental results to exhibit the effectiveness of our approach.

omer@cs.ucsb.edu