Proc. 7th Int. Conf. on Information and Knowledge Management (CIKM'98), Nov. 1998 Washington DC, pp. 348-355.

Ömer Egecioglu, Amr El Abbadi, and Kamil Sarac

Iterated DFT Based Techniques for Join Size Estimation

Abstract. In this paper, we propose techniques based on the Discrete Fourier Transform (DFT) to estimate the size of relations resulting from join operations. The basic idea is the iterated application of DFT to attribute values modulo the phase information. The first version of this method called Approximation by Absolute Value (AAV) gives a logarithmic space representation which is exact for self join operations. Its generalizations Tree Approximation Algorithms (TAA) and Tree Approximation Algorithms with Truncation (TAAT) are built upon a binary tree representation of the vectors obtained through the iterated application of the DFT. TAA uses AAV as a subprocedure at the lower levels of the tree, whereas TAAT is obtained by truncating the tree at an appropriate level. Both TAA and TAAT provide a spectrum of algorithms that interpolate storage requirements versus accuracy of the estimates obtained.

omer@cs.ucsb.edu