Proc. 7th Int. Conf. on Information and Knowledge Management (CIKM'98),
Nov. 1998 Washington DC, pp. 348-355.
Ömer Egecioglu, Amr El Abbadi, and Kamil Sarac
Iterated DFT Based Techniques for Join Size Estimation
Abstract.
In this paper, we propose techniques
based on
the Discrete Fourier Transform (DFT)
to estimate the size of relations resulting
from join operations.
The basic idea is the iterated application of DFT
to attribute values modulo the phase information. The first version of this
method
called Approximation by Absolute Value (AAV) gives a
logarithmic space representation which
is exact for self join operations. Its generalizations
Tree Approximation Algorithms (TAA) and Tree Approximation
Algorithms with Truncation (TAAT)
are built upon a binary tree representation of the
vectors obtained through the
iterated application of the DFT. TAA uses AAV as a
subprocedure at the lower levels of the tree, whereas TAAT is
obtained by truncating the tree at an appropriate level.
Both TAA and TAAT
provide a spectrum of algorithms that
interpolate storage requirements versus accuracy of
the estimates obtained.
omer@cs.ucsb.edu