Report ID
2002-08
Report Authors
Mirek Riedewald, Divyakant Agrawal, Amr El Abbadi, and Flip Korn
Report Date
Abstract
Aggregation plays an important role in data warehousing and has received much attention to date. However, existing techniques do not sufficiently address the issue of making the computation of multidimensional aggregates scale with increasing dimensionality. At the same time, the benefit of taking full advantage of the hard disk geometry is often overlooked. This paper presents the Multiresolution File Scan (MFS) approach, which is based on a selection of flat files that are accessed with fast sequential I/O operations. Its simple structure and low storage overhead allow MFS to scale to high dimensionality while making the best use of the increasing transfer speed of modern hard disks. We show that MFS outperforms multidimensional index structures, even if these structures are bulk-loaded and hence optimized for query processing. Our approach can incorporate a priori knowledge about the query workload and is applicable to all distributive (e.g., COUNT, SUM, MAX, MIN) and algebraic (e.g., AVG) aggregate operators.
Document
2002-08.ps (865.8 KB)