Report ID
2001-03
Report Authors
Chengyu Sun, Divyakant Agrawal, Amr El Abbadi
Report Date
Abstract
As online spatial datasets grow both in number and sophistication,it becomes increasingly difficult for users to decide whether a datasetis suitable for their tasks, especially when they do not have priorknowledge of the dataset. In this paper, wepropose {\\em browsing} as an effective and efficient way to explorethe content of a spatial dataset. Browsing allows users to view thesize of a result set before evaluating the query at the database,thereby avoiding zero-hit/mega-hit queries and saving time and resources.Although the underlying technique supporting browsing issimilar to range query aggregation and selectivity estimation,spatial dataset browsing poses some unique challenges.In this paper, we identify a set of spatial relations that need to be supported in browsing applications, namely, the {\\em contains},{\\em contained} and the {\\em overlap} relations. We prove a lowerbound on the storage required to answer queries about the {\\em contains}relation accurately at a given resolution. We then present threestorage-efficient approximation algorithms which we believe to be thefirst to estimate query results about these spatial relations.We evaluate these algorithms with both synthetic and real worlddatasets and show that they provide highly accurate estimates fordatasets with various characteristics.
Document
2001-03.ps4.59 MB