Distributed Spatial Clustering in Sensor Networks

Report ID: 
A. Meka and A. K. Singh
2005-02-01 04:00:00


Sensor networks monitor physical phenomena (e.g.,temperature~\\cite{tao}, contaminant flows~\\cite{cflows}) over largegeographic regions. Scientists can gain valuable insights intophysical phenomena if they understand the structure and shape inherentin the underlying data distribution. A concise way to extract thisstructure is through \\emph{spatial clustering}, which partitions thenetwork into a set of spatial regions with similar observations. Webuild data models locally at each node and cluster based on modelcoefficients (rather than on raw data) to capture trends andseasonalities in spatial regions. Spatial clustering generates$\\epsilon$-$\\delta$ clusters which captures \\emph{local}dissimilarities by the parameter $\\epsilon$ and \\emph{global}dissimilarities by the parameter $\\delta$. Transmitting these modelsto a centralized site for clustering incurs communicationoverhead. Therefore, we present efficient distributed algorithms togenerate high quality clusters. We also present efficient updatealgorithms to continuously track the clusters as the phenomenaevolves. Using these spatial clusters, we show that range-queries canprune large portions of network leading to huge communicationgains. Experimental results on both real world and synthetic data setsshow that the quality of clusters generated is comparable to thecentralized algorithm and \\emph{in-network} clustering and modelingcombined together reduces the communication cost by two orders ofmagnitude.