Identifying Value in Crowdsourced Wireless Signal Measurements
Zhijing Li
Ana Nika
Xinyi Zhang
Yanzi Zhu
Yuanshun Yao
Ben Y. Zhao
Haitao Zheng
Proceedings of the 26th World Wide Web Conference (WWW 2017)
Paper Abstract
While crowdsourcing is an attractive way to collect large-scale
wireless measurements, understanding the quality and variance of
the resulting data is difficult. Our work analyzes the quality of
crowdsourced cellular signal measurements in the context of basestation
localization, using large international public datasets (419M
signal measurements and ~1M cells) and corresponding ground
truth values. Performing localization using raw received signal
strength (RSS) data produces poor results and very high variance.
Applying supervised learning improves results moderately, but variance
remains high. Instead, we propose feature clustering, a novel
application of unsupervised learning to detect hidden correlation
between measurement instances, their features, and localization
accuracy. Our results identify RSS standard deviation and RSS-weighted
dispersion mean as key features that correlate with highly
predictive measurement samples for sparse and dense measurements,
respectively. Finally, we show how optimizing crowdsourcing
measurements for these two features dramatically improves
localization accuracy and reduces variance.
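
The abstract does not define its localization method or its two features, so the following is only a minimal sketch under stated assumptions. It assumes measurements are (x, y, RSS in dBm) tuples in a local planar coordinate system, estimates a basestation position with a weighted centroid (a common RSS baseline, not necessarily the one used in the paper), and computes one plausible reading of each feature. The function names and the definition of the RSS-weighted dispersion mean (the power-weighted mean distance of samples from the weighted centroid) are hypothetical.

```python
import math
from statistics import pstdev


def dbm_to_mw(rss_dbm):
    """Convert an RSS reading in dBm to linear power in milliwatts."""
    return 10 ** (rss_dbm / 10.0)


def weighted_centroid(samples):
    """Estimate a transmitter position as the power-weighted mean of sample positions.

    samples: list of (x, y, rss_dbm) tuples; x and y are assumed planar coordinates.
    """
    weights = [dbm_to_mw(rss) for _, _, rss in samples]
    total = sum(weights)
    cx = sum(w * x for w, (x, _, _) in zip(weights, samples)) / total
    cy = sum(w * y for w, (_, y, _) in zip(weights, samples)) / total
    return cx, cy


def rss_std(samples):
    """Population standard deviation of the RSS values (in dB)."""
    return pstdev([rss for _, _, rss in samples])


def rss_weighted_dispersion_mean(samples):
    """Assumed definition: power-weighted mean distance from the weighted centroid."""
    cx, cy = weighted_centroid(samples)
    weights = [dbm_to_mw(rss) for _, _, rss in samples]
    dists = [math.hypot(x - cx, y - cy) for x, y, _ in samples]
    return sum(w * d for w, d in zip(weights, dists)) / sum(weights)
```

With features like these, a set of measurements for one cell could be scored before localization, for example by calling `rss_std(samples)` for sparsely measured cells and `rss_weighted_dispersion_mean(samples)` for densely measured ones, mirroring the pairing the abstract describes.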