Yangtze

Data summarization and estimation serve a diverse set of applications, ranging from traditional database query optimization to OLAP and the exploration of large data sets. Data estimation and summarization techniques will be developed for datasets with different modalities: point datasets, datasets containing objects with spatial extents, and stream datasets. The approach is based on the use of histograms in each of these contexts. A main difficulty in applying estimation techniques in a data stream processing system is that the distribution of data in an evolving stream is not known in advance. To be useful, an estimation technique must therefore be effective across different data distributions and query patterns. For stream datasets, a variety of approaches that are amenable to maintaining histograms dynamically will be explored.
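As a rough illustration of what maintaining a histogram dynamically over a stream can look like, the sketch below keeps an equi-width histogram that is updated one item at a time and answers approximate range-count queries. It is not the method of any publication listed on this page. The class name `StreamingHistogram`, its methods, the fixed value domain, and the assumption that values are spread uniformly within a bucket are all hypothetical choices made for this example.

```python
class StreamingHistogram:
    """Equi-width histogram over a fixed domain [lo, hi), updated per stream item.

    Hypothetical example class; values outside the domain are clamped
    into the first or last bucket.
    """

    def __init__(self, lo, hi, num_buckets):
        self.lo = lo
        self.hi = hi
        self.width = (hi - lo) / num_buckets
        self.counts = [0] * num_buckets

    def _bucket(self, value):
        # Map a value to its bucket index, clamping to the valid range.
        idx = int((value - self.lo) / self.width)
        return min(max(idx, 0), len(self.counts) - 1)

    def insert(self, value):
        # A new item arrives on the stream.
        self.counts[self._bucket(value)] += 1

    def delete(self, value):
        # An item expires, for example when it leaves a sliding window.
        self.counts[self._bucket(value)] -= 1

    def estimate_range_count(self, a, b):
        """Estimate the number of items in [a, b).

        Assumes items are uniformly distributed inside each bucket, so a
        partially covered bucket contributes in proportion to the overlap.
        """
        total = 0.0
        for i, count in enumerate(self.counts):
            b_lo = self.lo + i * self.width
            b_hi = b_lo + self.width
            overlap = max(0.0, min(b, b_hi) - max(a, b_lo))
            total += count * overlap / self.width
        return total


hist = StreamingHistogram(0, 100, 10)
for v in range(100):
    hist.insert(v)
print(hist.estimate_range_count(25, 35))
```

Fixed bucket boundaries keep each update constant-time, but they cannot follow a distribution that drifts over time, which is the difficulty the paragraph above describes.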

Principal Investigators

Graduate Student Researchers

  • Stacy Patterson
  • Shyam Antony
  • Ahmed Metwally (Ph.D. 2007)
  • Fatih Emekci (Ph.D. 2006)
  • Ozgur Sahin (Ph.D. 2006)
  • Huagang Li (Ph.D. 2006)
  • Ying Feng (Ph.D. 2005)
  • Lin Qiao (Ph.D. 2005)

Acknowledgement

This project is based upon work supported by the National Science Foundation under Grant No. IIS 0223022.

Selected Publications:
  • Using Tomography for Ubiquitous Sensing. Stacy Patterson, Bassam Bamieh, and Amr El Abbadi, 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS), 2008. pdf
  • Distributed Consensus with Link Failures as a Structured Stochastic Uncertainty Problem. Stacy Patterson and Bassam Bamieh, 2008 Allerton Conference on Communication, Control, and Computing. pdf
  • SLEUTH: Single-pubLisher attack dEtection Using correlaTion Hunting. Ahmed Metwally, Fatih Emekci, Divyakant Agrawal, and Amr El Abbadi, Proceedings of the VLDB Endowment, vol. 1 (2), 2008, pages 1217-1228. pdf
  • On the Feasibility of Large-Scale Automated Highways (invited paper). Stacy Patterson, Bassam Bamieh, Amr El Abbadi, and Mihailo Jovanovic, First International Workshop on Computational Transportation Science, 2008. pdf
  • MOOLAP: Towards Multi-Objective OLAP. Shyam Antony, Ping Wu, Divyakant Agrawal, and Amr El Abbadi, The 24th International Conference on Data Engineering (ICDE), 2008, pages 1394-1396. pdf
  • Environmental Tomography: Ubiquitous Sensing with Mobile Devices (demonstration). Stacy Patterson, Bassam Bamieh, and Amr El Abbadi, The 24th International Conference on Data Engineering (ICDE), 2008, pages 1560-1563. pdf
  • Distributed Average Consensus with Stochastic Communication Failures. Stacy Patterson, Bassam Bamieh, and Amr El Abbadi, 46th IEEE Conference on Decision and Control (CDC), 2007, pages 4215-4220. pdf
  • Privacy preserving decision tree learning over multiple parties. Fatih Emekci, Ozgur D. Sahin, Divyakant Agrawal, and Amr El Abbadi, Data and Knowledge Engineering, vol. 63 (2), 2007, pages 348-361. pdf
  • Progressive Ranking of Range Aggregates. Huagang Li, Hailing Yu, Divyakant Agrawal, and Amr El Abbadi, Data and Knowledge Engineering, vol. 63 (1), 2007, pages 4-25. pdf
  • On Hit Inflation Techniques and Detection in Streams of Web Advertising Networks, Ahmed Metwally, Divyakant Agrawal, Amr El Abbadi, and Qi Zheng, ICDCS 2007. pdf
  • Fast Algorithms for Heavy Distinct Hitters using Associative Memories, Nagender Bandi, Divyakant Agrawal and Amr El Abbadi, ICDCS 2007. pdf
  • Fast data stream algorithms using associative memories. Nagender Bandi, Ahmed Metwally, Divyakant Agrawal and Amr El Abbadi, SIGMOD 2007, pp. 247-256. pdf
  • DETECTIVES: DETEcting Coalition hiT Inflation attacks in adVertising nEtworks Streams. Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi, WWW 2007, pp. 241-250. pdf
  • TCAM-Conscious Algorithms for Data Streams, Nagender Bandi, Ahmed Metwally, Divyakant Agrawal and Amr El Abbadi, ICDE 2007, pp. 1342-1344. (Poster) pdf
  • DeltaSky: Optimal Maintenance of Skyline Deletions without Exclusive Dominance Region Generation. Ping Wu, Divyakant Agrawal, Omer Egecioglu and Amr El Abbadi, ICDE 2007, pp. 486-495. pdf
  • FLUX: Content and Structure Matching of XPath Queries with Range Predicates, Hua-Gang Li, S. Alireza Aghili, Divyakant Agrawal and Amr El Abbadi, XSym 2006, pp. 61-76. pdf
  • FLUX: fuzzy content and structure matching of XML range queries. S. Alireza Aghili, Hua-Gang Li, Divyakant Agrawal, and Amr El Abbadi, WWW 2006, pp. 1081-1082. pdf
  • TWIX: twig structure and content matching of selective queries using binary labeling. S. Alireza Aghili, Hua-Gang Li, Divyakant Agrawal, and Amr El Abbadi, Infoscale 2006, p. 42. pdf
  • Privacy Preserving Query Processing Using Third Parties. Fatih Emekci, Divyakant Agrawal, Amr El Abbadi and Aziz Gulbeden, ICDE 2006, p. 27. pdf
  • Parallelizing Skyline Queries for Scalable Distribution. Ping Wu, Caijie Zhang, Ying Feng, Ben Y. Zhao, Divyakant Agrawal, and Amr El Abbadi, EDBT 2006, pp. 112-130. pdf
  • Exploring spatial datasets with histograms. Chengyu Sun, Nagender Bandi, Divyakant Agrawal and Amr El Abbadi, Distributed and Parallel Databases 2006, pp. 57-88. pdf
  • Optimal Data-Space Partitioning of Spatial Data for Parallel I/O. Hakan Ferhatosmanoglu, Divyakant Agrawal, Omer Egecioglu, and Amr El Abbadi, Distributed and Parallel Databases 17(1): 75-101 (2005). pdf
  • PADS: Protein Structure Alignment Using Directional Shape Signatures. S. Alireza Aghili, Divyakant Agrawal, and Amr El Abbadi, DASFAA 2005, pp. 17-29. pdf
  • Exploiting Temporal Correlation in Temporal Data Warehouses. Ying Feng, Hua-Gang Li, Divyakant Agrawal, and Amr El Abbadi, DASFAA 2005, pp. 662-674. pdf
  • Attribute-Based Access to Distributed Data over P2P Networks. Divyakant Agrawal, Amr El Abbadi, and Subhash Suri, DNIS 2005, pp. 244-263. pdf
  • Multiple query optimization in middleware using query teamwork. Kevin O'Gorman, Amr El Abbadi and Divyakant Agrawal, Software Practice and Experience, vol 35, no 4, 2004, pp. 361-391. pdf
  • Distributed Resource Discovery in Large Scale Computing Systems. Abhishek Gupta, Divyakant Agrawal, and Amr El Abbadi, SAINT 2005, pp. 320-326. pdf
  • Techniques for Efficient Routing and Load Balancing in Content-Addressable Networks. Ozgur D. Sahin, Divyakant Agrawal and Amr El Abbadi, IEEE P2P 2005, pp. 67-74. pdf
  • Efficient Processing of Distributed Top-k Queries. Hailing Yu, Hua-Gang Li, Ping Wu, Divyakant Agrawal, and Amr El Abbadi, DEXA 2005, pp. 65-74. pdf
  • Progressive Ranking of Range Aggregates. Hua-Gang Li, Hailing Yu, Divyakant Agrawal, and Amr El Abbadi, DaWaK 2005, September 2005, pp. 179-189. pdf
  • Scalable ranking for preference queries. Ying Feng, Divyakant Agrawal, Amr El Abbadi, and Ambuj K. Singh, CIKM 2005, pp. 313-314. pdf
  • ABACUS: A Distributed Middleware for Privacy Preserving Data Sharing Across Private Data Warehouses. Fatih Emekci, Divyakant Agrawal, and Amr El Abbadi, Middleware 2005, pp. 21-41. pdf
  • PRISM: indexing multi-dimensional data in P2P networks using reference vectors. Ozgur D. Sahin, Aziz Gulbeden, Fatih Emekci, Divyakant Agrawal, and Amr El Abbadi, ACM International Conference on Multimedia, pp. 946-955. pdf
  • PRoBe: Multi-dimensional Range Queries in P2P Networks. Ozgur D. Sahin, S. Antony, Divyakant Agrawal and Amr El Abbadi, WISE 2005, pp. 332-346. pdf
  • Guaranteeing Correctness of Lock-free Range Queries over P2P Data. Stacy Patterson, Divyakant Agrawal, and Amr El Abbadi, The Third International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2005) with VLDB 2005. pdf
  • SPiDeR: P2P-Based Web Service Discovery. Ozgur D. Sahin, Cagdas Evren Gerede, Divyakant Agrawal, Amr El Abbadi, Oscar H. Ibarra, and Jianwen Su, ICSOC 2005, pp. 157-169. pdf
  • A Peer-to-Peer Framework for Web Service Discovery with Ranking. Fatih Emekci, Ozgur D. Sahin, Divyakant Agrawal, and Amr El Abbadi, ICWS 2004, pp. 192-199. pdf
  • Protein structure alignment using geometrical features. S. Alireza Aghili, Divyakant Agrawal, and Amr El Abbadi, CIKM 2004, pp. 148-149. pdf
  • Meghdoot: Content-Based Publish/Subscribe over P2P Networks. Abhishek Gupta, Ozgur D. Sahin, Divyakant Agrawal, and Amr El Abbadi, Middleware 2004, pp. 254-273. pdf
  • A Peer-to-peer Framework for Caching Range Queries. Ozgur D. Sahin, Abhishek Gupta, Divyakant Agrawal, and Amr El Abbadi, ICDE 2004, pp. 165-176. pdf
  • Range CUBE: Efficient Cube Computation by Exploiting Data Correlation. Ying Feng, Divyakant Agrawal, Amr El Abbadi, and Ahmed Metwally, ICDE 2004, pp. 658-670. pdf
  • BINOCULAR: A System Monitoring Framework. Fatih Emekci, S. Emre Tuna, Divyakant Agrawal, Amr El Abbadi, DMSN 2004, pp. 5-9. pdf
  • Supporting Sliding Window Queries for Continuous Data Streams, Lin Qiao, Divyakant Agrawal, and Amr El Abbadi, SSDBM 2003, pp. 85-96. pdf
  • RHist: Adaptive Summarization over Continuous Data Streams, Lin Qiao, Divyakant Agrawal, and Amr El Abbadi, CIKM 2002: 609-626. abstract pdf

 

 

Copyright(c) 2002 DSL. All rights reserved.
dsl@cs.ucsb.edu