Information System: Deciphering Complex Networks, funded by
NSF Career IIS-0954125.
Graphs and networks are ubiquitous, encoding
complex relationships ranging from chemical bonds to social interactions. Hidden
in these networks are the answers to many important questions in biology,
business, and sociology. To analyze complex networks, however, users must
master sophisticated computing and programming skills, which is a pain
point for many scientists and engineers.
This project aims to advance the
state of the art by developing a general graph information system that
addresses the needs of searching and mining complex networks. Real-life
networks are complex: beyond their topological structure, they carry
heterogeneous content and attributes associated with nodes and
edges. This mixture of structure and content raises two challenges that require
new solutions for smarter and faster graph analysis. First, new types of
graph search and mining operations, such as graph aggregation, graph
association, and graph pattern mining, are emerging. Second, as graphs become
complex and large, most existing graph mining algorithms do not scale well.
This project addresses these challenges and performs a comprehensive study of a
general graph information system. The proposed system includes three major
components: complex graph search, graph pattern mining, and graph indexing. It
covers emerging structure queries in social, biological, and information
networks, new graph mining operators such as graph summarization and
association, and innovative indexing methodologies, e.g., differential graph
This research is tightly integrated with education through student
mentoring and curriculum development. Publications, software, and course
materials resulting from this project are disseminated on this website.
Nan Li (Data Scientist, oDesk, Apple,
Arijit Khan (PostDoc, ETH)
Shengqi Yang (Research Scientist, Facebook)
Bruce Liu (Pasadena Community College/UCI)
- Query-Based Outlier Detection in Heterogeneous Information Networks,
by H. Zhuang, J. Zhang, G. Brova, J. Tang, H. Cam, X. Yan, and J. Han,
EDBT'15 (18th International Conference on Extending Database Technology),
- Mining Query-Based Subnetwork Outliers in Heterogeneous Information Networks,
by H. Zhuang, J. Zhang, G. Brova, J. Tang, H. Cam, X. Yan, and J. Han,
(Proc. 2014 Int. Conf. on Data Mining), Dec 2014. [pdf]
- Towards Scalable Critical Alert Mining,
by B. Zong, Y. Wu, J. Song, A. Singh, H. Cam, J. Han and X. Yan,
(Proc. of the 20th Int. Conf. on Knowledge Discovery and Data Mining), Aug 2014. [pdf]
- SLQ: A User-friendly Graph Querying System,
by S. Yang, Y. Xie, Y. Wu, T. Wu, H. Sun, J. Wu, X. Yan,
(Proc. 2014 Int. Conf. on Management of Data) (demo paper), 2014. [pdf]
- Schemaless and Structureless Graph Querying,
by S. Yang, Y. Wu, H. Sun,
(Proc. of the 40th Int. Conf. on Very Large Databases),
- A Probabilistic Approach to Uncovering Attributed Graph
by N. Li, H. Sun, K. Chipman, J. George, X. Yan,
(Proc. 2014 SIAM Int. Conf. on Data Mining), 2014. [pdf]
- Cloud Service Placement via Subgraph Matching,
by B. Zong, R. Raghavendra, M. Srivatsa, X. Yan, A. Singh, and K.-W. Lee,
(Proc. 2014 Int. Conf. on Data Engineering), 2014.
- Summarizing Answer Graphs Induced by Keyword Queries,
by Y. Wu, S. Yang, M. Srivatsa, A. Iyengar, X. Yan,
(Proc. of the 40th Int. Conf. on Very Large Databases), 2014. [pdf]
- Noise-Resistant Bicluster Recognition,
by H. Sun, G. Miao, X. Yan,
(Proc. 2013 IEEE Int. Conf. on Data Mining), Dec 2013. [pdf]
- Mining Evidences for Named Entity Disambiguation,
by Y. Li, C. Wang, F. Han, J. Han, D. Roth, and X. Yan,
(Proc. of the 19th Int. Conf. on Knowledge Discovery and Data Mining), Aug 2013. [pdf]
- Memory Efficient Minimum Substring Partitioning,
by Y. Li, P. Kamousi, F. Han, S. Yang, X. Yan, S. Suri,
(Proc. of the 39th Int. Conf. on Very Large Databases), Aug 2013. [pdf] [software release]
- NeMa: Fast Graph Search with Label Similarity,
by A. Khan, Y. Wu, C. Aggarwal, X. Yan,
(Proc. of the 39th Int. Conf. on Very Large Databases),
Aug 2013. [pdf] [software
- Ontology-based Subgraph Querying,
by Y. Wu, S. Yang, X. Yan,
ICDE'13 (Proc. 2013 Int. Conf. on Data Engineering), Apr 2013. [pdf]
- Neighborhood Based Fast Graph Search in Large Networks,
by A. Khan, N. Li, Z. Guan, X. Yan, S. Chakraborty, and S. Tao,
(Proc. 2011 Int. Conf. on Management of Data), June
- Content-Aware Resolution Sequence Mining for Ticket Routing,
by P. Sun, S. Tao, X. Yan, N. Anerousis, Y. Chen,
BPM'10 (The 8th Int. Conf. on Business Process Management), Sep. 2010.
2013 Nan Li, Ph.D., "Uncovering Anomalous Patterns in Large Attributed Graphs."
2013 Arijit Khan, Ph.D., "Towards Querying and Mining of
2015 Shengqi Yang, Ph.D., "Fast Search in Large Scale Knowledge Graphs."