Knowledge Graph Query Processing and Benchmarking

Xifeng Yan, University of California at Santa Barbara
Project Summary

Knowledge Graph Query Processing and Benchmarking, funded by NSF IIS 1528175.

Today, a user who poses a question to Google or Bing still has to read through multiple web pages to find the answer. This paradigm is changing with the rise of mobile devices. Over the last decade, many systems have aimed to answer queries directly, e.g., using knowledge graphs collected from the Internet or built through crowdsourcing. A real sea change in information search is coming! A broad range of new applications is emerging in intelligent policing, personal assistance, individualized healthcare, legal services, scientific literature search, and, more recently, robotics. This project has the potential to make fundamental advances in querying heterogeneous knowledge graphs, which are ubiquitous. It will open up a set of new knowledge base applications in fast-growing areas such as social networks, intelligence analysis, and medical research, and it will significantly ease query formulation and improve search quality and speed in these applications.

Given the high data heterogeneity in knowledge graphs, writing structured queries that fully comply with the data specification is extremely hard for ordinary users, while keyword queries can be too ambiguous to reflect the user's search intent. The situation becomes even worse when the same entity or relation has several representations. A sophisticated query system should support different concept representations without forcing users to adopt a tightly controlled vocabulary. It should also provide simple mechanisms that let users quickly arrive at the right query, either explicitly or implicitly (e.g., via relevance feedback). This project will develop such a system and make it user-friendly and scalable.

The proposed research includes a plan to build a flexible query benchmark that can cope with heterogeneous, large-scale knowledge graphs as well as user-specified configurations and performance metrics. Benchmarks are indispensable for the rapid development of database research, and there are many successful examples of robust and meaningful benchmarks greatly expediting the development of a research area. The query benchmark proposed in this project is timely. It will (1) provide a standardized way to evaluate different knowledge graph query algorithms fairly and comprehensively, (2) improve the understanding of existing query engines, and (3) advance the area by bringing researchers onto a common playing field for building better, faster, and more intelligent methods.

Graduate Students:

Undergraduate Students:

Publications:

  1. SLQ: A User-friendly Graph Querying System,
    by S. Yang, Y. Xie, Y. Wu, T. Wu, H. Sun, J. Wu, X. Yan,
    SIGMOD'14 (Proc. 2014 Int. Conf. on Management of Data) (demo paper), 2014.
  2. Schemaless and Structureless Graph Querying,
    by S. Yang, Y. Wu, H. Sun, X. Yan,
    VLDB'14 (Proc. of the 40th Int. Conf. on Very Large Data Bases), 2014.

Dissertations (TBD)