Knowledge Graph Query Processing and Benchmarking


PI
Xifeng Yan,  University of California at Santa Barbara
 

Knowledge Graph Query Processing and Benchmarking, funded by NSF IIS 1528175.

Project Summary

Today, a user with a question who turns to Google or Bing still has to read through multiple web pages to find an answer. This paradigm is now changing with the rise of mobile devices. Over the last decade, many systems have emerged that aim to answer queries directly, e.g., using knowledge graphs collected from the Internet or through crowdsourcing. A real sea change in information search is coming! A broad range of new applications is emerging in intelligent policing, personal assistance, individualized healthcare, legal services, scientific literature search, and, more recently, robotics. This project has the potential to make fundamental advances in querying heterogeneous knowledge graphs, which are ubiquitous. It will open up a set of new knowledge base applications in fast-growing areas such as social networks, intelligence analysis, and medical research, and it will significantly ease query formulation and improve search quality and speed in these applications.

Given the high data heterogeneity in knowledge graphs, writing structured queries that fully comply with the data specification is extremely hard for ordinary users, while keyword queries can be too ambiguous to reflect the user's search intent. The situation becomes even worse when the same entity or relation has multiple representations. A sophisticated query system should support different concept representations without forcing users to adopt a tightly controlled vocabulary. It should provide simple mechanisms that let users quickly arrive at the right query, either explicitly or implicitly (e.g., via relevance feedback). This project aims to develop such a system and make it user-friendly and scalable.

The proposed research includes a plan to build a flexible query benchmark that can cope with heterogeneous, large-scale knowledge graphs as well as user-specified configurations and performance metrics. Benchmarks are indispensable for the rapid development of database research; there are many successful examples of robust and meaningful benchmarks greatly expediting the development of a research area. The query benchmark proposed in this project is timely. It will (1) provide a standardized way to fairly and comprehensively evaluate different knowledge graph query algorithms, (2) improve the understanding of existing query engines, and (3) advance the area by bringing researchers onto a common playground for building better, faster, and more intelligent methods.

Publications

  1. What It Takes to Achieve 100% Condition Accuracy on WikiSQL,
    by S. Yavuz, I. Gur, Y. Su, X. Yan,
    EMNLP'18 (Proc. of the 2018 Conference on Empirical Methods in Natural Language Processing) [pdf]
  2. DialSQL: Dialogue Based Structured Query Generation,
    by I. Gur, S. Yavuz, Y. Su, X. Yan,
    ACL'18 (Proc. of the Annual Meeting of the Association for Computational Linguistics, 2018) [pdf]
  3. Variational Knowledge Graph Reasoning, 
    by W. Chen, W. Xiong, X. Yan and W. Wang, 
    NAACL-HLT'18 (Proc. of the 16th North American Chapter of ACL: Human Language Technologies, 2018) [pdf]
  4. Global Relation Embedding for Relation Extraction,
    by Yu Su*, Honglei Liu*, Semih Yavuz, Izzeddin Gur, Huan Sun, Xifeng Yan (*: Equal Contribution),
    NAACL-HLT'18 (Proc. of the 16th North American Chapter of ACL: Human Language Technologies, 2018) [pdf] [code]
    https://arxiv.org/abs/1704.05958, April 2017 [pdf]
  5. Scalable Construction and Querying of Massive Knowledge Bases (Tutorial),
    by X. Ren, Y. Su, P. Szekely, X. Yan.
    WWW'18 (Proc. of the International Conference on World Wide Web), 2018 [website][slides1][slides2][slides3]
  6. Construction and Querying of Large-scale Knowledge Bases (Tutorial),
    by X. Ren, Y. Su, X. Yan.
    CIKM'17 (Proc. of the ACM International Conference on Information and Knowledge Management), 2017 [website][slides]
  7. Cross-domain Semantic Parsing via Paraphrasing, 
    by Y. Su, X. Yan. 
    EMNLP'17 (Proc. of the 2017 Conf. on Empirical Methods in Natural Language Processing), 2017 [pdf]
  8. Recovering Question Answering Errors via Query Revision,
    by S. Yavuz, I. Gur, Y. Su, X. Yan. 
    EMNLP'17 (Proc. of the 2017 Conference on Empirical Methods in Natural Language Processing), 2017 [pdf]
  9. On Generating Characteristic-rich Question Sets for QA Evaluation,
    by Y. Su, H. Sun, B. Sadler, M. Srivatsa, I. Gur, Z. Yan, and X. Yan,
    EMNLP'16 (Proc. of the 2016 Conf. on Empirical Methods in Natural Language Processing), 2016 [pdf]
  10. Improving Semantic Parsing via Answer Type Inference,
    by S. Yavuz, I. Gur, Y. Su, M. Srivatsa, X. Yan,
    EMNLP'16 (Proc. of the 2016 Conf. on Empirical Methods in Natural Language Processing), 2016 [pdf]
  11. Semantic SPARQL Similarity Search Over RDF Knowledge Graphs,
    by W. Zheng, L. Zou, W. Peng, X. Yan, S. Song, D. Zhao,
    VLDB'16 (Proc. of the 42nd International Conference on Very Large Data Bases), 2016. [pdf]
  12. Exploiting Relevance Feedback in Knowledge Graph Search,
    by Y. Su, S. Yang, H. Sun, M. Srivatsa, S. Kase, M. Vanni and X. Yan,
    KDD'15 (Proc. of Int. Conf. on Knowledge Discovery and Data Mining), 2015 [pdf]
  13. SLQ: A User-friendly Graph Querying System,
    by S. Yang, Y. Xie, Y. Wu, T. Wu, H. Sun, J. Wu, X. Yan,
    SIGMOD'14 (Proc. 2014 Int. Conf. on Management of Data) (demo paper), 2014. [pdf] [demo]
  14. Schemaless and Structureless Graph Querying,
    by S. Yang, Y. Wu, H. Sun, X. Yan,
    VLDB'14 (Proc. of the 40th Int. Conf. on Very Large Data Bases), 2014. [pdf]

Dissertations (TBD)