Knowledge Grounded and Controllable Conversational AI

PI
Xifeng Yan, University of California at Santa Barbara

Project Goal: Innovate Conversational AI with Knowledge Bases; Build Multimodal Artificially Intelligent Assistants

Selected to participate with PARC Digital Workforce in Perceptually-enabled Task Guidance (PTG): The Perceptually-enabled Task Guidance (PTG) program aims to develop artificial intelligence (AI) technologies that help users perform complex physical tasks while making them more versatile by expanding their skillset and more proficient by reducing their errors. PTG seeks to develop methods, techniques, and technology for artificially intelligent assistants that provide just-in-time visual and audio feedback to help with task execution.

Selected to participate in the 2022-2023 Amazon Alexa TaskBot, SimBot, and Socialbot Challenges. All three of our teams entered the final event with the highest ratings from Amazon Alexa users.

Selected to participate in the 2023 Amazon Alexa Socialbot Grand Challenge 5 (our GauchoChat entered the final event and consistently received the highest ratings from Amazon Alexa users): The challenge focuses on creating conversational socialbots that can speak coherently and engagingly with humans for 20 minutes on a range of current events and topics.

Finalist in the 2022-2023 Amazon Alexa SimBot Challenge (our GauchoAI entered the final event again and consistently received the highest ratings from Amazon Alexa users): The challenge focuses on advancing the development of next-generation virtual assistants that will assist humans in completing real-world tasks by continuously learning and gaining the ability to perform commonsense reasoning. GauchoAI placed 2nd in the Alexa Prize SimBot Challenge! (news)

Finalist in the 2021-2022 Amazon Alexa TaskBot Challenge [Technical Report] (our GauchoBot entered the final event and consistently received the highest ratings from Amazon Alexa users): The challenge focuses on developing agents that assist customers in completing tasks requiring multiple steps and decisions. It is the first conversational AI challenge to incorporate multimodal (voice and vision) customer experiences.

Publications

Limitations of Language Models in Arithmetic and Symbolic Induction
by J. Qian, H. Wang, Z. Li, S. Li, X. Yan, 2022 [arxiv]
ACL'23 (Proc. of the Annual Meeting of the Association for Computational Linguistics)

Graph Reasoning for Question Answering with Triplet Retrieval
by S. Li, Y. Gao, H. Jiang, Q. Yin, Z. Li, X. Yan, C. Zhang and B. Yin
ACL'23 (Proc. of the Annual Meeting of the Association for Computational Linguistics) [pdf]

Forecasting Earnings Surprises from Conference Call Transcripts
by R. Koval, N. Andrews and X. Yan
ACL'23 (Proc. of the Annual Meeting of the Association for Computational Linguistics) [pdf]

Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling
by X. Zhang, S. Li, Z. Chen, X. Yan, L. Petzold, 2022 [arxiv]
ICML'23 (The Fortieth International Conference on Machine Learning)

Guiding Large Language Models via Directional Stimulus Prompting
by Z. Li, B. Peng, P. He, M. Galley, J. Gao, X. Yan, 2023 [arxiv]

Explanations from Large Language Models Make Small Reasoners Better
by S. Li, J. Chen, Y. Shen, Z. Chen, X. Zhang, Z. Li, H. Wang, J. Qian, B. Peng, Y. Mao, W. Chen, X. Yan, 2023 [arxiv]

Visually-augmented language modeling
by W. Wang, L. Dong, H. Cheng, H. Song, X. Liu, X. Yan, J. Gao, F. Wei
ICLR'23 (Proceedings of Int. Conf. on Learning Representations) [pdf]

Language Model Detoxification in Dialogue with Contextualized Stance Control
by J. Qian and X. Yan
EMNLP'22 (Proceedings of Findings of EMNLP 2022) [pdf]

Controllable Dialogue Simulation with In-context Learning
by Z. Li, W. Chen, S. Li, H. Wang, J. Qian and X. Yan
EMNLP'22 (Proceedings of Findings of EMNLP 2022) [pdf]

Visualization Question Answering Using Introspective Program Synthesis (PLDI'22 Distinguished Paper Award)
by Y. Chen, X. Yan, Y. Feng
PLDI'22 (The 43rd ACM SIGPLAN Conference on Programming Language Design and Implementation) [pdf] [code]

Making something out of nothing: Building robust task-oriented dialogue systems from scratch
by Z. Li, H. Wang, A. Albalak, Y. Yang, J. Qian, S. Li, X. Yan
Amazon Alexa Prize TaskBot Challenge Proceedings 2022 [pdf]

Inductive Relation Prediction by BERT
by H. Zha, Z. Chen, X. Yan
AAAI'22 (Thirty-Sixth AAAI Conference on Artificial Intelligence) [arxiv]

Composite Re-Ranking for Efficient Document Search with BERT
by Y. Yang, Y. Qiao, J. Shao, X. Yan, T. Yang
WSDM'22 (ACM International Conference on Web Search and Data Mining) [arxiv]

Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding
by S. Li, S. Yavuz, W. Chen and X. Yan
EMNLP'21 (Proceedings of Findings of EMNLP 2021) [pdf]

CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers
by S. Li*, S. Yavuz*, K. Hashimoto, J. Li, T. Niu, N. Rajani, X. Yan, Y. Zhou and C. Xiong (*Equal Contribution)
ICLR'21 (International Conference on Learning Representations), 2021. [pdf] (No. 1 on the MultiWOZ leaderboard as of Jan 2021)

Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases
by Y. Gu, S. Kase, M. Vanni, B. Sadler, P. Liang, X. Yan, Y. Su
WWW'21 (The World Wide Web Conf.) 2021. [arxiv] [Dataset: GrailQA]

KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation,
by W. Chen, Y. Su, X. Yan, W. Wang,
EMNLP'20 (Proc. of the 2020 Conference on Empirical Methods in Natural Language Processing) [pdf] [data/code]

HierCon: Hierarchical Organization of Technical Documents based on Concepts,
by K. Li, Shiyang Li, Semih Yavuz, Hanwen Zha, Yu Su, and Xifeng Yan,
ICDM'19 (Proc. 2019 IEEE Int. Conf. on Data Mining), Dec 2019. [pdf] (Best of ICDM 2019 selection)

Mining Algorithm Roadmap in Scientific Publications,
by H. Zha, W. Chen, K. Li and X. Yan,
KDD'19 (Proc. of the 25th Int. Conf. on Knowledge Discovery and Data Mining) [pdf]

Global Textual Relation Embedding for Relational Understanding,
by Z. Chen, H. Zha, H. Liu, W. Chen, X. Yan and Y. Su,
ACL'19 (Proc. of the Annual Meeting of the Association for Computational Linguistics) (Short Paper) [pdf]

Variational Knowledge Graph Reasoning,
by W. Chen, W. Xiong, X. Yan and W. Wang,
NAACL-HLT'18 (Proc. of the 16th North American Chapter of ACL: Human Language Technologies, 2018) [pdf]

Global Relation Embedding for Relation Extraction
by Yu Su*, Honglei Liu*, Semih Yavuz, Izzeddin Gur, Huan Sun, Xifeng Yan (*Equal Contribution)
NAACL-HLT'18 (Proc. of the 16th North American Chapter of ACL: Human Language Technologies, 2018) [pdf] [code] (arXiv: https://arxiv.org/abs/1704.05958, April 2017)

What It Takes to Achieve 100% Condition Accuracy on WikiSQL,
by S. Yavuz, I. Gur, Y. Su, X. Yan,
EMNLP'18 (Proc. of the 2018 Conference on Empirical Methods in Natural Language Processing) [pdf]

XL-NBT: A Cross-lingual Neural Belief Tracking Framework,
by W. Chen, J. Chen, Y. Su, X. Wang, D. Yu, X. Yan and W. Wang,
EMNLP'18 (Proc. of the 2018 Conf. on Empirical Methods in Natural Language Processing) [pdf]

DialSQL: Dialogue Based Structured Query Generation,
by I. Gur, S. Yavuz, Y. Su, X. Yan,
ACL'18 (Proc. of the Annual Meeting of the Association for Computational Linguistics, 2018) [pdf]

Scalable Construction and Querying of Massive Knowledge Bases (Tutorial),
by X. Ren, Y. Su, P. Szekely, X. Yan.
WWW'18 (Proc. of the International Conference on World Wide Web), 2018 [website][slides1][slides2][slides3]

Construction and Querying of Large-scale Knowledge Bases (Tutorial),
by X. Ren, Y. Su, X. Yan.
CIKM'17 (Proc. of the ACM International Conference on Information and Knowledge Management), 2017 [website][slides]

Cross-domain Semantic Parsing via Paraphrasing,
by Y. Su, X. Yan.
EMNLP'17 (Proc. of the 2017 Conf. on Empirical Methods in Natural Language Processing), 2017 [pdf]

Recovering Question Answering Errors via Query Revision,
by S. Yavuz, I. Gur, Y. Su, X. Yan.
EMNLP'17 (Proc. of the 2017 Conference on Empirical Methods in Natural Language Processing), 2017 [pdf]

Entity Disambiguation with Linkless Knowledge Bases,
by Y. Li, S. Tan, H. Sun, J. Han, D. Roth and X. Yan,
WWW'16 (Proc. of the 25th Int. World Wide Web Conference), 2016. [pdf]

Distributed Representations of Expertise,
by F. Han, S. Tan, H. Sun, M. Srivatsa, D. Cai, X. Yan,
SDM'16 (SIAM Int. Conf. on Data Mining), 2016. [pdf]

On Generating Characteristic-rich Question Sets for QA Evaluation,
by Y. Su, H. Sun, B. Sadler, M. Srivatsa, I. Gur, Z. Yan, and X. Yan,
EMNLP'16 (Proc. of the 2016 Conf. on Empirical Methods in Natural Language Processing) 2016 [pdf]

Improving Semantic Parsing via Answer Type Inference,
by S. Yavuz, I. Gur, Y. Su, M. Srivatsa, X. Yan,
EMNLP'16 (Proc. of the 2016 Conf. on Empirical Methods in Natural Language Processing), 2016 [pdf]

Semantic SPARQL Similarity Search Over RDF Knowledge Graphs,
by W. Zheng, L. Zou, W. Peng, X. Yan, S. Song, D. Zhao,
VLDB'16 (Proc. of the 42nd International Conference on Very Large Data Bases), 2016. [pdf]

Exploiting Relevance Feedback in Knowledge Graph Search,
by Y. Su, S. Yang, H. Sun, M. Srivatsa, S. Kase, M. Vanni and X. Yan,
KDD'15 (Proc. of Int. Conf. on Knowledge Discovery and Data Mining), 2015 [pdf]

SLQ: A User-friendly Graph Querying System,
by S. Yang, Y. Xie, Y. Wu, T. Wu, H. Sun, J. Wu, X. Yan,
SIGMOD'14 (Proc. 2014 Int. Conf. on Management of Data) (demo paper), 2014. [pdf]

Schemaless and Structureless Graph Querying,
by S. Yang, Y. Wu, H. Sun, X. Yan,
VLDB'14 (Proc. of the 40th Int. Conf. on Very Large Databases), 2014. [pdf]

Mining Evidences for Named Entity Disambiguation,
by Y. Li, C. Wang, F. Han, J. Han, D. Roth, and X. Yan,
KDD'13 (Proc. of the 19th Int. Conf. on Knowledge Discovery and Data Mining), Aug 2013. [pdf]

EntityRank: Searching Entities Directly and Holistically,
by T. Cheng, X. Yan and K. Chang.
VLDB'07 (Proc. of 2007 Int. Conf. on Very Large Data Bases), Sep. 2007. [pdf]

Dissertations

* Jing Qian, Ph.D., "Text Detoxification in Natural Language Processing," 2022 [pdf]

* Zhiyu Chen, Ph.D., "Knowledge-Grounded Natural Language Processing," 2022 [pdf]

* Hanwen Zha, Ph.D., "Towards Effort-Saving Knowledge Mining and Reasoning over the Web," 2021 [pdf]

* Wenhu Chen, Ph.D., "Knowledge-Grounded Natural Language Processing," 2021 [pdf]

* Semih Yavuz, Ph.D., "DeepAssist: Deep Knowledge Grounding for Factual and Conversational Natural Language Interfaces," 2019 [pdf]

* Izzeddin Gur, Ph.D., "Learning Natural Language Interfaces using Deep Neural Networks," 2019 [pdf]

* Yu Su, Ph.D., "Towards Democratizing Data Science with Natural Language Interfaces," 2018 [pdf]

* Huan Sun, Ph.D., "Mining Disparate Sources for Question Answering," 2016 [pdf]

* Yang Li, Ph.D., "Connecting Text with Knowledge," 2015 [pdf]