Tencent AI Lab has made a generous donation to the UCSB NLP Group to support its work on "Cross-Lingual and Open-World Task-Oriented Dialogue Schema Induction and Generation". This builds on the group's previous Tencent AI Lab Rhino-Bird Gift Fund project, "XL-NBT: A Cross-lingual Neural Belief Tracking Framework", led by UCSB NLP Group PhD student Wenhu Chen.
Task-oriented dialog systems are becoming pervasive, and many companies rely heavily on them to complement human agents for customer service in call centers. With globalization, the need for cross-lingual customer support is more urgent than ever. However, cross-lingual support poses great challenges: it requires a large amount of additional annotated data from native speakers. To bypass expensive human annotation and take a first step towards the ultimate goal of building a universal dialog system, we set out to build a cross-lingual state tracking framework. Specifically, we assume that there exists a source language with dialog belief tracking annotations, while the target languages have no annotated dialog data of any form. Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data. We then distill its knowledge and transfer it to the student state trackers in the target languages. We specifically discuss two types of common parallel resources, bilingual corpora and bilingual dictionaries, and design different transfer learning strategies accordingly. Experimentally, we successfully use an English state tracker as the teacher to transfer its knowledge to both Italian and German trackers and achieve promising results. This work was presented at the recent EMNLP 2018 conference in Brussels, Belgium.
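
The teacher-student transfer described above can be illustrated with a generic knowledge distillation loss. The sketch below is not the XL-NBT objective itself, which may differ in its details. It assumes that the teacher scores an English utterance and the student scores a parallel utterance in the target language, and that both produce logits over the same candidate slot values. The function names, the temperature parameter, and all logit values are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Cross-entropy between the teacher's soft labels and the student's prediction.

    Generic distillation loss, used here as an assumption; it is minimized
    when the student's distribution matches the teacher's.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))


# Hypothetical logits over three candidate slot values.
# Teacher: English tracker reading the English side of a parallel pair.
teacher_logits = [2.0, 0.5, -1.0]
# Students: target-language trackers reading the translated side (assumed).
close_student_logits = [1.9, 0.6, -1.1]   # nearly agrees with the teacher
far_student_logits = [-1.0, 0.5, 2.0]     # prefers a different slot value

loss_close = distillation_loss(teacher_logits, close_student_logits)
loss_far = distillation_loss(teacher_logits, far_student_logits)
print(f"close student loss: {loss_close:.4f}")
print(f"far student loss:   {loss_far:.4f}")
```

A student whose predictions agree with the teacher receives a lower loss than one that disagrees, so minimizing this quantity over parallel data pushes the target-language tracker towards the teacher's beliefs without any target-language dialog annotations.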