
Graduate Student: Chen, Tsai-Yi (陳采奕)
Thesis Title: Informative and Long-Term Response Generation Using Multiple Suggestions and User Persona Retrieval in a Dialogue System (對話系統中使用多建議及用戶角色檢索於具資訊且長期回應之生成)
Advisor: Wu, Chung-Hsien (吳宗憲)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Graduate Program of Artificial Intelligence
Year of Publication: 2022
Academic Year of Graduation: 110 (ROC calendar)
Language: English
Number of Pages: 112
Keywords: Long-term dialogue system, Transformer, User persona

    In recent years, human-machine dialogue systems have become a very popular research topic. If a dialogue system can understand its users better as the chat time increases, and can generate responses that meet the users' expectations, it can become a friend with whom they share their daily lives. This thesis therefore aims to build a long-term dialogue system that actively remembers the user's preferences and personality during the dialogue and produces informative responses that meet the user's expectations.
    The contribution of this thesis is a novel dialogue system architecture, the Multi Suggestions Transformer, which generates Empathy Suggestion, System Persona Suggestion, and Knowledge Suggestion sentence features for the generation model to consult when producing informative responses. In addition, this study uses a Persona Detection Model and a Persona Extraction Model to extract the user persona from user utterances, and retrieves the user persona most relevant to the current user utterance through keywords and keyword expansion. The current user utterance and the retrieved user persona are then concatenated as the model input, which ultimately helps the model become a long-term dialogue system.
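    The keyword-expansion retrieval step can be sketched as follows. Note that the synonym table and the overlap scoring below are illustrative stand-ins: the abstract does not specify the thesis's actual keyword extractor, expansion model, or similarity measure.

```python
# Minimal sketch of user-persona retrieval via keyword expansion.
# SYNONYMS is a hypothetical expansion table standing in for the
# thesis's keyword-expansion model; scoring is plain lexical overlap.

SYNONYMS = {
    "puppy": {"dog", "pet"},
    "instructor": {"teacher", "tutor"},
    "trek": {"hike", "hiking", "walk"},
}

def expand_keywords(keywords):
    """Expand each extracted keyword with its known synonyms."""
    expanded = set(keywords)
    for kw in keywords:
        expanded |= SYNONYMS.get(kw, set())
    return expanded

def retrieve_persona(keywords, persona_sentences):
    """Return the stored persona sentence that overlaps most with the
    expanded keyword set (ties broken by list order)."""
    expanded = expand_keywords(keywords)
    return max(persona_sentences,
               key=lambda s: len(expanded & set(s.lower().split())))

personas = ["i have a dog named rex",
            "i work as a teacher",
            "i love hiking on weekends"]
print(retrieve_persona(["puppy"], personas))  # -> i have a dog named rex
```

    The retrieved persona sentence would then be concatenated with the current utterance to form the model input; the exact separator token and input format used in the thesis are not specified in this abstract.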
    This thesis uses Blended Skill Talk and Multi-Session Chat as the training corpora for two fine-tuning stages. The experimental results show that the Multi Suggestions Transformer outperforms the Blender Bot model in BLEU, BERT score, Distinct-n, and perplexity on both the Blended Skill Talk and Multi-Session Chat test sets. Two new objective metrics introduced in this study, Eval_1 and Eval_2, show that the Multi Suggestions Transformer can effectively exploit user persona information to act as a long-term dialogue system. In the human subjective evaluation, the Multi Suggestions Transformer achieves 68%, 56%, 52%, and 64% on the four indicators, all better than the Blender Bot model. These results indicate that the proposed dialogue system gives users a better experience.
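    Of the objective metrics above, Distinct-n is simple enough to sketch directly: it is the ratio of unique n-grams to total n-grams across the generated responses. This is the standard corpus-level formulation; the thesis's exact variant may differ.

```python
# Distinct-n diversity metric: unique n-grams / total n-grams
# over all generated responses (higher means less repetitive output).
def distinct_n(responses, n):
    ngrams = []
    for resp in responses:
        toks = resp.split()
        ngrams.extend(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

replies = ["i like dogs", "i like cats", "i like dogs"]
print(distinct_n(replies, 1))  # 4 unique unigrams / 9 total, ~0.444
print(distinct_n(replies, 2))  # 3 unique bigrams / 6 total = 0.5
```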
    Therefore, the dialogue system architecture proposed in this thesis can effectively generate informative responses, and it can exploit user persona information to help the model remember the past dialogue history, making it a long-term dialogue system.

    摘要 (Chinese Abstract) I
    Abstract III
    Contents V
    List of Tables VII
    List of Figures IX
    Chapter 1 Introduction 1
      1.1 Background 1
      1.2 Motivation 3
      1.3 Literature Review 7
        1.3.1 Chit-Chat Dialogue System 7
        1.3.2 Text Categorization 10
        1.3.3 Sentence Embedding Model 11
        1.3.4 Natural Language Generation 12
      1.4 Problems 16
      1.5 Proposed Methods 18
    Chapter 2 Dataset 20
      2.1 Blended Skill Talk 20
      2.2 Multi-Session Chat 25
    Chapter 3 System Framework 32
      3.1 Model Basic Architecture 34
        3.1.1 Transformer 34
        3.1.2 BERT 40
        3.1.3 Sentence-BERT (SBERT) 46
      3.2 Phase 1 - Multi Suggestions Transformer Training 49
        3.2.1 Multi Suggestions Transformer 50
        3.2.2 Multi Suggestions Transformer Training 52
        3.2.3 Multi Suggestions Transformer Loss Function 56
      3.3 Phase 2 - Combining User Persona for Response Generation Training 59
        3.3.1 Persona Detection Model and Persona Extraction Model 60
        3.3.2 Keyword and Keyword Expansion Retrieval 63
        3.3.3 Combining User Persona for Response Generation Training 67
        3.3.4 Combining User Persona for Response Generation Training Loss Function 68
      3.4 Testing 69
    Chapter 4 Experimental Environment, Results and Discussion 71
      4.1 Evaluation Indicators 71
        4.1.1 BLEU Score 72
        4.1.2 BERT Score 74
        4.1.3 Distinct-n 76
        4.1.4 Perplexity (PPL) 76
        4.1.5 Eval_1 and Eval_2 77
        4.1.6 Human Subjective Evaluation 79
      4.2 Experimental Results and Discussion of the First Stage 82
      4.3 Experimental Results and Discussion of the Second Stage 91
        4.3.1 Evaluation of User Persona Detection Model 91
        4.3.2 Evaluation of User Persona Extraction Model 94
        4.3.3 Keyword Extraction and Keyword Expansion Retrieval Evaluation 97
        4.3.4 Evaluation of Combining User Persona for Response Generation Training 99
    Chapter 5 Conclusion and Future Work 107
    References 109


    Full text access: on campus from 2025-03-06; off campus from 2025-03-06.