| Field | Value |
|---|---|
| Student | 潘昌義 Pan, Chang-Yi |
| Thesis Title | Context-Aware Knowledge Encoding for Neural Conversation Model (情境感知知識編碼之類神經對話模型) |
| Advisor | 高宏宇 Kao, Hung-Yu |
| Degree | Master |
| Department | Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication | 2020 |
| Academic Year of Graduation | 108 (ROC calendar) |
| Language | English |
| Pages | 48 |
| Keywords (Chinese) | 對話模型、自然語言生成、注意力機制 |
| Keywords (English) | conversation model, natural language generation, attention mechanism |
Intelligent assistant systems that can hold multi-turn conversations with human users have become increasingly popular. Powered by natural language processing, this technology is widely realized through conversation models in applications such as chatbots and customer service systems. Recent research divides conversation models into two main categories: retrieval-based and generation-based. A retrieval-based model uses text matching to select, from an existing repository of question-response pairs, the response whose question best matches the current input; a generation-based model instead produces an entirely new response through sequence-to-sequence natural language generation.

Although generation-based models can provide responses more closely tailored to the input sentence, those responses often lack informativeness and contextual coherence. To address these issues, we propose the Context-Aware Knowledge Encoder (CAKE) for a generation-based neural conversation model augmented with retrieved information. CAKE consists of a two-level attention-oriented encoder: a sentence-level context-aware encoder that uses a convolutional neural network to judge the importance of each utterance in the dialogue history and build the full context, and a word-level knowledge encoder that uses this full context to extract important but uncommon keywords from external information. Experimental results show that both the sentence-level context-aware encoder and the word-level knowledge encoder improve the performance of the conversation model. We hope the findings of this study offer new insights into generation-based conversation models and reveal their potential from different perspectives.
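To make the two-level design concrete, the sketch below shows the general shape of such an encoder in NumPy: a sentence-level attention step that weights utterances in the dialogue history to form a context vector, followed by a word-level attention step that uses that context to weight retrieved keyword embeddings. All dimensions, variable names, and the use of plain dot-product attention are illustrative assumptions, not the thesis's actual architecture (which, for instance, uses a CNN to score utterance importance).

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, keys, values):
    # scaled dot-product attention: weight each value row by
    # the similarity between the query and its key
    scores = keys @ query / np.sqrt(query.shape[-1])
    weights = softmax(scores)
    return weights @ values, weights

d = 8  # hypothetical embedding size
rng = np.random.default_rng(0)

# Sentence level: one embedding per utterance in the dialogue history
history = rng.normal(size=(5, d))   # 5 past utterances
query = rng.normal(size=(d,))       # encoding of the current input

# Context-aware step: attend over the history to build a context vector
context, ctx_weights = attention(query, history, history)

# Word level: attend over retrieved keyword embeddings with the context
keywords = rng.normal(size=(12, d))  # 12 retrieved keyword vectors
knowledge, kw_weights = attention(context, keywords, keywords)

# Concatenated representation that a decoder could condition on
fused = np.concatenate([context, knowledge])
print(fused.shape)  # (16,)
```

The key design point illustrated here is the two granularities: the first attention operates over whole utterances, the second over individual keywords, and the output of the first serves as the query for the second.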