
Graduate Student: Chen, Chu-Kwang (陳垂康)
Thesis Title: Sentence Attention-based Continuous Dialog State Tracking and Reinforcement Learning for Interview Coaching (應用具語句關注之連續對話狀態追蹤與強化學習之面試訓練系統)
Advisor: Wu, Chung-Hsien (吳宗憲)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2017
Graduating Academic Year: 105
Language: English
Pages: 55
Chinese Keywords: 面試訓練、對話系統、對話管理、主題模型、注意力模型、長短期記憶遞歸神經網路、自編碼器、強化學習
English Keywords: interview coaching, dialog system, dialog management, topic model, attention model, LSTM, autoencoder, reinforcement learning
    Interviews are one of the most commonly used admission channels. Everyone recognizes their importance, yet very few people actually seek out interview experts for practice. The most direct way to practice is to invite an expert to act as the interviewer, but the labor cost and scheduling are problematic, and it is difficult for students to practice repeatedly. This thesis therefore aims to develop an interview-coaching dialog system that can flexibly give students more opportunities for repeated interview practice.
    The research topic of this thesis is dialog management. A dialog system relies on dialog management to decide the dialog flow, which comprises dialog state tracking and dialog policy. Traditional dialog state tracking requires manually defined semantic slots; this thesis instead uses topic probability distributions as the semantic representation of sentences for state tracking. Since a dialog turn composed of several sentences may contain irrelevant ones, this thesis combines a Convolutional Neural Tensor Network (CNTN) with topic profiles to apply sentence attention over multi-sentence turns, giving each sentence an importance weight. An LSTM-based autoencoder then models the transition and accumulation relations between sentences and dialog turns to obtain the dialog state. Finally, this thesis designs a reward function for the interview flow and uses Double Q-learning, a reinforcement learning method, to model the mapping from observed states to system actions.
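    The Double Q-learning scheme mentioned above keeps two value tables that take turns selecting and evaluating actions. A minimal tabular sketch follows; the tiny state/action space, reward rule, and hyperparameters are made-up stand-ins for illustration, not the thesis's actual interview reward function:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny MDP: 4 dialog states, 3 system actions
# (e.g. ask-normal-question, ask-follow-up, end-interview).
n_states, n_actions = 4, 3
q_a = np.zeros((n_states, n_actions))
q_b = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9  # learning rate, discount factor

def double_q_update(s, a, r, s_next):
    # One Double Q-learning step: one table picks the greedy next action,
    # the other table evaluates it, which reduces over-estimation bias.
    if rng.random() < 0.5:
        best = int(np.argmax(q_a[s_next]))
        q_a[s, a] += alpha * (r + gamma * q_b[s_next, best] - q_a[s, a])
    else:
        best = int(np.argmax(q_b[s_next]))
        q_b[s, a] += alpha * (r + gamma * q_a[s_next, best] - q_b[s, a])

# Train on simulated transitions with a toy reward rule:
# action s % 3 is the "right" move in state s.
for _ in range(5000):
    s, a = int(rng.integers(n_states)), int(rng.integers(n_actions))
    r = 1.0 if a == s % n_actions else 0.0
    double_q_update(s, a, r, int(rng.integers(n_states)))

policy = (q_a + q_b).argmax(axis=1)  # greedy policy over both tables
print(policy)
```

    After enough simulated transitions, the greedy policy recovers the rewarded action in each state.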
    This thesis collected 260 interview dialogs and adopted five-fold cross validation for evaluation. The results show that, compared with the traditional method, the numbers of normal questions, follow-up questions, and total questions produced by the proposed method are closer to the corpus averages, and the accumulated total reward is also higher than that of the traditional method.

    Admission interviews are one of the most frequently used methods of student selection. Even though people know the importance of such interviews, very few practice their interview skills effectively by seeking professional help. Many students thus lack interview experience and are likely to be nervous during an interview. Among the many ways to improve students' interview skills, hiring a professional interview coach is the most direct, but it is also rather expensive.
    The main purpose of this thesis is thus to develop a dialog manager for an interview coaching system. In a dialog system, both Dialog State Tracking (DST) and Dialog Policy are important tasks. Traditional approaches define semantic slots manually for dialog state representation and tracking. This thesis instead adopts the topic profiles of sentences as the representation of a dialog state. When an input turn consists of several sentences, the summary vector is likely to contain noisy information from irrelevant sentences. This thesis therefore applies a sentence attention mechanism that combines the Convolutional Neural Tensor Network (CNTN) and the topic profile for dialog state tracking. An LSTM-based autoencoder is used as the dialog state tracker to model the transition and accumulation of dialog states. Finally, by applying Reinforcement Learning (RL) with the designed reward functions, the agent learns from interactions with its environment how to make action decisions.
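    The attention step described above, weighting each sentence of a turn before summarizing, can be illustrated in a few lines of numpy. The sentence topic vectors and CNTN-style relevance scores below are fabricated stand-ins for illustration:

```python
import numpy as np

def sentence_attention(sent_vecs, relevance_scores):
    # Softmax the relevance scores into importance weights, then return
    # the weights and the weighted-sum representation of the turn.
    scores = np.asarray(relevance_scores, dtype=float)
    weights = np.exp(scores - scores.max())  # shift for numerical stability
    weights /= weights.sum()
    turn_vec = weights @ np.asarray(sent_vecs, dtype=float)
    return weights, turn_vec

# Three sentences in one dialog turn, each a 4-topic probability profile.
sent_vecs = [[0.70, 0.10, 0.10, 0.10],     # on-topic sentence
             [0.10, 0.70, 0.10, 0.10],     # on-topic sentence
             [0.25, 0.25, 0.25, 0.25]]     # filler / irrelevant sentence
scores = [2.0, 1.5, -1.0]  # e.g. relevance of each sentence to the question
weights, turn_vec = sentence_attention(sent_vecs, scores)
print(weights.round(3), turn_vec.round(3))
```

    The irrelevant sentence receives the smallest weight, so it contributes little to the turn-level summary vector.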
    This study collected 260 interview dialogs containing 3,016 dialog turns, and a five-fold cross-validation scheme was employed for evaluation. The results show that the proposed method performed better than the semantic slot-based baseline: the numbers of normal and follow-up actions taken by the dialog policy were closer to the statistics of the collected corpus, and the accumulated reward was higher.
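    The five-fold evaluation scheme can be sketched as follows, shown here for a 260-item corpus; the shuffling seed is an arbitrary choice:

```python
import numpy as np

def five_fold_indices(n_items, seed=0):
    # Shuffle the item indices and split them into 5 folds; each fold
    # serves once as the test set while the other four form the train set.
    idx = np.random.default_rng(seed).permutation(n_items)
    folds = np.array_split(idx, 5)
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, test

# 260 dialogs, as in the collected corpus: 208 train / 52 test per fold.
splits = list(five_fold_indices(260))
for k, (train, test) in enumerate(splits):
    print(k, len(train), len(test))
```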

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Background
        1.2 Motivation
        1.3 Literature Review
            1.3.1 Interview Coaching System
            1.3.2 Dialog State Tracking
            1.3.3 Attention Mechanisms
            1.3.4 Dialog Policy
        1.4 Problems and Proposed Methods
        1.5 Research Framework
    Chapter 2 MHMC Interview Database
        2.1 Data Collection
        2.2 Corpus Introduction
    Chapter 3 Proposed Methods
        3.1 Establishment of Topic Model
        3.2 Sentence Attention Mechanism
            3.2.1 Convolutional Neural Tensor Network (CNTN)
            3.2.2 Sentence Attention – CNTN
            3.2.3 Sentence Attention – Topic Profile
        3.3 Dialog State Tracking
            3.3.1 Long Short-Term Memory
            3.3.2 LSTM-based Autoencoder
            3.3.3 Establishment and Training of the DST
        3.4 Reinforcement Learning in Dialog Policy
            3.4.1 Agent of Reinforcement Learning
            3.4.2 Reward Function
            3.4.3 Policy Model Training
    Chapter 4 Experimental Results and Discussion
        4.1 Relevance Classification Performance
        4.2 Evaluation of the LSTM-based Autoencoder
        4.3 Evaluation of System Performance
            4.3.1 Comparison of Topic Profile and Semantic Slot
            4.3.2 Evaluation of Sentence Representation with Attention Mechanism
            4.3.3 Discussion
    Chapter 5 Conclusion and Future Work
    References


    Full text available on campus: 2019-08-31; off campus: 2019-08-31.