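| Graduate student: | 蕭育丞 Hsiao, Yu-Cheng |
|---|---|
| Thesis title: | 應用句型結構與部份樣本樹於對話行為偵測之研究 A Study on Detection of Dialogue Act Using Sentence Structure and Partial Pattern Trees |
| Advisor: | 吳宗憲 Wu, Chung-Hsien |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2009 |
| Academic year of graduation: | 97 |
| Language: | Chinese |
| Number of pages: | 62 |
| Keywords (Chinese): | 語意表格 (semantic slot), 對話行為偵測 (dialogue act detection), 句型結構 (sentence structure), 部份樣本樹 (partial pattern trees), 部份觀察馬可夫決策程序 (partially observable Markov decision process) |
| Keywords (English): | sentence structure, partial pattern trees, dialogue act detection, semantic slot, partially observable Markov decision process |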
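In a highly information-oriented society, dialogue systems that accept natural-language input are among the most promising forms of future human-computer interaction. In such systems, speech recognition errors often cause semantic misunderstandings that keep the dialogue from proceeding smoothly, so making the computer understand the speaker's dialogue act and respond correctly is a topic worth studying.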
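This thesis proposes a dialogue act detection method that uses partial pattern trees and sentence structure, with the goal of correcting semantic understanding errors caused by speech recognition errors. To build a robust latent dialogue act matrix, the training corpus is expanded through partial pattern trees into multiple candidate sentences, so that important keywords are not lost to recognition errors. Each candidate sentence is then passed through a parser to obtain its corresponding rules. To avoid confusion between dialogue acts, sentence clustering is performed within each dialogue act class to obtain the rule group best suited to that class. Finally, the latent dialogue act matrix is constructed to model the relationship between rules and dialogue acts, and the best dialogue act is determined from it. For dialogue management, a partially observable Markov decision process (POMDP) is adopted so that the system learns the most suitable dialogue strategy from the dialogue history.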
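To evaluate the method, we built a travel information inquiry dialogue system as a practical test platform. Testing each dialogue act separately gave an overall accuracy of 82.9%, compared with 49.6% for the semantic slot approach, an improvement of 33.3 percentage points. The experiments show that the proposed method yields a clear performance gain in practical use.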
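The relationship between parse rules and dialogue acts described above can be illustrated with a small sketch. It is a plain co-occurrence count, not the thesis's actual construction of the latent dialogue act matrix, which the abstract does not specify. The function names, rule strings, and dialogue act labels are hypothetical, and the clustering step is omitted.

```python
from collections import Counter, defaultdict


def build_rule_act_matrix(training_examples):
    """Estimate P(rule | dialogue act) from (rules, act) training pairs.

    training_examples: iterable of (list_of_rule_strings, dialogue_act).
    Returns a nested dict: {dialogue_act: {rule: relative frequency}}.
    """
    counts = defaultdict(Counter)
    for rules, act in training_examples:
        counts[act].update(rules)
    matrix = {}
    for act, rule_counts in counts.items():
        total = sum(rule_counts.values())
        matrix[act] = {rule: n / total for rule, n in rule_counts.items()}
    return matrix


def detect_dialogue_act(matrix, candidate_rule_sets):
    """Return the act with the highest score over all candidate sentences.

    A candidate's score for an act is the sum of the matrix weights of the
    rules it contains; rules never seen with that act contribute zero.
    """
    best_act, best_score = None, float("-inf")
    for rules in candidate_rule_sets:
        for act, weights in matrix.items():
            score = sum(weights.get(rule, 0.0) for rule in rules)
            if score > best_score:
                best_act, best_score = act, score
    return best_act


# Hypothetical training data: parse rules paired with dialogue act labels.
training = [
    (["S -> VP", "VP -> VV NP"], "request_info"),
    (["S -> NP VP", "VP -> VV NP"], "request_info"),
    (["S -> INTJ"], "greeting"),
]
matrix = build_rule_act_matrix(training)

# Two candidate sentences generated for one (possibly misrecognized) utterance.
candidates = [["S -> VP", "VP -> VV NP"], ["S -> VP"]]
print(detect_dialogue_act(matrix, candidates))
```

Scoring every candidate sentence rather than a single recognition result mirrors the stated motivation: if one candidate has lost a keyword, another may still carry the rules that identify the dialogue act.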
In a highly information-intensive society, natural-language dialogue systems are among the most promising forms of human-machine interaction for the near future. In traditional dialogue management, misinterpreting the semantics of an utterance, especially when identifying the dialogue act or intention, often leaves the dialogue incomplete. Understanding the user's utterance is therefore a central research issue.
This thesis proposes a novel understanding approach that uses partial pattern trees and sentence structure to detect the optimal dialogue act, reducing the dialogue act detection errors caused by error-prone speech recognition. To construct a robust latent dialogue act matrix, partial pattern trees are employed to generate candidate sentences from the training data, so that important keywords are not lost to speech recognition errors. The Stanford parser then parses the candidate sentences to obtain sentence rules. In addition, sentence clustering is applied within each dialogue act to obtain the best sentence rule groups and avoid confusion between dialogue acts. Finally, the latent dialogue act matrix is constructed to model the relationship between rules and dialogue acts, and the best dialogue act is detected from it. For dialogue management, a partially observable Markov decision process (POMDP) is used to learn the best dialogue strategy from the dialogue history.
To evaluate the proposed approach, a travel inquiry dialogue system was developed. The dialogue act detection accuracy is 82.9%, an improvement of 33.3 percentage points over the 49.6% obtained with the semantic slot based approach. The results show that the proposed method outperforms traditional approaches to semantic understanding in spoken dialogue systems.
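For readers unfamiliar with the POMDP mentioned in the abstract, the sketch below shows the standard belief update that such a dialogue manager performs after each user turn. The state, action, and observation names and all probabilities are assumptions made for illustration; the abstract does not describe the thesis's actual model.

```python
def belief_update(belief, action, observation, transition, observation_model):
    """Standard POMDP belief update.

    The new belief in state s' is proportional to
        O(o | s', a) * sum over s of T(s' | s, a) * b(s).

    belief:            {state: probability}
    transition:        {(state, action): {next_state: probability}}
    observation_model: {(next_state, action): {observation: probability}}
    """
    unnormalized = {}
    for s_next in belief:
        # Probability of reaching s_next before the observation is seen.
        predicted = sum(
            transition[(s, action)].get(s_next, 0.0) * p
            for s, p in belief.items()
        )
        likelihood = observation_model[(s_next, action)].get(observation, 0.0)
        unnormalized[s_next] = likelihood * predicted
    norm = sum(unnormalized.values())
    if norm == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / norm for s, p in unnormalized.items()}


# Hypothetical two-state example: the user wants hotel or train information.
states = ["hotel", "train"]
# Assume the user's goal does not change when the system asks a question.
transition = {(s, "ask"): {s: 1.0} for s in states}
# Assumed recognition noise: the detected act matches the goal most of the time.
observation_model = {
    ("hotel", "ask"): {"hotel": 0.8, "train": 0.2},
    ("train", "ask"): {"hotel": 0.3, "train": 0.7},
}

belief = {"hotel": 0.5, "train": 0.5}
belief = belief_update(belief, "ask", "hotel", transition, observation_model)
print(belief)
```

Because the belief is a distribution rather than a single guess, one misrecognized turn shifts the estimate without committing the system to a wrong goal, which is the usual argument for using a POMDP with noisy speech input.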