Author: 王志宏 (Wang, Chih-Hung)
Thesis title: 應用互動風格之偵測及句型比對於多元化對話回應之選擇
Versatile Dialogue Response Selection Based on Interactional Style Detection and Sentence Matching
Advisor: 吳宗憲 (Wu, Chung-Hsien)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar, i.e. 2009-2010)
Language: Chinese
Number of pages: 52
Keywords: Interactional Style Detection, Sentence Matching, Topic, Emotion
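  • This thesis proposes an approach based on interactional style detection and sentence matching to address the monotonous responses of conventional spoken dialogue systems (SDS). The aim is to make system responses more varied, rather than limited to a single response or to a random pick from predefined responses. In human conversation, the way speakers interact differs with the topic being discussed and with their emotional state at the moment. So that the system can recognize the user's current interactional style and return an appropriate response, we train a dialogue topic model with latent Dirichlet allocation (LDA) and use it to estimate dialogue topic probabilities. In addition, we train an emotional prosody model with a support vector machine (SVM) and an emotional linguistic model with maximum entropy (ME), and use these two emotion models to estimate emotion probabilities. Finally, because an artificial neural network (ANN) is modeled on the way the human brain thinks, we adopt an ANN as the interactional style detection model. The dialogue topic probabilities, emotional prosody probabilities, and emotional linguistic probabilities serve as the values of the ANN input nodes, and the ANN output layer has one node for each interactional style. For sentence matching, we take into account that recognition of spontaneous speech is error-prone, which severely degrades matching performance. Therefore, besides using a partial pattern tree (PPT) to generate multiple candidate sentences that cover insertion and substitution errors, sentence matching also incorporates synonym mapping, to handle the varied wording of spontaneous speech, and a subsyllable-level representation, to handle homophones. To validate the proposed approach, we built a chat dialogue system that serves as both the corpus collection platform and the test platform for practical use. In system tests, sentence matching reached an overall accuracy of 74.3% and interactional style detection an accuracy of 82.67%; the experiments indicate that the proposed approach gives a clear performance improvement in practical use.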
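
    The fusion step described in the abstract above (topic, prosody, and linguistic probabilities as ANN inputs, one output node per interactional style) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the style labels, layer sizes, input probabilities, and randomly initialized weights are all assumptions made for the example, and in a real system the weights would be learned from annotated dialogue turns.

```python
import math
import random


def softmax(scores):
    """Turn raw output scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    """Hypothetical untrained weights; training is out of scope here."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def detect_style(topic_p, prosody_p, linguistic_p, hidden, output, styles):
    """Concatenate the three probability vectors and run one forward pass."""
    features = list(topic_p) + list(prosody_p) + list(linguistic_p)
    activations = [1.0 / (1.0 + math.exp(-z))
                   for z in dense(features, *hidden)]
    probs = softmax(dense(activations, *output))
    best = max(range(len(styles)), key=probs.__getitem__)
    return styles[best], probs


rng = random.Random(0)
STYLES = ["style_A", "style_B", "style_C"]  # placeholder labels (assumption)
topic_p = [0.7, 0.2, 0.1]   # e.g. from an LDA topic model (made-up values)
prosody_p = [0.6, 0.4]      # e.g. from an SVM prosody model (made-up values)
linguistic_p = [0.3, 0.7]   # e.g. from an ME linguistic model (made-up values)

n_inputs = len(topic_p) + len(prosody_p) + len(linguistic_p)
hidden = init_layer(4, n_inputs, rng)
output = init_layer(len(STYLES), 4, rng)

style, probs = detect_style(topic_p, prosody_p, linguistic_p,
                            hidden, output, STYLES)
print(style, [round(p, 3) for p in probs])
```

    The single hidden layer with four nodes is arbitrary; the table of contents indicates that the thesis compares hidden-layer configurations experimentally.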

    This thesis proposes a spoken dialogue system (SDS) based on interactional style (IS) detection and sentence matching to address the problem of drab responses in conventional spoken dialogue systems. The goal is for the SDS to give versatile responses instead of randomly selecting predefined ones, because dialogue turns are affected by several factors, namely the speakers' topic orientations and emotional states. For this purpose, latent Dirichlet allocation (LDA) is employed to model topic orientation; a support vector machine (SVM) and maximum entropy (ME) are used to model emotional prosody and emotional linguistic cues, respectively. Furthermore, because an artificial neural network (ANN) is designed to simulate human thinking, an ANN is adopted for IS detection: the probabilities estimated by the models of the aforementioned factors are fed to the input nodes, and the output nodes correspond to the individual kinds of IS. For sentence matching, a partial pattern tree (PPT) generates several candidate sentences to cope with error-prone recognition of spontaneous speech; in addition, synonym mapping handles the varied wording of spontaneous speech, and a subsyllable-level representation handles homophones. To evaluate the proposed approach, an IS-detection-based SDS with sentence matching was built for corpus collection and as the evaluation platform. The evaluation results show that sentence matching and IS detection achieve 74.3% and 82.67% accuracy, respectively.
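
    The sentence-matching idea in the abstract above can be illustrated with a small sketch that combines synonym mapping with a cosine measure over word counts. It is only an approximation under stated assumptions: the synonym table, the English example sentences, and the function names are hypothetical, the list of candidates merely stands in for the output of the partial pattern tree, and the subsyllable-level representation is not reproduced.

```python
import math
from collections import Counter

# Hypothetical synonym table: each variant maps to one canonical word.
SYNONYMS = {"film": "movie", "cinema": "movie", "love": "like", "enjoy": "like"}


def normalize(tokens):
    """Replace every token by its canonical synonym, if it has one."""
    return [SYNONYMS.get(token, token) for token in tokens]


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bags of words."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t]
              for t in counts_a.keys() & counts_b.keys())
    norm = (math.sqrt(sum(c * c for c in counts_a.values()))
            * math.sqrt(sum(c * c for c in counts_b.values())))
    return dot / norm if norm else 0.0


def best_match(candidates, templates):
    """Return (score, template index) of the best candidate/template pair."""
    best = (0.0, None)
    for candidate in candidates:
        normalized = normalize(candidate)
        for index, template in enumerate(templates):
            score = cosine(normalized, normalize(template))
            if score > best[0]:
                best = (score, index)
    return best


# Stored sentence patterns (made-up examples, already tokenized).
templates = [["i", "like", "movie"], ["what", "is", "the", "weather"]]
# Several recognition hypotheses for one utterance, one with a substitution error.
candidates = [["i", "love", "film"], ["i", "love", "firm"]]

score, idx = best_match(candidates, templates)
print(idx, round(score, 3))
```

    Scoring several candidates instead of a single recognition result is what lets a hypothesis containing a substitution error ("firm") be outvoted by a cleaner one.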

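    Abstract
    List of Tables
    List of Figures
    Chapter 1 Introduction
    1.1 Research Background
    1.2 Related Work
    1.3 Research Objectives and Motivation
    1.4 Overview of the Approach
    1.5 Chapter Outline
    Chapter 2 Corpus Collection and Annotation
    2.1 Corpus Collection System
    2.2 Annotation
    2.3 Emotion Corpus Collection
    2.4 Chat and Dialogue Topic Corpus Collection
    Chapter 3 System Overview
    3.1 Training Phase
    a. Training the Topic Detection Model
    b. Training the Emotion Detection Model
    c. Training the Interactional Style Detection Model
    3.2 Validation Phase
    Chapter 4 Interactional Style Detection Model
    4.1 LDA-Based Topic Detection
    a. Topic Detection Model Training
    b. Topic Detection
    4.2 Emotion Detection
    a. Emotion Detection Model Training
    b. Emotion Detection
    c. Emotional Prosody Feature Extraction
    d. SVM-Based Emotional Prosody Detection
    4.3 Linguistic Emotion Detection
    a. Emotional Linguistic Feature Extraction and the HowNet Sentiment Word Set
    b. Maximum Entropy Model (ME)
    4.4 Artificial Neural Network
    Chapter 5 Sentence Matching
    5.1 Sentence Matching Procedure
    5.2 Synonym Mapping
    5.3 Partial Pattern Tree
    5.4 Subsyllable Conversion
    5.5 Cosine Measure
    Chapter 6 Experiments
    6.1 Acoustic Features and Models
    6.2 Tools Used
    6.3 Sentence Matching Accuracy
    6.4 Linguistic and Prosodic Emotion Recognition Accuracy
    6.5 Analysis of Emotion and Interactional Style
    6.6 Number of ANN Hidden Layers versus Interactional Style Recognition Accuracy
    6.7 Accuracy of Combined Sentence Matching and Interactional Style Detection
    6.8 System Simulation
    Chapter 7 Conclusion and Future Work
    7.1 Conclusion
    7.2 Future Work
    References
    Appendices
    Appendix 1: Stop Word List
    Appendix 2: Topic Words Produced by LDA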

    Full-text availability: on campus from 2012-08-17
    Off campus from 2014-08-17