| Graduate student: | 江翊瑄 Jiang, Yi-Hsuan |
|---|---|
| Thesis title: | 基於WEPSA與分析樹之趣味語句生成系統以台南美食對話系統為例 Enjoyable Sentence Generation System for Tainan Delicacy Dialogue System Based on WEPSA and Parse Tree |
| Advisor: | 王駿發 Wang, Jhing-Fa |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 |
| Language: | English |
| Number of pages: | 51 |
| Keywords (Chinese): | 分析樹、槽填充、詞嵌入模型、句子生成、趣味對話系統 |
| Keywords (English): | Parse tree, Slot filling, Word embedding model, Sentence generation, Enjoyable dialogue system |
This thesis proposes an enjoyable dialogue system, using inquiries about Tainan delicacies as the example domain. Its functions fall into three parts: synonym substitution, syntactic structure transformation, and template adjunction. Together they let the system automatically convert and embellish its preset response sentences to increase variability and enjoyment. The underlying techniques are the word embedding projection sorting algorithm (WEPSA), parse trees, and slot filling.

For synonym substitution, a word embedding model vectorizes the words of the response text. The vectors in the model are then projected and sorted one by one to find synonyms, which replace the original words to increase variability. For structure transformation, a parse tree performs Chinese dependency parsing to capture the syntactic structure and then produces grammatical transformations that do not change the original meaning of the sentence. For template adjunction, slot filling and regular expressions estimate the intent of the response sentence, and a suitable enjoyable sentence is selected, appended, and returned to the user.

Because the sentence analysis system is built on natural language processing methods rather than a neural network, it does not need a large corpus for training and it responds quickly. It also generalizes without retraining when the dialogue corpus changes, and users can add their own enjoyable sentences at any time. These techniques are combined with a self-collected Tainan delicacy question-answer corpus and proper-noun lexicons (place names, shop names, and food names), so that users can easily ask about Tainan cuisine.

In the variability and template adjunction experiments, the accuracy is 86.4% and 93.2%, respectively. The average mean opinion score (MOS) in the user satisfaction test is 4.56, and the average overall response time is under one second, which indicates that the system gives users a good experience in practical use.
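The synonym-substitution step described in the abstract (project word vectors, sort, swap in the top-ranked word) can be illustrated with a minimal sketch. The abstract does not give the exact WEPSA formulation, so everything here is an assumption: the use of scalar projection onto the target word's vector as the ranking score, the function names `scalar_projection`, `rank_by_projection`, and `substitute`, and the toy three-dimensional vectors. English placeholder words stand in for the Chinese vocabulary the thesis works with.

```python
import math

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only;
# a real system would load vectors from a trained word embedding model).
TOY_VECTORS = {
    "delicious": [0.9, 0.1, 0.0],
    "tasty": [0.8, 0.2, 0.1],
    "yummy": [0.7, 0.3, 0.0],
    "expensive": [0.1, 0.9, 0.2],
    "far": [0.0, 0.2, 0.9],
}


def scalar_projection(candidate, target):
    """Length of `candidate` projected onto the direction of `target`."""
    dot = sum(c * t for c, t in zip(candidate, target))
    target_norm = math.sqrt(sum(t * t for t in target))
    return dot / target_norm


def rank_by_projection(word, vectors, top_n=3):
    """Return the top_n other words, sorted by projection onto `word`'s vector."""
    target = vectors[word]
    scored = [
        (other, scalar_projection(vec, target))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def substitute(tokens, word, vectors):
    """Replace every occurrence of `word` with its top-ranked candidate."""
    best, _score = rank_by_projection(word, vectors, top_n=1)[0]
    return [best if tok == word else tok for tok in tokens]


if __name__ == "__main__":
    print(rank_by_projection("delicious", TOY_VECTORS))
    print(substitute(["the", "beef", "soup", "is", "delicious"],
                     "delicious", TOY_VECTORS))
```

One design note on this sketch: a raw scalar projection favors candidates with long vectors, so if the vectors are first normalized to unit length the ranking becomes the same as ranking by cosine similarity. Whether the thesis normalizes its vectors is not stated in the abstract.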
On-campus access: public from 2022-08-01