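Graduate student: 林舒容
Lin, Shu-Jung
Thesis title: 利用網路搜尋結果學習問題焦點和相依關係之問題分類技術
Using Web Search Results to Learn Question Focus and Dependency Relations for Question Classification
Advisor: 盧文祥
Lu, Wen-Hsiang
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: Chinese
Number of pages: 72
Chinese keywords: 問題焦點 (question focus), 問題分類 (question classification), 問答系統 (question answering system), 相依關係 (dependency relation), 相依特徵 (dependency feature)
English keywords: Question Answering System, Question Focus, Question Classification, Dependency Feature, Dependency Relation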
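     A question answering (QA) system takes a natural-language question from a user and returns the most appropriate answer.
     A QA system involves three steps: question analysis, document retrieval, and answer extraction. This thesis addresses question classification, a part of question analysis, and proposes a new method for it. Question classification has been studied extensively. Besides the widely used rule-based techniques, the most common approach extracts various features from the question and applies a machine learning method such as the Support Vector Machine. However, most of these machine learning techniques rely on large amounts of manually prepared training data, and performance may suffer when the training data is insufficient. We believe that, although the question focus plays the most important role in a question, other features in the question hold dependency relations with the question focus, and these relations may help question classification.
     In this thesis, we therefore propose a new question classification method for factoid question answering systems. Through natural language analysis, we observe dependency relations between words, and we use a small number of seeds to collect a large amount of training data from the Web so that the computer can automatically learn various types of dependency features, which in turn support question classification in QA systems. We also propose two probabilistic models for question classification: the Dependency Feature Model (DFM) and the Dependency Relation Model (DRM). Experimental results show that the proposed method does help question classification. We also compare it against a Language Model, and the results show that Web resources can alleviate some of the problems caused by insufficient training data.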

     A question answering (QA) system is expected to return exact answers to users' natural-language questions. Most existing QA systems consist of the following components: question analysis, information retrieval, and answer extraction. This thesis focuses on question classification, a part of question analysis, and proposes a new method for it. Recently, machine learning techniques such as support vector machines have been employed for question classification. However, these techniques depend heavily on the availability of large amounts of training data, so results may be poor when training data is insufficient. We believe that, in addition to the question focus, a question contains useful dependency features, and that these features can help question classification.
     In this thesis, we present a simple learning method that uses Web search results to collect more training data automatically. We also propose two models. The first is the Dependency Feature Model (DFM), which uses dependency features learned from a large number of collected Web search results to help determine the question type. The second is the Dependency Relation Model (DRM), which uses dependency relations between the question focus and dependency features to help determine the question type.
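The idea of scoring a question against features learned from collected snippets can be illustrated with a minimal sketch. This is not the thesis's DFM or DRM, whose formulas are not given in this record. It is a generic unigram scorer with add-one smoothing and a uniform prior over question types, swapped in purely for illustration. The names `train`, `classify`, and `TOY_SNIPPETS` are hypothetical, and the snippets are invented toy text standing in for Web search results.

```python
import math
from collections import Counter, defaultdict

# Invented toy snippets standing in for Web search results gathered per question type.
TOY_SNIPPETS = {
    "PERSON": [
        "the president was born in 1946",
        "the author wrote the novel",
    ],
    "LOCATION": [
        "the capital city is located in the north",
        "the river flows through the city",
    ],
}


def train(snippets_by_type):
    """Count how often each word occurs in the snippets of each question type."""
    counts = defaultdict(Counter)
    for qtype, snippets in snippets_by_type.items():
        for snippet in snippets:
            counts[qtype].update(snippet.lower().split())
    return counts


def classify(question, counts):
    """Return the type with the highest add-one-smoothed log-likelihood.

    Assumes a uniform prior over types, so only word likelihoods are compared.
    """
    vocab = {word for type_counts in counts.values() for word in type_counts}
    words = question.lower().split()
    best_type, best_score = None, -math.inf
    for qtype, type_counts in counts.items():
        total = sum(type_counts.values())
        score = sum(
            math.log((type_counts[w] + 1) / (total + len(vocab))) for w in words
        )
        if score > best_score:
            best_type, best_score = qtype, score
    return best_type


if __name__ == "__main__":
    model = train(TOY_SNIPPETS)
    print(classify("which city is the capital", model))
```

A model along the lines described in the abstract would presumably restrict the counted words to dependency features tied to the question focus rather than using every word in the snippet. That selection step is omitted here because the record does not specify it.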

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
    1.1 Research Purpose and Motivation
    1.2 Overview of the Approach
    1.3 Thesis Organization
    Chapter 2 Related Work
    2.1 Introduction to Question Answering Systems
    2.1.1 General Question Answering Systems
    2.1.2 Question Answering Competitions
    2.2 Other Related Work
    2.2.1 Importance of Question Classification
    2.2.2 Rule-Based Question Classification Techniques
    2.2.3 Machine Learning Question Classification Techniques
    2.2.4 Importance of Question Focus
    2.2.5 Web-Assisted Question Classification Techniques
    2.2.6 Language Model Question Classification Techniques
    Chapter 3 Method
    3.1 Technical Architecture
    3.2 Question Types
    3.3 Basic Question Classification Rules
    3.4 Learning Dependency Features
    3.4.1 Collecting Training Data
    3.4.2 Question Focus Identification
    3.5 Dependency Feature Model and Dependency Relation Model
    3.5.1 DFM
    3.5.2 DRM
    Chapter 4 Experiments and Evaluation
    4.1 Evaluation of the Probabilistic Models
    4.1.1 Comparing DFM and DRM
    4.1.2 Comparing Different Constraints
    4.1.3 Comparing Different Weights
    4.2 Evaluation of DFM and LM
    4.2.1 Comparison Using Questions as Training Data
    4.2.2 Comparison Using the Web as Training Data
    4.2.3 Comparison Using Questions and the Web as Different Training Data
    4.2.4 Evaluation of Bootstrapping for Collecting Training Data
    4.3 Discussion of the Dependency Feature Model
    Chapter 5 Discussion and Future Work
    5.1 Discussion
    5.2 Conclusion
    5.3 Future Work
    References

    Full-text availability: on campus, immediately; off campus, from 2006-07-26