| Graduate student: | Lin, Shu-Jung (林舒容) |
|---|---|
| Thesis title: | Using Web Search Results to Learn Question Focus and Dependency Relations for Question Classification (利用網路搜尋結果學習問題焦點和相依關係之問題分類技術) |
| Advisor: | Lu, Wen-Hsiang (盧文祥) |
| Degree: | Master (碩士) |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering (電機資訊學院 - 資訊工程學系) |
| Year of publication: | 2006 |
| Graduation academic year: | 94 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 72 |
| Keywords (Chinese): | 問題焦點, 問題分類, 問答系統, 相依關係, 相依特徵 |
| Keywords (English): | Question Answering System, Question Focus, Question Classification, Dependency Feature, Dependency Relation |
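A question answering system takes a natural-language question from the user and returns the most appropriate answer to that question.
A question answering system involves three steps: question analysis, document retrieval, and answer extraction. This thesis proposes a new method for question classification, which is part of question analysis. Question classification has been studied extensively. Besides the common rule-based techniques, the most frequently seen approach extracts various features from the question and applies a machine learning method such as the Support Vector Machine. However, most of these machine learning techniques rely on large amounts of manually compiled training data, and performance may suffer when the training data is insufficient. We argue that although the question focus plays the most important role in a question, other features in the question have dependency relations with the question focus, and these relations may help question classification.
In this thesis, therefore, we target factoid question answering systems and propose a new question classification method. Through natural language analysis we observe dependency relations between words, and we use a small number of seeds to collect a large amount of training data from the Web, so that the computer can automatically learn the dependency features of each question type and thereby support question classification in a question answering system. We also propose two probabilistic models for question classification: the Dependency Feature Model (DFM) and the Dependency Relation Model (DRM). Experimental results show that the proposed method does help question classification. We also compare it with a Language Model, and the results show that Web resources can alleviate some of the problems caused by insufficient training data.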
A question answering (QA) system is expected to return exact answers to users' natural-language questions. Most existing QA systems consist of three components: question analysis, information retrieval, and answer extraction. This thesis focuses on question classification, a part of question analysis, and proposes a new method for it. Recently, machine learning techniques such as support vector machines have been employed for question classification. However, these techniques depend heavily on the availability of large amounts of training data, and they may perform poorly when training data is insufficient. We argue that, in addition to the question focus, a question contains useful dependency features, and that these features can help question classification.
In this thesis, we present a simple learning method that uses Web search results to collect additional training data automatically. We also propose two models. The first, the Dependency Feature Model (DFM), uses dependency features learned from the large number of collected Web search results to help determine the question type. The second, the Dependency Relation Model (DRM), uses dependency relations between the question focus and the dependency features to help determine the question type.
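The abstract does not state the formulas behind DFM or DRM, so the following is only a rough illustration of the general idea of scoring question types from features counted in collected text. It is a naive-Bayes-style stand-in with add-one smoothing, not the thesis's model. The function names, the question types `PERSON` and `LOCATION`, and the toy snippets (standing in for Web search results gathered with seed terms) are all assumptions made for this sketch.

```python
import math
from collections import Counter, defaultdict


def train(snippets):
    """Count feature words per question type from (type, words) pairs."""
    type_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for qtype, words in snippets:
        type_counts[qtype] += 1
        feature_counts[qtype].update(words)
        vocab.update(words)
    return type_counts, feature_counts, vocab


def classify(question_words, model):
    """Return the question type with the highest smoothed log-probability."""
    type_counts, feature_counts, vocab = model
    total = sum(type_counts.values())
    best_type, best_score = None, -math.inf
    for qtype, n in type_counts.items():
        # Log prior of the type plus add-one smoothed log-likelihood of each word.
        score = math.log(n / total)
        denom = sum(feature_counts[qtype].values()) + len(vocab)
        for w in question_words:
            score += math.log((feature_counts[qtype][w] + 1) / denom)
        if score > best_score:
            best_type, best_score = qtype, score
    return best_type


# Hypothetical toy data standing in for snippets collected from the Web.
snippets = [
    ("PERSON", ["president", "elected", "born"]),
    ("PERSON", ["author", "wrote", "born"]),
    ("LOCATION", ["capital", "located", "city"]),
    ("LOCATION", ["river", "located", "country"]),
]
model = train(snippets)
print(classify(["who", "wrote", "novel"], model))
print(classify(["where", "located", "museum"], model))
```

A model closer to the DRM description would presumably also condition each feature on the question focus, but the abstract gives too little detail to sketch that faithfully.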