
Graduate student: 王聖閔 (Wang, Shen-Min)
Thesis title: 運用自然語言問句結構特徵改善條列式資訊檢索
Using Natural Language Question Structure Features to Improve List-Information Search
Advisor: 盧文祥 (Lu, Wen-Hsiang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2012
Academic year of graduation: 100
Language: English
Number of pages: 45
Keywords (Chinese): 條列式資訊檢索, 問句意見詞, 問句分析, 答案擷取
Keywords (English): List-Information Search, Question Opinion, Question Analysis, Answer Extraction
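    Abstract (translated from the Chinese):
    Natural language search uses questions posed in human language to search for answers. Its goal is to extract answers that suit the user. Consider, for example, the question "Which brand of mobile phone has better speed and network access?" When users look for the answer with a general-purpose search engine, they spend most of their time browsing result pages that match only some of the query terms. Rose et al. proposed the list-informational goal to address this problem.
    In this thesis, the list-informational goal is defined as the user wanting to obtain a list of homogeneous entities. Based on the user's need, we analyze the question structure and the answer structure and recommend relevant pages for each entity. The question structure is divided into question focus, question context, and question opinion. We identify these question structure features with a question analysis algorithm suited to Chinese. The answer structure is divided into entity homepage, context representative page, and opinion component. Using the relationship between the question structure and the answer structure, we construct the Question Structure Feature Mapping Model (QSFMM). We derive that QSFMM can be split into the Web Structure Mapping Model (WSMM) and the Opinion Search Model (OSM).
    In our experiments, we compare WSMM and QSFMM on the suitable answers they extract. The results show that although WSMM can extract suitable answers for users, QSFMM achieves better performance on average.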

    Abstract (English):
    Natural language search uses questions posed in human language to search for answers. Its task is to extract suitable answers for users. Consider the question, “Which brand of mobile phone is suitable for surfing the internet and has a fast processor?” When users rely on conventional search engines to find answers, they must spend a lot of time browsing and filtering the result pages, which may contain noisy information. Rose et al. proposed the list-informational goal to address this problem.
    In this thesis, a list-informational goal is defined as a user wanting to obtain a list of homogeneous entities. We recommend representative pages for each entity, based on the need behind the user's question, by analyzing question structures and answer structures. The question structure is divided into three parts: question focus, question context, and question opinion. We use a question analysis algorithm to identify these question features. The answer structure is divided into entity homepage, context representative page, and opinion component. We use the relationship between the question structure and the answer structure to construct the Question Structure Feature Mapping Model (QSFMM). We derive two sub-models from QSFMM: the Web Structure Mapping Model (WSMM) and the Opinion Search Model (OSM).
    In our experiments, we compared WSMM and QSFMM by verifying the extracted answers. The experimental results show that WSMM can extract suitable answers for users; however, integrating all of the features into QSFMM achieves better performance on average.
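The three-part question structure and the split of QSFMM into a web-structure part and an opinion part can be illustrated with a small Python sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the class and function names, the hand-written annotation of the example question, the toy scores, and in particular the scoring rule. A product stands in for WSMM and a linear interpolation with an opinion score stands in for QSFMM; the abstract does not state the actual formulation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class QuestionStructure:
    """Hypothetical container for the three question parts named in the abstract."""
    focus: str            # the kind of entity the user wants listed
    context: List[str]    # terms that constrain the focus
    opinion: List[str]    # opinion words the answers should satisfy


@dataclass(frozen=True)
class EntityEvidence:
    """Hypothetical per-entity evidence mirroring the three answer parts."""
    entity: str
    homepage_score: float  # how well a page serves as the entity homepage
    context_score: float   # match against context representative pages
    opinion_score: float   # support found in opinion components


def wsmm_score(e: EntityEvidence) -> float:
    # Placeholder for a web-structure-only score (no opinion evidence).
    return e.homepage_score * e.context_score


def qsfmm_score(e: EntityEvidence, weight: float = 0.5) -> float:
    # Placeholder combination: linear interpolation of the web-structure
    # score and the opinion score. This is NOT the thesis's formula.
    return (1.0 - weight) * wsmm_score(e) + weight * e.opinion_score


def rank(candidates: List[EntityEvidence],
         score: Callable[[EntityEvidence], float]) -> List[str]:
    # Return entity names ordered from highest to lowest score.
    return [e.entity for e in sorted(candidates, key=score, reverse=True)]


# Hand-written annotation of the example question (an assumption).
question = QuestionStructure(
    focus="brand of mobile phone",
    context=["surfing the internet", "processor"],
    opinion=["suitable", "fast"],
)

# Toy evidence with made-up numbers, only to exercise the functions.
candidates = [
    EntityEvidence("Brand A", homepage_score=0.9, context_score=0.8, opinion_score=0.2),
    EntityEvidence("Brand B", homepage_score=0.7, context_score=0.6, opinion_score=0.9),
]

print(question.focus, rank(candidates, wsmm_score), rank(candidates, qsfmm_score))
```

With these toy numbers the two rankings differ, which shows how adding opinion evidence can reorder the entity list; the numbers say nothing about the thesis's reported results.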

    Abstract (Chinese)
    Abstract
    Acknowledgements
    CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    Chapter 1 Introduction
        1.1 Motivation
        1.2 Methodology
        1.3 Organization of this Dissertation
    Chapter 2 Related Work
        2.1 List-Informational natural language search
        2.2 Question Structure
            2.2.1 Question Focus & Question Context
            2.2.2 Question Opinion
        2.3 Answer Extraction
    Chapter 3 Method
        3.1 System Architecture
        3.2 Problem Formulation
        3.3 Question Analysis
            3.3.1 Identification of Question Focus
            3.3.2 Identification of Question Context
            3.3.3 Identification of Question Opinion
        3.4 Question Structure Features Mapping Model
            3.4.1 Model Inference
            3.4.2 Web Structure Mapping Model
            3.4.3 Opinion Search Model
    Chapter 4 Experiments
        4.1 Experiment Setup
            4.1.1 Data Set
            4.1.2 Evaluation Metrics
        4.2 Evaluation of Question Analysis
        4.3 Evaluation of Question Structure Features Mapping Model
            4.3.1 Parameter Estimation
            4.3.2 Experimental Result
        4.4 Discussion
            4.4.1 The problems of question features candidates
            4.4.2 Question Structure Features Mapping Model
    Chapter 5 Conclusion
    Chapter 6 Reference

    [1]Broder A., A taxonomy of web search. SIGIR Forum, 2002
    [2]Rose D.E. and Levinson D., Understanding User Goals in Web Search, WWW, 2004
    [3]Liu B. and Zhang L., Entity Set Expansion in Opinion Documents, HT, 2011
    [4]Lin C.Y., Cao Y., Duan H., Yu Y. and Hon H.W., Recommending Questions Using the MDL-based Tree Cut Model, WWW, 2008
    [5]Lin C.Y., Duan H., Cao Y. and Yu Y., Searching Questions by Identifying Question Topic and Question Focus, In Proceedings of ACL-08, 2008
    [6]Lin K.H., Learning Question Structure based on Website Link Structure to Improve Natural Language Search, 2006
    [7]Lin C.C., Improve Natural Language Question Search Using Page Link Path, 2008
    [8]Li G., Ooi B.C., Feng J., Wang J. and Zhou L., EASE: An Effective 3-in-1 Keyword Search Method for Unstructured, Semi-structured and Structured Data, SIGMOD’08, 2008
    [9]Chen Z.Z., Finding Hidden Semantic Features in Question to Improve Page Searching in Websites, 2009
    [10]Wang R.C., Schlaefer N., Cohen W.W. and Nyberg E., Automatic Set Expansion for List Question Answering, Conference on Empirical Methods in Natural Language Processing (EMNLP-08), 2008
    [11]Wang R.C. and Cohen W.W., Language-Independent Set Expansion of Named Entities using the Web, In Proceedings of IEEE International Conference on Data Mining (ICDM 2007), 2007
    [12]Ferret O., Grau B., Hurault-Planet M., Illouz G., Monceaux L., Robba I. and Vilnat A., Finding an Answer based on the Recognition of the Question Focus, 2001
    [13]Lin S.J. and Lu W.H., Learning Question Focus and Semantically Related Features from Web Search Results for Chinese Question Classification, In AIRS, 2006
    [14]Hu M. and Liu B., Mining and Summarizing Customer Reviews, AAAI’04, 2004
    [15]Hu M. and Liu B., Mining Opinion Features in Customer Reviews, KDD’04, 2004
    [16]Popescu A.M. and Etzioni O., Extracting Product Features and Opinions from Reviews, EMNLP’05, 2005
    [17]Liu B., Guang Q., Bu J. and Chen C., Expanding Domain Sentiment Lexicon through Double Propagation, In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09), 2009
    [18]Ku L.W., Chen H.H. and Liang Y.T., Opinion Extraction, Summarization and Tracking in News and Blog Corpora, AAAI’06, 2006
    [19]Jones K.S., Walker S. and Robertson S.E., A Probabilistic Model of Information Retrieval: Development and Comparative Experiments, 2000
    [20]Kleinberg J., Authoritative sources in a hyperlinked environment, Journal of the ACM, 46(5):604–622, 1999
    [21]Brin S. and Page L., The anatomy of a large-scale hypertextual web search engine, In Proc. of 7th International World Wide Web Conference, May 1998
    [22]Ma W.Y. and Xu G., Building Implicit Links from Content for Forum Search, SIGIR’06, 2006
    [23]Ganesan K. and Zhai C.X., Opinion-Based Entity Ranking, Journal of Information Retrieval, 2011
    [24]Ku L.W., Lo Y.S. and Chen H.H., Using Polarity Scores of Words for Sentence-level Opinion Extraction, Proceedings of NTCIR-6 Workshop Meeting, 2007
    [25]Mayr P., Website entries from a web log file perspective – a new log file measure, Proceedings of the AoIR-ASIST 2004 Workshop on Web Science Research Methods, 2004
    [26]Liu L., Yu Z. and Li L., Chinese Expert Entity Homepage Recognition Based on Co-EM, WISM, 2011
    [27]Elkan C., Log-linear models and conditional random fields, CIKM, 2008

    Full-text availability: on campus: open immediately; off campus: open immediately