| Graduate student: | 黃致堯 Huang, Jhih-Yao |
|---|---|
| Thesis title: | 運用以意見為基礎的實體排序模型改善條列式資訊檢索 Using Opinion-based Entity Ranking Model to Improve List-Informational Search |
| Advisor: | 盧文祥 Lu, Wen-Hsiang |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2013 |
| Academic year of graduation: | 101 |
| Language: | English |
| Number of pages: | 48 |
| Keywords (Chinese): | 條列式資訊檢索、問句意見詞、問句分析、答案擷取 |
| Keywords (English): | List-Information Search, Question Opinion, Question Analysis, Answer Extraction |
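Natural language search uses questions phrased in human language to search for answers; its purpose is to extract answers suited to the user. Compared with ordinary short-phrase queries, a natural language question expresses the user's problem more intuitively, for example, "Which restaurant in Tainan has good hot pot?" However, today's search engines often perform poorly on natural language search, so users must spend most of their time browsing pages that merely contain some of the query terms. Rose et al. proposed the list-informational goal to address this problem. In this thesis, the list-informational goal is defined as the user wanting to obtain a set of entities of the same kind.
We first propose a URL-based entity expansion method to automatically expand the entities we need. Based on user needs, we analyze the question structure and the answer structure and recommend pages relevant to each entity. The question structure can be divided into question focus, question context, and question opinion pair; we identify these features with a question analysis algorithm suited to Chinese. The answer structure can be divided into entity homepage, context evidence page, and opinion summary. Using the relationship between the question structure and the answer structure, combined with actual review opinions from the Web, we construct the Opinion-based Entity Ranking Model (OBERM) to improve list-informational search.
Experimental results show that our proposed model, OBERM, performs better than the existing search engine Google, and that our system does improve the effectiveness of list-informational search.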
Natural language search uses human-language questions as queries to search for and extract webpages suited to the user. Compared with short keyword queries, natural language queries let users state their intent directly, for example, “Which restaurants in Tainan have delicious hot pot?” According to our observation, conventional search engines cannot process such queries effectively, and the search results are noisy. Users therefore spend a lot of time browsing result pages and filtering out irrelevant information. Rose et al. proposed a conceptual framework for classifying users’ goals so that search engines can associate user goals with queries and exploit the goal information. In this thesis, we focus on the list-informational search goal from that framework.
We propose a website-based entity set expansion method to expand the set of candidate entities. We recommend evidence pages for each entity based on the user’s intent, by analyzing question structures and answer structures. The question structure is divided into three parts: question focus, question context, and question opinion pair. We use a question analysis algorithm to identify these question features. The answer structure is divided into entity homepage, context evidence page, and opinion summary. We combine the relationship between the question structure and the answer structure with real comments from the Internet to construct the Opinion-based Entity Ranking Model (OBERM) to improve list-informational search.
Experimental results show that our proposed model, OBERM, outperforms Google Search, indicating that OBERM can improve the performance of list-informational search.
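The question and answer structures named in the abstract can be illustrated with a small sketch. This is not the thesis's OBERM formulation, which the abstract does not spell out. It is a toy example, assuming a simple additive score over context matches and opinion counts. All class names, field names, weights, and the scoring rule (`QuestionStructure`, `EntityEvidence`, `score`, `rank`) are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class QuestionStructure:
    """Hypothetical container for the three question parts in the abstract."""
    focus: str                     # kind of entity asked for, e.g. "restaurant"
    context: List[str]             # constraints, e.g. ["Tainan"]
    opinion_pair: Tuple[str, str]  # (target, opinion word), e.g. ("hot pot", "delicious")


@dataclass
class EntityEvidence:
    """Hypothetical container for the three answer parts in the abstract."""
    name: str
    homepage: str                                # entity homepage URL
    context_evidence: List[str]                  # snippets from context evidence pages
    opinion_summary: Dict[Tuple[str, str], int]  # review counts per (target, opinion) pair


def score(question: QuestionStructure, entity: EntityEvidence,
          context_weight: float = 1.0, opinion_weight: float = 1.0) -> float:
    """Toy additive score (an assumption, not the thesis's model)."""
    # Count question context terms that appear in at least one evidence snippet.
    context_hits = sum(
        any(term in snippet for snippet in entity.context_evidence)
        for term in question.context
    )
    # Count reviews that express the asked-for opinion about the asked-for target.
    opinion_hits = entity.opinion_summary.get(question.opinion_pair, 0)
    return context_weight * context_hits + opinion_weight * opinion_hits


def rank(question: QuestionStructure,
         entities: List[EntityEvidence]) -> List[EntityEvidence]:
    """Return entities sorted by descending toy score."""
    return sorted(entities, key=lambda e: score(question, e), reverse=True)
```

For the example question about hot pot in Tainan, one would build a `QuestionStructure("restaurant", ["Tainan"], ("hot pot", "delicious"))` and pass it to `rank` together with a list of candidate `EntityEvidence` objects. The additive form is chosen only to keep the sketch short.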
[1] Broder A., A taxonomy of web search. SIGIR Forum, 2002
[2] B.J. Jansen, A. Spink, J. Bateman, and T. Saracevic, Real life information retrieval: A study of user queries on the web, SIGIR FORUM, 1998
[3] B. Pang and L. Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts, ACL, 2004
[4] B. Pang and L. Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales, ACL, 2005
[5] C. Silverstein, M. Henzinger, H. Marais, and M. Moricz, Analysis of a very large AltaVista query log, Digital Systems Research Center, 1998
[6] Chen Z.Z., Finding Hidden Semantic Features in Question to Improve Page Searching in Websites, 2009
[7] D. R. Radev, H. Qil, Z, Mining the Web for Answers to Natural Language Questions, CIKM, 2001
[8] Elkan C., Log-linear models and conditional random fields, CIKM, 2008
[9] Ferret O., Grau B., Hurault-Planet M., Illouz G., Monceaux L., Robba I. and Vilnat A., Finding an Answer based on the Recognition of the Question Focus, 2001
[10] Hu M. and Liu B., Mining and Summarizing Customer Reviews, AAAI’04, 2004
[11] Hu M. and Liu B., Mining and Summarizing Customer Reviews, KDD’04, 2004
[12] J. Chu-Carroll, A Hybrid Approach to Natural Language Web Search, ACL, 2002
[13] Jones K.S., Walker S., Robertson S.E., A Probabilistic Model of Information Retrieval: Development and Comparative Experiments, 2000
[14] Ku L.W. and Chen H.H., Mining Opinions from the Web: Beyond Relevance Retrieval, Journal of American Society for Information Science and Technology, Special Issue on Mining Web Resources for Enhancing Information Retrieval, 2007
[15] Ku L.W., Chen, H.H., and Liang, Y.T., Opinion Extraction, Summarization and Tracking in News and Blog Corpora, AAAI’06, 2006
[16] Liu B., Guang Q., Bu J. and Chen C., Expanding Domain Sentiment Lexicon through Double Propagation, In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09), 2009
[17] Lin C.C., Improve Natural Language Question Search Using Page Link Path, 2008
[18] Lin C.Y., Cao Y., Duan H., Yu Y. and Hon H.W., Recommending Questions Using the MDL-based Tree Cut Model, WWW, 2008
[19] Lin C.Y., Duan H., Cao Y. and Yu Y., Searching Questions by Identifying Question Topic and Question Focus, In Proceedings of ACL-08, 2008
[20] Li G., Ooi B.C., Feng J., Wang J. and Zhou L., EASE: An Effective 3-in-1 Keyword Search Method for Unstructured, Semi-structured and Structured Data, SIGMOD’08, 2008
[21] Lin K.H., Learning Question Structure based on Website Link Structure to Improve Natural Language Search, 2006
[22] Liu L., Yu Z. and Li L., Chinese Expert Entity Homepage Recognition Based on Co-EM, WISM, 2011
[23] Lin S.J. and Lu W.H., Learning Question Focus and Semantically Related Features from Web Search Results for Chinese Question Classification, In AIRS, 2006
[24] Mayr P., Website entries from a web log file perspective – a new log file measure, Proceedings of the AoIR-ASIST 2004 Workshop on Web Science Research Methods, 2004
[25] Popescu A.M. and Etzioni O., Extracting Product Features and Opinions from Reviews, EMNLP’05, 2005
[26] R.C. Wang and W.W. Cohen, Iterative Set Expansion of Named Entities using the web, ICDM, 2008
[27] Rose D.E. and Levinson D., Understanding User Goals in Web Search, WWW, 2004
[28] Turney P., Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews, ACL’02, 2002
[29] Wang S.M., Using Natural Language Question Structure Features to Improve List-Information Search, 2012
[30] Zhang, L., Liu, B., Entity set expansion in opinion documents, Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia, 2011