
Graduate Student: Wardani, Dewi (韋絲若)
Thesis Title: Finding Structured and Unstructured Features to Improve the Search Result of Complex Question (利用結構化與非結構化特徵改善複雜問答)
Advisor: Lu, Wen-Hsiang (盧文祥)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2009
Academic Year of Graduation: 97
Language: English
Number of Pages: 54
Keywords: question answering, structured feature, complex question
Recently, search engines have run into natural language questions that they cannot handle effectively. Some of these questions are complex questions: a complex question is one that contains several clauses or intentions, or that requires a long answer.

In our work, we observe that finding the structured and unstructured features of a question and using both structured and unstructured data can improve the search results of complex questions. Based on this idea, we combine two approaches: an information retrieval (IR) approach and structured QA templates.

Our framework consists of three parts: question analysis, resource discovery, and analysis of the relevant answers. In question analysis, we make several assumptions and try to find the structured and unstructured features of a question; structured features point to structured data, while unstructured features point to unstructured data. In resource discovery, we combine the advantages of structured data (a relational database) and unstructured data (web pages) to improve the retrieved answers, which allows us to find the best fragments in the text of the relevant web pages. In analyzing the relevant answers, we match the results from the structured and unstructured data and compute a weighted score; finally, we use QA templates to reformulate the question.
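As a rough, hypothetical illustration of the resource discovery and answer-matching steps described above, the short Python sketch below scores a toy relational table and a few toy web fragments against the features of a question, then computes a matching score between the top result from each side. The data, attribute names, and the overlap and string-similarity scoring are assumptions for illustration only, not the system actually built in this thesis.

```python
# Toy, assumed example of the three-part framework: question analysis,
# resource discovery over structured and unstructured sources, and
# score matching between their top results. All data and names are invented.

import re
from difflib import SequenceMatcher

# Stand-in for the structured data source (a relational database).
STRUCTURED_DB = [
    {"name": "Tainan", "type": "city", "country": "Taiwan", "founded": "1624"},
    {"name": "Taipei", "type": "city", "country": "Taiwan", "founded": "1884"},
]

# Stand-in for the unstructured data source (web page fragments).
WEB_FRAGMENTS = [
    "Tainan is a historic city in southern Taiwan, founded in 1624.",
    "Taipei is the capital of Taiwan and a major cultural center.",
]

def analyze_question(question):
    """Question analysis: split terms into structured features (terms that look
    like database attributes or values) and unstructured features (the rest)."""
    tokens = re.findall(r"\w+", question.lower())
    attribute_like = {"name", "type", "city", "country", "founded"}
    structured = [t for t in tokens if t in attribute_like]
    unstructured = [t for t in tokens if t not in attribute_like]
    return structured, unstructured

def rank_records(features, records):
    """Rank database records by how many feature terms occur in their values."""
    def overlap(record):
        text = " ".join(record.values()).lower()
        return sum(1 for f in features if f in text)
    return sorted(records, key=overlap, reverse=True)

def rank_fragments(features, fragments):
    """Rank web fragments by how many feature terms they contain."""
    return sorted(fragments,
                  key=lambda frag: sum(1 for f in features if f in frag.lower()),
                  reverse=True)

def score_match(record, fragment):
    """Score matching between a structured record and an unstructured fragment,
    approximated here with plain string similarity."""
    record_text = " ".join(record.values())
    return SequenceMatcher(None, record_text.lower(), fragment.lower()).ratio()

if __name__ == "__main__":
    question = "Which city in Taiwan was founded in 1624 and why is it historic?"
    structured_feats, unstructured_feats = analyze_question(question)

    # Resource discovery: query both sources with the extracted features.
    best_record = rank_records(structured_feats + unstructured_feats, STRUCTURED_DB)[0]
    best_fragment = rank_fragments(unstructured_feats, WEB_FRAGMENTS)[0]

    # Analysis of the relevant answer: match the two top results.
    print("Best structured record:  ", best_record)
    print("Best unstructured result:", best_fragment)
    print("Match score:             ", round(score_match(best_record, best_fragment), 3))
```

A real implementation would replace the overlap count with proper passage retrieval over web pages and replace the plain string similarity with the weighted matching score described in the thesis.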

The experimental results show that finding structured and unstructured features and using structured and unstructured data, together with the IR approach and QA templates, can improve the search results of complex questions.

Recently, search engines have been challenged to deal with natural language questions. Some of these questions are complex questions: a complex question consists of several clauses or several intentions, or needs a long answer.
In this work, we propose that finding the structured and unstructured features of questions and using both structured and unstructured data can improve the search results of complex questions. Following this idea, we integrate two approaches: an IR approach and structured QA templates.
Our framework consists of three parts: Question Analysis, Resource Discovery, and Analysis of the Relevant Answer. In Question Analysis, we apply a few assumptions and try to find the structured and unstructured features of the questions; structured features point to structured data, and unstructured features point to unstructured data. In Resource Discovery, we integrate structured data (a relational database) and unstructured data (web pages) to exploit the advantages of both kinds of data and reach the correct answers; this lets us find the best top fragments from the text of the relevant web pages. In the Relevant Answer part, we compute a matching score between the results from the structured and unstructured data and finally use QA templates to reformulate the questions.
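To make the final template step concrete, the following small Python sketch shows one hypothetical way template-based reformulation can work: a complex question is matched against hand-written patterns and split into simpler sub-questions. The patterns and the example question are illustrative assumptions; they are not the actual QA templates used in this thesis.

```python
# Toy, assumed example of template-based reformulation of a complex question
# into simpler sub-questions. Patterns and data are invented for illustration.

import re

# Each template pairs a question pattern with simpler reformulations.
QA_TEMPLATES = [
    (re.compile(r"what is (?P<x>.+) and why (?P<y>.+)\?", re.IGNORECASE),
     ["What is {x}?", "Why {y}?"]),
    (re.compile(r"who (?P<x>.+) and when (?P<y>.+)\?", re.IGNORECASE),
     ["Who {x}?", "When {y}?"]),
]

def reformulate(question):
    """Return simpler sub-questions if a template matches, else the question itself."""
    for pattern, forms in QA_TEMPLATES:
        match = pattern.match(question)
        if match:
            return [form.format(**match.groupdict()) for form in forms]
    return [question]

if __name__ == "__main__":
    complex_question = "What is the capital of Taiwan and why is it important?"
    print(reformulate(complex_question))
    # ['What is the capital of Taiwan?', 'Why is it important?']
```

Each sub-question produced this way could then be answered separately and the partial answers combined into a response to the original complex question.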
The experimental results show that finding structured and unstructured features, using both structured and unstructured data, and combining the IR approach with QA templates can improve the search results of complex questions.

Table of Contents:
Abstract I
Abstract II
Table of Content III
List of Figure V
List of Table VI
Chapter 1. Introduction 1
  1.1 Motivation 1
  1.2 The Considered Problem 3
    1.2.1 Question Analysis 3
    1.2.2 Resource Discovery and Reach The Relevant Answer 4
  1.3 Organization 4
Chapter 2. Related Works 5
  2.1 Question Analysis on Question Answering 5
    2.1.1 NLP Approach 5
    2.1.2 IR Approach 5
    2.1.3 Template-based QA 6
  2.2 Complex Question 6
  2.3 Structured Information to Improve Question Answering 7
  2.4 Structured Retrieval 8
  2.5 Integration Information 9
Chapter 3. Idea and Method 10
  3.1 Observation of Question and Assumptions 10
    3.1.1 Complex Question 10
    3.1.2 Question from Yahoo!Answer 11
  3.2 Idea 12
    3.2.1 A Bag of Words of Question Answering Result 12
    3.2.2 Finding Answer Using Structured and Unstructured Data 14
    3.2.3 Framework 15
    3.2.4 A Problem Definitions 16
  3.3 Question Analysis 17
    3.3.1 A Survey of Question 17
    3.3.2 Algorithms and Method of Question Analysis 22
  3.4 Resource Discovery 26
  3.5 Finding The Relevant Answer 29
Chapter 4. Experiment 33
  4.1 Experimental Setup 33
    4.1.1 Dataset 33
    4.1.2 Experiment Metrics 34
  4.2 Experimental Result 35
    4.2.1 Question Analysis 35
    4.2.2 Resource Discovery and The Relevant Answer 38
Chapter 5. Conclusions and Future Works 50
  5.1 Conclusion 50
  5.2 Future Works 50
References 52


Full-text availability: on campus, available immediately; off campus, available from 2009-08-21.