
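Author: Hsiao, Zhi-Hui (蕭智暉)
Thesis title: Applying Soft-Computing Techniques to Develop a Semantic Search Agent on an Object-Oriented Ontology (應用軟式計算技術於物件導向式實體論來發展語意搜尋代理人)
Advisors: Kuo, Yau-Hwang (郭耀煌); Guo, Shu-Mei (郭淑美)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2003
Academic year of graduation: 91
Language: English
Number of pages: 94
Keywords (translated from Chinese): Semantic Search, Grammar Analysis, Genetic Algorithm, Fuzzy Inference, Domain Ontology, Extended Boolean Model
Keywords (English): Ontology, Extended Boolean Model, TSK model, Semantic Search Agent, Grammar Analysis, Genetic Algorithm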
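  • In today's era of information explosion, people easily get lost in massive amounts of information. Although advances in information retrieval have eased the burden, traditional methods increasingly fail to satisfy users. This thesis departs from the traditional approach of representing document concepts abstractly through keyword statistics, and instead infers the content of an article from the perspective of semantic understanding, with the aim of delivering search results that better match user needs. We first propose an object-oriented model for constructing a domain ontology, which serves as the knowledge base for semantic inference. We then use grammar rules to parse each article precisely and extract its central semantics. Together with a fuzzy inference engine built on the object-oriented ontology, the content is analyzed layer by layer and an index structure carrying syntactic properties is built. Finally, this index is combined with the Extended Boolean Model to provide a personalized retrieval mechanism. To tune the performance of the fuzzy inference engine, we apply a genetic algorithm together with an expert-constructed mechanism in order to strengthen the robustness of the system. For Chinese grammar processing, we use the CKIP tool provided by Academia Sinica for auxiliary tasks such as part-of-speech tagging.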

    Nowadays we can retrieve information through search engines, but traditional search engines cannot return results that are high in quality and small in quantity. This is because they rely mainly on plain keywords and statistical data to compute the similarity between query words and documents. In this thesis, we first propose an object-oriented ontology infrastructure that serves as the knowledge base of a fuzzy inference model. We then apply Chinese grammar rules for syntax processing, separating the different parts of each sentence, and pass the kernel of each sentence to the inference process, which gradually resolves the resulting instance against the domain ontology. At this stage we also collect various linguistic cues from the documents, such as passive voice, negation, and semantic degree, and store them in the index. Finally, we apply the Extended Boolean Model as the personalized ranking mechanism. In addition, we apply a Genetic Algorithm to tune the parameters of the TSK model, making the system more robust. We use CKIP as the part-of-speech tagging tool, which forms the basis of the Chinese grammar analysis.
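
The ranking step named in the abstract can be illustrated with the standard p-norm form of the Extended Boolean Model. The sketch below is only an illustration under stated assumptions: the function names, the example term weights, and the choice p = 2 are hypothetical and are not taken from the thesis, which may weight and combine terms differently.

```python
def pnorm_or(weights, p=2.0):
    """p-norm similarity of a document to an OR query.

    weights: the document's weights (each in [0, 1]) for the query terms.
    """
    if not weights:
        raise ValueError("need at least one term weight")
    return (sum(w ** p for w in weights) / len(weights)) ** (1.0 / p)


def pnorm_and(weights, p=2.0):
    """p-norm similarity of a document to an AND query.

    Measures how far the document is from the ideal point where
    every query term has weight 1.
    """
    if not weights:
        raise ValueError("need at least one term weight")
    distance = (sum((1.0 - w) ** p for w in weights) / len(weights)) ** (1.0 / p)
    return 1.0 - distance


# Hypothetical term weights of three documents for a two-term query.
docs = {"d1": [0.6, 0.8], "d2": [1.0, 0.0], "d3": [0.2, 0.3]}

# Rank documents for the AND form of the query, best match first.
ranking = sorted(docs, key=lambda d: pnorm_and(docs[d]), reverse=True)
print(ranking)
```

With p = 1 both forms reduce to the plain average of the term weights, and as p grows they approach strict Boolean behavior, which is why p is a natural knob for adjusting how strictly a query is interpreted.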

    Chapter 1 Introduction
      1.1 Overview of Information Retrieval
      1.2 Motivation
      1.3 Research Contributions
      1.4 Thesis Organization
    Chapter 2 Related Work and Background
      2.1 Information Retrieval Model
        2.1.1 Measuring Query Effectiveness (precision, recall)
        2.1.2 Term Weighting
        2.1.3 Applied IR Models
          2.1.3.1 Vector Space Retrieval Model
          2.1.3.2 Extended Boolean Model
        2.1.4 Additional Performance-Enhancing Mechanism
      2.2 Information Retrieval with Knowledge Base
        2.2.1 Taxonomy of Knowledge Base
        2.2.2 Ontology Language
        2.2.3 Semantic Web and Web Service
      2.3 Information Classification
        2.3.1 Feature Selection in Text
        2.3.2 Classified Mechanism
          2.3.2.1 Classification by Backpropagation
          2.3.2.2 Classification by Fuzzy Agent
    Chapter 3 Object-Oriented Ontology
      3.1 Procedures of Constructing Object-Oriented Ontology
      3.2 Description with DAML+OIL
    Chapter 4 Semantic Information Retrieval Mechanism
      4.1 Document Preprocessing
        4.1.1 Calculation of Similarity between two Chinese Terms
        4.1.2 Syntax Processing and Semantic Tree Constructing
        4.1.3 Concept Resolution by Fuzzy Inference Mechanism
          4.1.3.1 Instance Name Similarity
          4.1.3.2 Class Name Similarity
          4.1.3.3 Attribute-value Similarity
          4.1.3.4 Defuzzification by TSK model
        4.1.4 Resolving Instances and Actions in Gradual Steps
      4.2 Query Processing
      4.3 Results Ranking
        4.3.1 Ranking Mechanism
        4.3.2 Structural Analysis of Results
    Chapter 5 Genetic Tuning the parameters of TSK Model
      5.1 Problem Description
      5.2 Genetic Algorithm Mechanism
        5.2.1 Individual
        5.2.2 Initial Population
        5.2.3 Evaluation Function
        5.2.4 Genetic Operators
          5.2.4.1 Selection
          5.2.4.2 Crossover
          5.2.4.3 Mutation
          5.2.4.4 Replacement
    Chapter 6 Experimental Results and Analysis
      6.1 Experimental Domain and Related Data
      6.2 Experiment on Similarity between Two Chinese Terms
        6.3.1 Analysis of Convergence
        6.3.2 Analysis of Trained Parameters
        6.3.2 Performance with Trained Parameters
      6.3 Experiments on Syntax Processing and Index Constructing
      6.4 Experiments on Search Results
    Chapter 7 Conclusions and Future Work
      7.1 Conclusions
      7.2 Future work
    References
    Appendix
      Appendix A. The Part-of-Speech Tags and Their Corresponding Meaning of CKIP
      Appendix B. Examples of Constructing Index


    Full-text availability: on campus, available immediately; off campus, available from 2003-08-25