
Graduate student: 高苑芳 (Kao, Yuan-Fang)
Thesis title: 基於插曲探勘與軟式計算技術之領域實體論自動萃取方法
Automatic Extraction of Domain Ontology from Document Set Based on Episode Mining and Soft-Computing Techniques
Advisor: 郭耀煌 (Kuo, Yau-Hwang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar)
Language: English
Number of pages: 124
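Keywords (Chinese): 實體論建構 (ontology construction), 中文自然語言處理 (Chinese natural language processing), 插曲探勘 (episode mining), 軟式計算 (soft computing)
Keywords (English): Soft Computing, Ontology Construction, Chinese Natural Language Processing, Episode Mining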
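  • Ontologies are becoming increasingly important in many information systems and in the Semantic Web, but constructing an ontology usually takes a great deal of time, and maintaining it afterwards is also time-consuming work for knowledge managers. In this thesis, we propose a method for constructing ontologies from Chinese documents based on episode mining and fuzzy inference. In addition, we represent the ontology with an object-oriented model and propose a four-layer object-oriented ontology structure, which is used to build domain knowledge. The proposed method comprises two parts: ontology construction and ontology learning. For ontology construction, a self-organizing map network is used to find the concepts and instances in the ontology, and episode mining and natural language processing are used to find the static attributes and dynamic operations of concepts and the associations between concepts. For ontology learning, a three-layer parallel fuzzy inference mechanism is used to obtain new instances, and new information is extracted to update the existing domain ontology. Experiments verify that this method can effectively assist knowledge managers in constructing and maintaining an ontology of Chinese news.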

    Ontologies are increasingly important in many information systems and in the Semantic Web. However, constructing an ontology is time-consuming, and ontology engineers must also spend considerable time maintaining it. In this thesis, we propose an episode-based fuzzy inference mechanism to extract a domain ontology from unstructured Chinese news documents. In addition, we propose a four-layer object-oriented ontology to structure domain knowledge. The approach comprises two tasks: domain ontology construction and domain ontology learning. For construction, the Self-Organizing Map (SOM) algorithm is adopted for concept clustering and for defining taxonomic relations, and the attributes and operations of concepts, together with the associations between concepts, are extracted based on episodes and morphological analysis. For learning, a three-layer parallel fuzzy inference mechanism is applied to obtain new instances, and new information is extracted to update the domain ontology. The experimental results show that our approach can effectively assist ontology engineers in constructing and maintaining a Chinese news domain ontology.
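
    The abstract does not spell out how episodes are extracted, so the following is only a minimal sketch of the general idea behind window-based episode counting, not the thesis's algorithm. It assumes the text has already been segmented into terms, treats an episode as an ordered pair of terms that co-occur inside a sliding window, and keeps the pairs seen in enough windows. The function name `frequent_serial_episodes`, the parameters `window` and `min_count`, and the sample token stream are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_serial_episodes(tokens, window=3, min_count=2):
    """Count ordered term pairs (a, b), a before b, inside a sliding window.

    Returns a dict mapping each pair to the number of windows that contain
    it, keeping only pairs found in at least `min_count` windows.
    Simplification (assumption): only full-width windows are considered.
    """
    counts = Counter()
    for start in range(max(1, len(tokens) - window + 1)):
        span = tokens[start:start + window]
        # combinations() preserves input order, so each pair is (earlier, later).
        # The set ensures a pair is counted at most once per window.
        counts.update(set(combinations(span, 2)))
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical, already-segmented token stream (illustrative only).
tokens = ["Brazil", "beat", "Germany", "Brazil", "beat", "Turkey"]
episodes = frequent_serial_episodes(tokens, window=3, min_count=2)
print(episodes)
```

    In this toy stream the pair ("Brazil", "beat") appears in three of the four windows, so it survives the threshold, while ("Brazil", "Turkey") appears in only one and is dropped. Frequent pairs of this kind are the sort of candidates from which attributes, operations, and associations could then be selected.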

    Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Motivation and Research Contribution
        1.2 Overview of Research on Ontology Construction
        1.3 Thesis Organization
    Chapter 2 Related Work
        2.1 Survey of Ontology
            2.1.1 What is Ontology?
            2.1.2 Why need Ontology?
            2.1.3 Application of Ontology
        2.2 Survey of Related Techniques Applied in Automatic Ontology Construction
            2.2.1 Concept Clustering
            2.2.2 Relation Extraction
        2.3 Survey of Ontology Learning
        2.4 Survey of Episode
    Chapter 3 Object-Oriented Ontology
        3.1 Structure of Object-Oriented Ontology
        3.2 Development of Object-Oriented Ontology
        3.3 Representation of Object-Oriented Ontology with UML Notation
    Chapter 4 Automatic Domain Ontology Construction
        4.1 Document Pre-processing
        4.2 Concept Clustering
            4.2.1 The Conceptual Similarity in POS Between Any Two Terms
            4.2.2 The Conceptual Similarity in Term-Vocabulary Between Any Two Terms
            4.2.3 The Conceptual Similarity in Term-Concept Between Any Two Terms
        4.3 Extraction of Episodes
            4.3.1 Algorithm of Extraction Episodes
        4.4 Extraction of Attributes, Operations and Associations
            4.4.1 Mapping Instances and Concepts
            4.4.2 Extraction of Attributes, Operations, and Associations of Concepts and Instances
                4.4.2.1 Morphological features of Chinese terms
    Chapter 5 Domain Ontology Learning
        5.1 Extraction of Candidate Instances
        5.2 Parallel Fuzzy Inference System
            5.2.1 Conceptual Resonance Strength Between a Concept and a New Instance
            5.2.2 Generation of Knowledge Base of PFIS by Genetic Learning
                5.2.2.1 Generation of data base by genetic learning
                5.2.2.2 Generation of rule base by data-driven method
            5.2.3 A Parallel Fuzzy Inference Mechanism for Conceptual Resonance Strength Computing
        5.3 Check and Update
    Chapter 6 Experimental Results and Analysis
        6.1 The Results of Automatic Domain Ontology Construction
            6.1.1 Document Pre-processing Analysis
            6.1.2 Concept Clustering Analysis
            6.1.3 Precision of Ontology Construction
        6.2 The Results of Domain Ontology Update
            6.2.1 Candidate Instances Extraction Analysis
            6.2.2 Parallel Fuzzy Inference Analysis
            6.2.3 Check and Update Process Analysis
    Chapter 7 Conclusions and Future work
        7.1 Conclusions
        7.2 Future work
    References
    Appendix
        Appendix A. Conceptual Structure in CKIP (Modified for 2002 FIFA World Cup Domain)
        Appendix B. Complete Results of Automatic Ontology Construction in 2002 FIFA World Cup Domain
        Appendix C. Complete Results of Automatic Ontology Construction in Typhoon Domain
        Appendix D. Results of Candidate Instances Extraction in 2002 FIFA World Cup Domain
        Appendix E. Results of Candidate Instances Extraction in Typhoon Domain
        Appendix F. Results of Domain Ontology Learning in 2002 FIFA World Cup Domain
        Appendix G. Results of Domain Ontology Learning in Typhoon Domain
    Biography


    Full-text availability: on campus, public from 2004-08-18
    off campus, public from 2004-08-18