National Cheng Kung University Electronic Theses and Dissertations System


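Author:	陳彥勳 Chen, Yen-Shun
Thesis title:	使用WWW資源協助知識本體整合 Ontology Integration Based on World Wide Web Resources
Advisor:	王惠嘉 Wang, Hei-Chia
Degree:	Master
Department:	College of Management - Institute of Information Management
Year of publication:	2006
Academic year of graduation:	94 (ROC calendar, i.e. 2005-2006)
Language:	Chinese
Number of pages:	56
Keywords (Chinese):	網際網路 (Internet), 語意整合 (semantic integration), 網路探勘 (web mining), 資訊擷取 (information extraction), 知識本體相配 (ontology matching)
Keywords (English):	world wide web, ontology mapping, taxonomy, web mining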

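　　An ontology is a form of knowledge representation that models human concepts and the relationships among them. Its defining properties are that it is formal, shared, and reusable. These properties let ontologies communicate with one another automatically through computers, without human intervention, which is the ultimate goal of ontology applications.
　　However, organizations do not agree on construction standards before building their own ontologies, so it is fair to say that no two ontologies are alike. When the need for cross-organization knowledge sharing arises, the result is the problem of integrating ontologies built to different specifications. Since 2000, many researchers have proposed methods for matching different ontologies (ontology mapping). Among the resources these methods rely on, the domain thesaurus is an important aid: it handles synonymy, where the same concept is expressed with different words, and identifies semantically correct correspondences between concepts. Its drawback is that building a thesaurus takes a great deal of manual effort and time and updates are slow, so new terms and the vocabulary of emerging domains cannot be included promptly. As a result, ontology mapping often draws wrong inferences when it encounters new terms.
　　This study addresses the shortage of auxiliary information (information sparseness) during ontology mapping. We use information from the Internet, which is not only abundant but also continually updated, so new terms are unlikely to be missing. Noise, however, has always been a weakness of web content. Many filtering mechanisms have been developed in response. One notably effective method, which can also be applied to inferring relationships between terms, is "semantic web mining": it judges the relationship between terms by searching for sentences with particular structures. We adopt this idea and add analysis of URLs and links, aiming to find richer information in web pages to assist ontology integration. Because the document collection behind such a system is the vast and constantly changing Internet, the method is free of the limitations of thesauri and makes ontology integration more timely.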

　　An ontology is a knowledge representation model. It represents human concepts and the relationships between them. Because ontologies are formal, explicit, and shared, computers can communicate with each other automatically through them.
　　However, different organizations construct ontologies for their own use. As a result, many ontologies exist, but they follow different standards. When organizations need to share knowledge, they face the problem of integrating ontologies built to different standards. Since 2000, many studies have attempted to address the ontology integration problem. In these methods, thesauri are the main auxiliary information resource: they are used to detect synonyms, hypernyms, and hyponyms during the mapping process and thereby increase mapping precision. However, constructing a thesaurus is time-consuming and labor-intensive, which causes two problems. First, a new domain has no thesaurus. Second, new terms are not added to an existing thesaurus in time. Because of this bottleneck, the ontology mapping applications above are restricted to certain domains.
　　To solve the problem of insufficient auxiliary information, we use the web as our resource. To find precise concept relations on the web, we use site structure analysis and linguistic pattern mining to find cues to those relations. We then combine both kinds of cues to extract the hierarchical knowledge hidden in a site. This hierarchical knowledge can support the ontology mapping process and improve mapping precision.
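
　　As a rough illustration of the two kinds of cues described above, the following Python sketch pairs a single "X such as Y" sentence pattern with parent/child pairs read from URL paths. It is not the thesis's implementation: the pattern, the function names (pattern_cues, url_cues, combined_cues), the example.com URLs, and the choice to combine the cues by simple intersection are all assumptions made for this example.

```python
import re
from urllib.parse import urlparse

# Assumed separator between listed terms: ", ", ", and ", " and ", " or ".
SEP = r"(?:, (?:and |or )?| and | or )"
# Assumed linguistic pattern: "<hypernym> such as <term>, <term> and <term>".
SUCH_AS = re.compile(rf"(\w+) such as ((?:\w+{SEP}?)+)")


def pattern_cues(sentence):
    """Return (hyponym, hypernym) pairs suggested by a 'such as' sentence."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1).lower()
        for term in re.split(SEP, match.group(2)):
            if term:
                pairs.append((term.lower(), hypernym))
    return pairs


def url_cues(url):
    """Return (child, parent) pairs from consecutive URL path segments."""
    segments = [s.lower() for s in urlparse(url).path.split("/") if s]
    return list(zip(segments[1:], segments[:-1]))


def combined_cues(sentences, urls):
    """Keep only the pairs supported by both cue types (an assumed rule)."""
    text_pairs = {p for s in sentences for p in pattern_cues(s)}
    site_pairs = {p for u in urls for p in url_cues(u)}
    return text_pairs & site_pairs


sentences = ["We sell fruits such as apples, bananas and cherries."]
urls = [
    "http://example.com/fruits/apples",
    "http://example.com/fruits/pears",
]
print(combined_cues(sentences, urls))
```

　　In this toy input only the pair (apples, fruits) is supported both by the sentence and by the site structure. A real system would need many more patterns, multi-word terms, and a weighting scheme rather than a strict intersection.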

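Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1　Introduction
1.1　Research Background
1.2　Research Motivation and Objectives
1.3　Research Process
1.4　Research Scope and Limitations
1.5　Organization of the Thesis
Chapter 2　Literature Review
2.1　Ontology
2.2　Traditional Ontology Mapping Methods
2.3　Current Ontology Mapping Methods
Chapter 3　Research Method
3.1　Overview of the Method Architecture
3.2　Concept Tree
3.2.1　Site Selection
3.2.2　Site Tree Construction
3.2.3　Site Link Structure
3.2.4　Page Content Structure
3.2.5　Concept Attribute Collection
3.2.6　Concept Tree Construction
Chapter 4　Implementation and Validation
4.1　System Construction
4.2　Experimental Method
4.2.1　Data Sources
4.2.2　Evaluation Metrics
Chapter 5　Conclusion
References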
                                    


Made publicly available on 2006-09-13
