| Graduate student: | 鍾明強 Chung, Min-Chiang |
|---|---|
| Thesis title: | 基於Ontology架構之文件分類網路服務研究與建構 A Study and Construction on Ontology-based Document Classification Web Service |
| Advisors: | 郭淑美 Guo, Shu-Mei; 李健興 Lee, Chang-Shing |
| Degree: | 碩士 Master |
| Department: | 電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| Year of publication: | 2004 |
| Academic year of graduation: | 92 (ROC calendar) |
| Language: | Chinese |
| Pages: | 65 |
| Chinese keywords: | 網路服務 、模糊推論 、文件分類 、實體論 |
| English keywords: | Fuzzy Inference, Document Classification, Ontology, Web Service |

Because an Ontology can describe the concepts and relationships of a specific knowledge domain, this thesis proposes an Ontology-based document classification method that helps knowledge managers classify electronic documents correctly. We first use a training procedure to compute weights for the concepts and the relationships in the Ontology, based on tf*idf values, and thereby obtain a weighted Ontology. When a news document needs to be classified, a document pre-processing procedure extracts its term set, which is then matched against the concepts of each category in the Ontology to obtain the mapping between the news document and the Ontology. Next, this mapping is transformed into a weighted digraph. A search algorithm finds the longest path and the maximum-weight path in the digraph; together with the number of mapped concepts represented by the nodes of the digraph, these produce three fuzzy variables: Mapped Concept Weight (MCW), Maximal Semantic Relation Path Length (MSRPL), and Maximal Semantic Relation Path Weight (MSRPW). Finally, the Fuzzy Inference Mechanism (FIM) uses these three fuzzy variables to evaluate the similarity between the document and the Ontology of each category, and the document is classified accordingly. Experimental results show that the proposed Web Service can effectively classify Chinese documents automatically.
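
The abstract does not give the exact formulas, so the following is only a minimal sketch of how the three fuzzy variables might be derived from a weighted digraph. It assumes that MCW is the sum of the weights of the mapped concepts, that path length is counted in edges, that path weight is the sum of relation weights along a path, and that the digraph is acyclic. The function name `fuzzy_variables` and the sample concepts and weights are hypothetical and are not taken from the thesis.

```python
from functools import lru_cache

# Hypothetical weighted Ontology fragment for one category.
# All concept names and weights below are made-up illustration values.
concept_weight = {"election": 0.9, "candidate": 0.7, "vote": 0.6, "party": 0.5}
relation_weight = {
    ("election", "candidate"): 0.8,
    ("candidate", "party"): 0.4,
    ("election", "vote"): 0.5,
}


def fuzzy_variables(terms, concept_weight, relation_weight):
    """Return (MCW, MSRPL, MSRPW) for a document's term set.

    Assumes the relations between mapped concepts form an acyclic digraph;
    a cycle would make the recursion below never terminate.
    """
    # Map the document's terms onto Ontology concepts.
    mapped = set(terms) & set(concept_weight)

    # Keep only relations whose two endpoints were both mapped.
    edges = {}
    for (src, dst), weight in relation_weight.items():
        if src in mapped and dst in mapped:
            edges.setdefault(src, []).append((dst, weight))

    @lru_cache(maxsize=None)
    def best_from(node):
        # (longest path length in edges, largest path weight) starting at node.
        best_len, best_weight = 0, 0.0
        for nxt, weight in edges.get(node, []):
            sub_len, sub_weight = best_from(nxt)
            best_len = max(best_len, 1 + sub_len)
            best_weight = max(best_weight, weight + sub_weight)
        return best_len, best_weight

    mcw = sum(concept_weight[c] for c in mapped)
    msrpl = max((best_from(c)[0] for c in mapped), default=0)
    msrpw = max((best_from(c)[1] for c in mapped), default=0.0)
    return mcw, msrpl, msrpw


# Example: "weather" is not a concept of this category, so it is ignored.
doc_terms = {"election", "candidate", "party", "weather"}
mcw, msrpl, msrpw = fuzzy_variables(doc_terms, concept_weight, relation_weight)
```

In a full classifier these three values would be computed once per category Ontology and passed to the fuzzy inference step; the membership functions and rules of that step are not described in the abstract, so the sketch stops here.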