
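Graduate student: 鄧振樹
Teng, Chen-Shu
Thesis title: 建置企業產品知識主題地圖-以N公司為例 (Constructing an Enterprise Product Knowledge Topic Map: The Case of Company N)
Using Topic Map to Construct an Enterprise Product Knowledge Base
Advisor: 王惠嘉
Wang, Hei-Chia
Degree: Master
Department: College of Management - Department of Industrial and Information Management, In-Service Master's Program
Department of Industrial and Information Management (on the job class)
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: Chinese
Number of pages: 52
Chinese keywords: 資訊檢索 (information retrieval), 支援向量機 (support vector machine), 階層分類 (hierarchical classification), 主題地圖 (topic map)
English keywords: Information Retrieval, SVM, Hierarchical Classification, Topic Map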
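  • With advances in information technology and the spread of the Internet, people's lives have shifted from uniform to diverse, and the Internet offers ever more learning resources, so knowledge management at the information-network level has become all the more important. Data and information are linked by deeper relationships and tacit knowledge that cannot be clearly defined at first glance. A suitable knowledge management platform is therefore needed; within it, an ontology can give digital information a standardized knowledge structure that machines can understand and exchange, which supports knowledge management.
    Rapid technological progress also means that much information now takes the form of electronic documents, which accumulate quickly and in large volumes. As the amount of information grows, most people searching for relevant material can only query by keyword, and they still face large result sets that often take a long time to sift before the desired information is found. Classifying documents adds value by making them easier to manage and use, and classification improves the efficiency of learning and performance; document classification therefore plays a vital role in processing and organizing large volumes of news data. Today's documents, however, often have hierarchical relationships, yet they are usually classified with a flat approach, which ignores the parent-child and sibling relationships among categories.
    The same applies to company operations and management. Companies also rely on electronic documents, regularly publishing company announcements and product information over the Internet. Because electronic documents grow quickly while company information and product announcements are updated constantly, documents are usually classified by having staff read them manually. This is time-consuming, labor-intensive, and costly, so documents are not managed or used effectively, and this latent problem has persisted. This study therefore applies the concepts of information retrieval, hierarchical classification, and knowledge management to the operations of Company N. It classifies Company N's product news hierarchically and finds that weighting the news title at 30% and the body text at 70%, with a support vector machine (SVM) as the classifier, yields the best classification performance. Finally, a topic map links related topics to one another, so that users obtain more complete reference material when searching, the knowledge value of the products is realized, and a broader view for retrieval is provided.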
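
    The 30%/70% title/body weighting reported in the abstract can be illustrated with a minimal sketch. The record does not say how the thesis applies the weights, so this assumes a simple linear combination of relative term frequencies; the function names and the sample tokens are hypothetical, and the resulting feature dictionary would still have to be passed to an SVM implementation such as LIBSVM, which is not shown here.

```python
from collections import Counter

# Weights reported in the abstract: title 30%, body 70%.
TITLE_WEIGHT = 0.3
BODY_WEIGHT = 0.7


def term_frequencies(tokens):
    """Return the relative frequency of each term in a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()} if total else {}


def weighted_features(title_tokens, body_tokens,
                      title_weight=TITLE_WEIGHT, body_weight=BODY_WEIGHT):
    """Combine title and body term frequencies linearly (an assumed scheme)."""
    title_tf = term_frequencies(title_tokens)
    body_tf = term_frequencies(body_tokens)
    terms = set(title_tf) | set(body_tf)
    return {
        term: title_weight * title_tf.get(term, 0.0)
        + body_weight * body_tf.get(term, 0.0)
        for term in terms
    }


# Hypothetical product-news item, already tokenized.
features = weighted_features(
    ["new", "router"],
    ["router", "supports", "fast", "wifi"],
)
```

    In this sketch a term that appears in both the title and the body, such as "router", receives contributions from both parts, so the weights control how strongly title wording influences the classifier input.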

    Knowledge management for the Internet is a current trend. Although classification can filter out noisy information, it is not easy to define the relationships between data sets or to extract the tacit knowledge contained in the data. A knowledge management platform that provides understandable information should therefore be constructed. To this end, this thesis uses a topic map for knowledge management.
    With advances in information technology and the Internet, many documents are now presented in digital formats. Search engines let users search documents by keyword, but users may spend considerable time filtering noisy information out of the abundant results. An efficient classification mechanism is therefore needed to process and organize this information; classification can categorize and systematize electronic documents.
    Some enterprises regard the Internet as a good channel for publishing product information and news. However, classification is still performed manually and thus fails to capture the relationships between pieces of knowledge. This study applies the concepts of information retrieval, classification, and knowledge management to build a hierarchical classification using Support Vector Machines (SVM) and the topic map concept.
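
    As a rough illustration of the topic map concept mentioned in the abstract, the sketch below models the three standard topic map constructs: topics, associations between topics, and occurrences (pointers to resources). It is not the thesis's implementation; the class names, relation labels, product names, and URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # Occurrences: locators of resources (e.g. news URLs) about this topic.
    occurrences: list = field(default_factory=list)


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)        # name -> Topic
    associations: list = field(default_factory=list)  # (topic_a, relation, topic_b)

    def add_topic(self, name):
        """Return the topic with this name, creating it if needed."""
        return self.topics.setdefault(name, Topic(name))

    def add_occurrence(self, name, locator):
        """Attach a resource locator to a topic."""
        self.add_topic(name).occurrences.append(locator)

    def associate(self, a, relation, b):
        """Record a named association between two topics."""
        self.add_topic(a)
        self.add_topic(b)
        self.associations.append((a, relation, b))

    def related(self, name):
        """List (relation, other_topic) pairs that involve the given topic."""
        result = []
        for a, relation, b in self.associations:
            if a == name:
                result.append((relation, b))
            elif b == name:
                result.append((relation, a))
        return result


# Hypothetical product topics and one news resource.
tm = TopicMap()
tm.associate("Router X", "belongs-to", "Networking Products")
tm.associate("Router X", "related-to", "Switch Y")
tm.add_occurrence("Router X", "https://example.com/news/router-x")
```

    Calling `tm.related("Router X")` then lists the topics linked to that product, which is the kind of navigation between related topics that the abstract describes.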

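    1. Introduction
        1.1. Research Background
        1.2. Research Motivation and Objectives
        1.3. Research Process
        1.4. Research Scope and Limitations
        1.5. Thesis Organization
    2. Literature Review
        2.1. Natural Language Processing
        2.2. Information Retrieval
            2.2.1. Retrieval Type
            2.2.2. Retrieval Model
        2.3. Classification
            2.3.1. K-Nearest Neighbor
            2.3.2. Support Vector Machines
        2.4. Topic Map
            2.4.1. Topic Map Concepts
        2.5. Summary
    3. Research Method
        3.1. Research Framework
        3.2. Data Collection and Preprocessing
        3.3. Feature Selection
        3.4. Classification Method
        3.5. Topic Map Extraction Module
            3.5.1. Topic Extraction
            3.5.2. Association Extraction
            3.5.3. Resource Reference Extraction
        3.6. Summary
    4. Implementation and Validation
        4.1. System Implementation Design
        4.2. Experimental Method
            4.2.1. Data Sources
            4.2.2. Classification Methods Compared
            4.2.3. Evaluation Metrics
            4.2.4. Experimental Design
        4.3. Experimental Results
            4.3.1. Experiment 1
            4.3.2. Analysis of Basic Questionnaire Data
            4.3.3. Experiment 2
        4.4. System Screenshots
    5. Conclusions and Future Research Directions
        5.1. Research Results
        5.2. Future Research Directions
    References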

    Full-text availability: on campus, public from 2018-07-17
    Off campus: public from 2018-07-17