| Graduate student: | 楊穗結 Yang, Suei-Jie |
|---|---|
| Thesis title: | 以論文關鍵字為基礎建立知識地圖 Construct Knowledge Map Based on Journal Keywords |
| Advisor: | 王惠嘉 Wang, Hei-Chia |
| Degree: | Master |
| Department: | College of Management - Department of Industrial and Information Management (on the job class) |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 |
| Language: | Chinese |
| Number of pages: | 49 |
| Chinese keywords: | 知識地圖、階層式主題架構、詞向量、共同資訊 |
| English keywords: | Knowledge map, hierarchical topic model, word2Vec, Mutual Information |
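With electronic publishing now widespread worldwide, many journal websites offer large collections of e-journals. Researchers entering an unfamiliar professional field usually search for relevant journal articles by keyword, then read those articles to understand what the keywords mean and to find methods related to them.
Keyword search is fairly precise, but researchers still have to read the articles themselves before the relationships among a field's keywords gradually become clear, and this takes considerable time and effort. If text mining methods could extract the relationships among the keywords in journal articles and present them visually, so that researchers can quickly grasp the overall picture of a field, reading time could be reduced substantially.
Prior studies indicate that a knowledge map can describe the relationships among the keywords of a domain's specialized knowledge. This study makes use of that property: it collects journal articles from a professional field, processes the text with text mining techniques, and then applies hierarchical latent tree analysis to build a tree-structured hierarchical knowledge map. The core idea of this algorithm is to use mutual information to obtain relatedness scores between keywords. Continuous Bag-of-Words (CBOW) and Skip-Gram resemble mutual information in that they also measure how words relate to one another, and many studies report that Skip-Gram achieves better accuracy. This study therefore builds one hierarchical latent tree with the original mutual information measure, builds a second one in which mutual information is replaced by Skip-Gram, and compares the correctness of the two trees.
Because a knowledge map is characterized by using visual graphics to help researchers understand how knowledge is distributed more intuitively and to find the knowledge they need quickly, this study uses Squarified Tree Map and Tree View to build a tree-structured hierarchical knowledge map, integrating visualization methods for knowledge maps.
To verify the correctness of the knowledge map, terms defined by experts are compared with the first level of the knowledge map to determine whether the map built with mutual information or the one built with Skip-Gram is closer to the experts' view. The more accurate hierarchical knowledge map is then used in an experiment in which researchers are invited to operate it hands-on, in order to compare the strengths and weaknesses of the hierarchical knowledge map against conventional search.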
Due to the popularization of e-services, many professional journal websites provide an abundance of e-journals. When researchers first enter an unfamiliar professional field, they usually search for journals by keyword and then use the journals' contents to gain further knowledge and refine their research direction.
Using keywords to find journals is an accurate method, but researchers still need to read the journals to understand how their keywords relate to one another, and thorough reading often takes a lot of time and energy. Therefore, if the relevance among journal keywords can be extracted through text mining methods and presented visually so that researchers quickly understand the whole picture of the professional field, reading time can be reduced greatly.
Many studies find that a knowledge map can describe the relationships between keywords in a knowledge domain. This research uses this feature: it collects journals in the professional field, processes the text data with text mining methods, and then uses Hierarchical Latent Tree Analysis (HLTA) to construct a hierarchical knowledge map in a tree structure. The core concept of this algorithm is to use Mutual Information to score the relations between keywords. Two other algorithms, CBOW (Continuous Bag Of Words) and Skip-Gram, are similar to Mutual Information in that both also calculate the relations between words, and many studies indicate that the Skip-Gram algorithm has better accuracy. Therefore, this study constructs two knowledge maps with HLTA, one using Mutual Information and one in which Mutual Information is replaced by Skip-Gram, and compares the accuracy of the two.
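As a rough illustration of the relatedness score mentioned above, the following Python sketch computes the mutual information between the presence of two keywords across a set of documents. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function name `mutual_information`, the representation of each document as a set of keywords, and the toy `docs` data are all hypothetical.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(docs, word_a, word_b):
    """Mutual information (in bits) between the presence of two keywords.

    Assumption for this sketch: each document is a set of keywords, and
    each keyword is a binary variable (present / absent in a document).
    """
    n = len(docs)
    # Count how often each (presence of a, presence of b) pair occurs.
    joint = Counter((word_a in d, word_b in d) for d in docs)
    mi = 0.0
    for a, b in product((True, False), repeat=2):
        p_ab = joint[(a, b)] / n
        if p_ab == 0:
            continue  # a zero-probability cell contributes nothing
        p_a = (joint[(a, True)] + joint[(a, False)]) / n
        p_b = (joint[(True, b)] + joint[(False, b)]) / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy corpus: four documents described by their keywords.
docs = [
    {"knowledge map", "text mining", "hlta"},
    {"knowledge map", "text mining"},
    {"word2vec", "hlta"},
    {"word2vec"},
]

# Keywords that always occur together score high.
print(mutual_information(docs, "knowledge map", "text mining"))
# Keywords that occur independently of each other score zero.
print(mutual_information(docs, "knowledge map", "hlta"))
```

Note that mutual information is symmetric and is also high for keywords that systematically avoid each other, so a high score signals dependence rather than co-occurrence alone.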
A knowledge map uses visual graphics to help researchers understand the distribution of knowledge more intuitively and find the knowledge they need quickly. Therefore, this study uses Squarified Tree Map and Tree View to construct a hierarchical knowledge map, integrating visualization methods for knowledge maps.
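To make the Tree View idea concrete, the sketch below renders a small topic hierarchy as indented text. It is only an illustration: the nested-dictionary layout, the function name `render_tree`, and the example topics are assumptions made here and are not taken from the thesis.

```python
def render_tree(node, depth=0):
    """Return indented text lines for a nested topic tree.

    Assumption for this sketch: a node is a dict with a "name" and an
    optional list of "children" nodes.
    """
    lines = ["  " * depth + node["name"]]
    for child in node.get("children", []):
        lines.extend(render_tree(child, depth + 1))
    return lines


# Hypothetical two-level hierarchy of keywords.
tree = {
    "name": "text mining",
    "children": [
        {"name": "topic model", "children": [{"name": "hlta"}]},
        {"name": "word embedding", "children": [{"name": "skip-gram"}]},
    ],
}

print("\n".join(render_tree(tree)))
```

A tree view of this kind shows the parent-child structure explicitly, whereas a treemap trades that explicit structure for a space-filling overview of the whole hierarchy.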
Finally, this study uses keywords defined by experts in the professional field to verify the accuracy of the hierarchical knowledge maps, and invites researchers to operate the more accurate map hands-on in order to compare the advantages and shortcomings of the hierarchical knowledge map with those of a traditional search interface.
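One way to picture the expert comparison is as a set overlap between the expert-defined terms and the terms on the first level of each map. The abstract does not state which measure the study uses, so the precision/recall scoring below, the function name `first_level_agreement`, and all example terms are assumptions made for illustration.

```python
def first_level_agreement(expert_terms, map_terms):
    """Return (precision, recall) of first-level map terms against expert terms.

    Assumption for this sketch: terms match when they are equal after
    lowercasing; the thesis may use a different matching rule or metric.
    """
    expert = {t.lower() for t in expert_terms}
    found = {t.lower() for t in map_terms}
    hits = expert & found
    precision = len(hits) / len(found) if found else 0.0
    recall = len(hits) / len(expert) if expert else 0.0
    return precision, recall


# Hypothetical expert terms and first-level terms of two candidate maps.
expert_terms = ["Ontology", "Text Mining", "Topic Model", "Visualization"]
map_a_level1 = ["ontology", "text mining", "database", "network"]
map_b_level1 = ["ontology", "text mining", "topic model", "network"]

print(first_level_agreement(expert_terms, map_a_level1))
print(first_level_agreement(expert_terms, map_b_level1))
```

Under this scoring, the map whose first level shares more terms with the expert list would be judged closer to the experts.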
On-campus access: available from 2025-06-19