National Cheng Kung University Electronic Theses and Dissertations System

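Graduate student:	陳明彥 Chen, Ming-Yen
Thesis title:	語意感知之多意圖資訊檢索機制研發 Research and Development of a Semantic-Aware Mechanism for Multipurpose Information Retrieval
Advisor:	陳裕民 Chen, Yuh-Min
Degree:	Doctor
Department:	College of Electrical Engineering and Computer Science - Institute of Manufacturing Engineering
Year of publication:	2009
Academic year of graduation:	97 (ROC calendar)
Language:	English
Pages:	180
Keywords (translated from Chinese):	multipurpose information retrieval, information retrieval, semantic awareness
Keywords (English):	semantic aware, multipurpose information retrieval, information retrieval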

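The arrival of the knowledge economy has made knowledge the most important asset of individuals and organizations and a key factor in enterprise competitiveness. Information is a carrier of knowledge and embodies the knowledge that people wish to communicate and pass on to others, so an efficient information retrieval mechanism can achieve the goals of knowledge sharing and reuse. Most existing information retrieval mechanisms are keyword-based and find the content a searcher needs by matching keywords. Keyword retrieval is easy to implement and use, but it cannot fully represent the semantic features of the query and the content, and therefore tends to produce retrieval errors.
In text-based content, the intentions and concepts an author wishes to express are presented, through combinations of words and logic, as semantics that humans can understand. Retrieving knowledge content on a semantic basis can therefore improve the transparency and visibility of that content and guide its authors and users toward seamless, semantics-based communication and interaction, so that knowledge content reaches users correctly and quickly.
This study proposes a semantic-aware mechanism for multipurpose information retrieval. Through the processing, recognition, extraction, extension and matching of content semantics, it achieves the following objectives: (1) to analyze and identify the semantic features of information content; (2) to develop a semantic pattern that represents those semantic features and makes them structured and concrete; (3) to design a multipurpose information retrieval mechanism that offers different retrieval modes according to the type of user and their needs. The proposed mechanism improves on traditional keyword-based retrieval, allowing users to obtain the information they need through semantic-aware query and retrieval and thereby improving the sharing and reuse of information content.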

In recent years, knowledge has become the most important asset of individuals as well as organizations, and it also determines the competitiveness of an enterprise. Information content is a carrier of knowledge: it embodies the knowledge that people wish to communicate to others. Effective retrieval of information content can therefore achieve the goal and value of knowledge sharing and reuse. Existing information retrieval systems are mostly keyword-based and retrieve relevant content by matching keywords. Keyword-based search is convenient and easy to use, but it fails to represent the complete semantics contained in the query and the content, which leads to retrieval failures.
In textual content, the author's intention is represented semantically through combinations of word-to-word relations that are comprehensible to human beings. Accordingly, retrieving information content with a semantic approach can effectively improve the transparency and visibility of the content and guide both the content creator and the content user to engage in seamless, semantics-based communication and interaction.
This study developed a semantic-aware mechanism for multipurpose information retrieval that handles the processing, recognition, extraction, extension and matching of content semantics to achieve the following objectives: (1) to analyze and determine the semantic features of information content; (2) to develop a semantic pattern that represents the semantic features of the content in a structured and concrete form; (3) to design a multipurpose information retrieval model that provides the most appropriate retrieval method for different types of users depending on their needs. The mechanism mitigates the limitations of traditional keyword search and enables the user to perform semantic-aware query and search for the required information, thereby improving the reuse and sharing of information content.
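The contrast the abstract draws between literal keyword matching and retrieval that extends the query's semantics can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus `DOCS`, the hand-written `RELATED` table and both function names are hypothetical and do not come from the thesis. According to its table of contents, the thesis extends queries through singular value decomposition and latent semantic selection, which this sketch does not attempt to reproduce.

```python
from typing import Dict, List, Set

# Hypothetical toy corpus and related-term table, invented for illustration only.
DOCS: Dict[str, str] = {
    "d1": "the automobile engine needs regular maintenance",
    "d2": "knowledge sharing improves enterprise competitiveness",
}
RELATED: Dict[str, Set[str]] = {"car": {"automobile", "vehicle"}}


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and split it on whitespace into a set of terms."""
    return set(text.lower().split())


def keyword_search(query: str, docs: Dict[str, str]) -> List[str]:
    """Return ids of documents sharing at least one literal term with the query."""
    terms = tokenize(query)
    return sorted(doc_id for doc_id, text in docs.items() if terms & tokenize(text))


def expanded_search(
    query: str, docs: Dict[str, str], related: Dict[str, Set[str]]
) -> List[str]:
    """Add related terms to the query first, then match as keyword_search does."""
    terms = tokenize(query)
    for term in list(terms):
        terms |= related.get(term, set())
    return sorted(doc_id for doc_id, text in docs.items() if terms & tokenize(text))


if __name__ == "__main__":
    print(keyword_search("car repair", DOCS))
    print(expanded_search("car repair", DOCS, RELATED))
```

With the query "car repair", the literal match finds nothing because no document contains either word, while the expanded query reaches document `d1` through the related term "automobile". A hand-written table is used here only to keep the sketch self-contained.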

CHINESE ABSTRACT
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1.	INTRODUCTION
1 	Background
2 	Motivation
3 	Objective
4 	Research Framework
CHAPTER 2.	LITERATURE REVIEW
1 	Domain Investigation
1.1 	Boolean Model
1.2 	Vector Space Model
1.3 	Probability Model
2 	Technologies Investigation
2.1 	Data Mining
2.2 	Latent Semantic Analysis
2.3 	Support Vector Machine
2.4 	Concept Map
2.5 	Constrained Spreading Activation Model
3 	Summary
CHAPTER 3.	DESIGN OF SEMANTIC-AWARE MECHANISM FUNCTIONAL FRAMEWORK FOR MULTIPURPOSE INFORMATION RETRIEVAL
1 	Information Retrieval Process and Information Retrieval System
2 	Semantic Issues of Information Content Retrieval
3 	The Conceptual Model of Semantic Aware Mechanism for Multipurpose Information Retrieval
4 	The Functional Framework of Semantic Aware Mechanism for Multipurpose Information Retrieval
5 	Summary
CHAPTER 4.	INFORMATION CONTENT SEMANTIC RECOGNITION AND REPRESENTATION
1 	Content Semantic Recognition and Representation Process
2 	Development of The Content Semantic Recognition and Representation Functional Model
3 	Content Preprocess Module
3.1 	Content Parsing
3.2 	Content Filtering
3.3 	Part-of-Speech Analysis
4 	Content Abstraction Module
4.1 	Concept Determination
4.2 	Concept and Sentence Weighting
4.3 	Content Extraction and Redundancy Eliminating
5 	Content Semantic Recognition and Annotation Module
5.1 	Semantic Mining
5.2 	Semantic Identification and Semantic Space Construction
5.3 	Semantic Pattern Generation and Transcoding
6 	Summary
CHAPTER 5.	QUERY-BASED INFORMATION RETRIEVAL
1 	Query-based Information Retrieval Process
2 	Development of The Query-based Information Retrieval Functional Model
3 	Semantic Determination and Extraction
4 	Query Content Semantic Extension
4.1 	Semantic Matrix Construction
4.2 	Singular Value Decomposition
4.3 	Semantic Matrix Dimensionality Reduction
4.4 	Latent Semantic Selection
5 	Semantic Pattern Clustering and Matching
5.1 	Content Semantic Pattern Pre-clustering
5.2 	Optimal Hyper-plane Separate
5.3 	Support Vector Generation
5.4 	Semantic Pattern Matching and Ranking
6 	Summary
CHAPTER 6.	CONTENT-BASED INFORMATION RETRIEVAL
1 	Content-based Information Retrieval Process
2 	Development of The Content-based Information Retrieval Functional Model
3 	Content Semantic Determination and Extraction
4 	Content Map Construction
4.1 	Content Map Construction Procedure
4.2 	Content Map Construction Module
5 	Content Mapping
5.1 	Content Mapping Procedure
5.2 	Content Mapping Module
6 	Summary
CHAPTER 7.	CONCEPT-BASED INFORMATION RETRIEVAL
1 	Concept-based Information Retrieval Process
2 	Development of The Concept-based Information Retrieval Functional Model
3 	Hybrid Concept Map Construction Module
3.1 	Document Preprocess
3.2 	Concept Map Construction
4 	Concept Map Navigation Module
4.1 	Concept Spreading
4.2 	Constrained Spreading Judgment
5 	Question-based Concept Exploration Module
5.1 	Concept Matching
5.2 	Sentence Generation
5.3 	Document Retrieval
6 	Summary
CHAPTER 8.	EXPERIMENT AND RESULT ANALYSIS OF SEMANTIC-AWARENESS MULTIPURPOSE INFORMATION RETRIEVAL
1 	Mechanism Implement Environment
2 	Experiment and Result Analysis
2.1 	Content Semantic Recognition and Representation
2.2 	Query-based Information Retrieval
2.3 	Content-based Information Retrieval
CHAPTER 9.	CONCLUSIONS

REFERENCES

On campus: publicly available from 2014-07-27
Off campus: publicly available from 2014-07-27
