Graduate student: Wu, Tien-en (吳典恩)
Thesis title: 結合本體論以及關聯法則於查詢擴展之研究
Combine Ontology with Association Rules in Query Expansion Research
Advisor: Hsieh, Chung-chi (謝中奇)
Degree: Master
Department: College of Management - Institute of Information Management
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: Chinese
Number of pages: 57
Keywords (Chinese): 資訊擷取, 本體論, 關聯法則, 查詢擴展
Keywords (English): Information retrieval, Ontology, Query expansion, Association rules
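  • As network technology has matured and become widespread, the number of web pages has grown explosively. Users who want to find information quickly in this vast web must rely on search engines.
    A search engine uses information retrieval techniques to collect web pages and acts as an information provider; users only need to enter keywords to obtain the information they need.
    However, for documents about the same concept, page authors and users may use different words, so users fail to retrieve other pages in the search engine that describe the same concept.
    This problem stems from differences in word usage. The remedy is query expansion: the user's query is automatically expanded into more terms so that more complete information can be retrieved.

    The method proposed in this study combines an ontology with association rules to perform query expansion. It aims to reduce the word-usage mismatch, retrieve more web pages that describe the same concept, and meet users' needs.
    The method is centered on a constructed ontology.
    A web crawler collects the required web pages as the data set, and association rules among terms are then mined. The semantic relationships among terms in the ontology are combined with the association-rule relationships among terms as the basis for recommending terms.
    The result is a query expansion recommendation mechanism that assists users in querying.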

    With the development of the Internet, the number of web pages has
    grown rapidly. To find the information they need, users often depend
    on search engines. A search engine collects web pages from the
    Internet using information retrieval techniques and serves as an
    information provider to users. However, for web pages about the same
    concept, the words used by authors and by users may differ.
    This "word mismatch" problem prevents users from retrieving all web
    pages about the same concept. The solution is "query expansion" (QE):
    expanding users' queries with additional terms so that users obtain
    more complete information.

    This research proposes a method that combines an ontology with
    association rules to perform QE. It aims to reduce the word mismatch
    problem, retrieve more web pages about the same concept, and
    satisfy users' needs. The proposed method is based on an ontology
    and uses a spider to collect web pages as the data set. Once the
    spider has finished, association rules between words are mined.
    The result is a QE recommendation mechanism that combines the
    semantic relationships between words in the ontology with the
    association rules among words to help users formulate queries.
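
The abstract does not spell out how the two sources of evidence are combined, so the following is only a minimal sketch under stated assumptions, not the thesis's actual procedure. It mines single-term association rules (support and confidence) from a toy document set and then ranks expansion candidates, listing ontology neighbours first and rule-only terms after them. The data in `DOCS` and `ONTOLOGY`, the function names `mine_rules` and `recommend`, the thresholds, and the ranking order are all hypothetical choices made for illustration.

```python
from itertools import permutations

# Hypothetical toy data set: each document is the set of terms left after preprocessing.
DOCS = [
    {"car", "engine", "vehicle"},
    {"car", "automobile", "engine"},
    {"automobile", "vehicle", "dealer"},
    {"car", "automobile", "dealer"},
]

# Hypothetical ontology fragment: term -> semantically related terms.
ONTOLOGY = {
    "car": {"automobile", "vehicle"},
    "automobile": {"car", "vehicle"},
}


def mine_rules(docs, min_support=0.5, min_confidence=0.6):
    """Return {(a, b): confidence} for single-term rules a -> b.

    support(a -> b)    = fraction of documents containing both a and b
    confidence(a -> b) = documents containing a and b / documents containing a
    """
    n = len(docs)
    vocab = sorted(set().union(*docs))
    rules = {}
    for a, b in permutations(vocab, 2):
        count_a = sum(1 for d in docs if a in d)
        count_ab = sum(1 for d in docs if a in d and b in d)
        support = count_ab / n
        confidence = count_ab / count_a if count_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def recommend(term, ontology, rules):
    """Rank expansion candidates for a query term.

    Assumed ordering: ontology neighbours first, then terms reachable only
    through a mined rule; ties broken by rule confidence, then alphabetically.
    """
    semantic = ontology.get(term, set())
    candidates = {b: conf for (a, b), conf in rules.items() if a == term}
    for s in semantic:
        candidates.setdefault(s, 0.0)  # ontology neighbour with no supporting rule
    return sorted(candidates, key=lambda t: (t not in semantic, -candidates[t], t))


if __name__ == "__main__":
    rules = mine_rules(DOCS)
    print(recommend("car", ONTOLOGY, rules))
```

With this toy data, a query for `car` would be expanded with `automobile` and `vehicle` from the ontology and with `engine` from the mined rule `car -> engine`. A real system would need a much larger crawl and more careful thresholds.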

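    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Research Process
      1.4 Thesis Organization
      1.5 Research Scope and Limitations
    Chapter 2 Literature Review
      2.1 Information Retrieval
        2.1.1 Boolean Model
        2.1.2 Vector Model
        2.1.3 Probabilistic Model
      2.2 Ontology
      2.3 Knowledge Discovery
      2.4 Association Rules
        2.4.1 Definition of Association Rules and Related Terminology
        2.4.2 Computing Support and Confidence
        2.4.3 Certainty Factor
      2.5 Query Expansion
        2.5.1 Ontology-Based Query Expansion
        2.5.2 Association-Rule-Based Query Expansion
    Chapter 3 Research Method
      3.1 Constructing the Term Semantic Hierarchy
      3.2 Data Sources
        3.2.1 Data Collection
        3.2.2 Data Preprocessing
      3.3 Mining Association Rules
      3.4 Term Recommendation
      3.5 Query Procedure
    Chapter 4 System Implementation and Validation
      4.1 Constructing the Ontology
      4.2 System Architecture
      4.3 System Implementation and Development
        4.3.1 System Deployment
      4.4 System Operation
      4.5 Experimental Results
        4.5.1 Category 2 Terms
        4.5.2 Category 3 Terms
        4.5.3 Category 4 Terms
    Chapter 5 Conclusions and Suggestions
      5.1 Conclusions
      5.2 Suggestions
    Appendix: Supplementary Tables
    References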

    Full-text availability: on campus, immediately; off campus, from 2007-06-27