
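Graduate student: 張礽川 (Chang, Jeng-Chuan)
Thesis title: 具時間特性之網頁瀏覽行為探勘與預測機制
Mining and Predicting User Navigation Patterns based on Web Temporality
Advisor: 曾新穆 (Tseng, Shin-Mu)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2004
Graduation academic year: 92 (ROC calendar)
Language: Chinese
Number of pages: 60
Chinese keywords: 時間性 (temporality), 規則變化 (rule changes), 瀏覽樣式 (navigation patterns), 資料探勘 (data mining), 網頁探勘 (web mining)
English keywords: Temporality, Rule Changes, Navigation Patterns, Web Mining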
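      With the spread of the Internet and the growth of e-commerce, predicting which page a user is likely to visit next has attracted considerable attention. Previous work on predicting user behavior has mostly mined users' navigation paths or clustered similar pages in order to build a model of user behavior. These studies, however, did not consider temporality, that is, the time at which a user enters the website. This thesis therefore takes temporality into account when mining a model of users' browsing behavior, and uses that model to predict their future navigation paths. To understand how behavior changes over time, we also propose three methods for measuring temporal change, and we use the measured change to assess whether a given temporal factor will improve prediction accuracy for the website as a whole. Finally, our experiments verify that the more pronounced or the less stable the temporal change is, the more the prediction accuracy improves when the corresponding temporal browsing model is used to predict user behavior.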

      With the rapid growth of the World Wide Web and the development of e-commerce, mining and predicting users' web browsing patterns has become a hot topic. Previous research in this field has focused on mining users' navigation patterns or clustering pageviews in order to model user behavior. However, none of it considers web log temporality, which we define as the start time of a user session. In this thesis, we take web temporality into account to construct a time-based user behavior model, from which user behavior can be predicted. In addition, we propose three methods for measuring changes in web temporality in order to evaluate the applicability of a temporality model. Our experiments show that prediction precision improves more when the changes in temporality in users' browsing behavior are more distinct.
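      As a rough illustration of the idea in the abstract, the sketch below trains a separate N-gram next-page model for each time bucket, keyed by session start time. It is a minimal sketch under stated assumptions, not the thesis's implementation: the `time_bucket` day/night split, the `TemporalNGram` class, and the page names are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


def time_bucket(start: datetime) -> str:
    """Assumed temporality: split sessions into day/night by start hour."""
    return "day" if 8 <= start.hour < 20 else "night"


class TemporalNGram:
    """Hypothetical per-time-bucket N-gram next-page predictor."""

    def __init__(self, n: int = 2):
        self.n = n  # number of preceding pages used as context
        # bucket -> context tuple -> Counter of pages seen next
        self.counts = defaultdict(lambda: defaultdict(Counter))

    def train(self, sessions):
        """sessions: iterable of (session_start, [page, page, ...])."""
        for start, pages in sessions:
            bucket = time_bucket(start)
            for i in range(len(pages) - self.n):
                context = tuple(pages[i:i + self.n])
                self.counts[bucket][context][pages[i + self.n]] += 1

    def predict(self, start, recent_pages):
        """Return the most frequent next page for this bucket, or None."""
        context = tuple(recent_pages[-self.n:])
        followers = self.counts[time_bucket(start)].get(context)
        if not followers:
            return None
        return followers.most_common(1)[0][0]
```

      With such a model, the same context (for example the pages "home" then "news") can yield different predictions depending on whether the session started during the day or at night, which is the kind of temporal difference the abstract argues can improve precision.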

    English Abstract
    Chinese Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Background
      1.2 Research Motivation
      1.3 Environment and Problem Description
      1.4 Research Method
      1.5 Research Contributions
      1.6 Thesis Organization
    Chapter 2 Literature Review
      2.1 The Web Mining Environment
        2.1.1 Data Sources
        2.1.2 Environmental Issues
        2.1.3 Analysis and Applications
      2.2 Prediction Mechanisms in Web Mining
        2.2.1 Prediction Mechanisms
        2.2.2 Association Rules
        2.2.3 Summary of Prediction Rules
        2.2.4 N-gram Models
        2.2.5 Markov Models
        2.2.6 Rule Selection
      2.3 Web Mining with Temporality
        2.3.1 Temporal Association Rules
        2.3.2 Temporal Visit Clustering
        2.3.3 Temporal Markov Models
        2.3.4 Temporal-Event Prediction
      2.4 Chapter Summary
    Chapter 3 Temporal Navigation Path Model and Mining Rule Changes
      3.1 Temporal N-gram Model
        3.1.1 Data Preprocessing
        3.1.2 Temporal N-gram Model
        3.1.3 Prediction Method
      3.2 Temporal Rule Changes
        3.2.1 Chi-Square Test of Homogeneity
        3.2.2 Fundamental Changes
        3.2.3 Fundamental Changes in Rule Support
        3.2.4 Fundamental Changes in Rule Confidence
        3.2.5 Prediction Changes of Rules
      3.3 Chapter Summary
    Chapter 4 Experimental Analysis
      4.1 Experiments on Real Data
        4.1.1 Source of Experimental Data
        4.1.2 Results and Analysis on Real Data
      4.2 Experiments on Synthetic Data
      4.3 Summary of Experiments
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Applications and Future Work
        5.2.1 Automated Temporal Models
        5.2.2 Prediction of Temporal Web Pages
    References


    Full-text availability: on campus, available immediately
    Off campus: available from 2004-08-24