Author: 陳力揚
Chen, Li-Yang
Thesis title: 基於使用者瀏覽動作及瀏覽目的以預測其關注之網頁區塊
Predicting Interesting Page Blocks Based on Surfing Actions and Surfing Goals of Users
Advisor: 盧文祥
Lu, Wen-Hsiang
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2011
Graduation academic year: 99 (ROC academic year)
Language: Chinese
Number of pages: 53
Keywords: User Goals (使用者目的), User Behavior (使用者行為), User Needs (使用者需求)
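  • With the rapid growth of the Internet, the information it holds has become vast and complex, and users often need several rounds of searching and browsing to find what they want. A single web page is also rich in content and usually offers several pieces of information or several different functions and services. Yet a user typically needs only a small part of a page at any given moment, and the surplus of irrelevant content makes browsing confusing and inconvenient. Previous research has focused mainly on finding information-rich or relatively important page blocks, which cannot meet users' needs in every situation.
    This study aims to predict the blocks users are likely to attend to, according to their different surfing goals. Observing users' browsing behavior, we found implicit goal transitions within a browsing session; for example, a user who wants to buy an ideal mobile phone moves from finding recommended models, to finding sellers, to ordering the phone. We also found that the content of the blocks a user attends to reflects the user's current surfing goal, and we classify blocks by content characteristics into four types: informational, navigational, transactional, and advertising. We therefore use the mapping between users' implicit surfing action transitions and block types to simulate browsing behavior and predict the blocks of likely interest. We further observed that on pages with highly disordered content the desired content is harder to find, so we hypothesize that users on such pages rely more on navigation and search functions to reach their target content. This study builds a user surfing behavior model divided into three sub-models: surfing action transition, block type mapping, and important block.
    The experimental results show that the model is fairly accurate at finding the important blocks within a block type, that surfing action transitions are predicted with roughly 50% to 60% accuracy, and that the model effectively helps predict the blocks users are likely to attend to. In future work we hope to analyze the search intent behind users' query terms, which this thesis does not consider, and combine it with surfing actions for further prediction; we also hope to improve the surfing action prediction model.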
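
The abstract's notion of a page with highly disordered content can be made concrete in several ways; this record does not give the thesis's own formula. The following is a minimal sketch that assumes disorder is measured as the Shannon entropy of the block-type distribution on a page. The function name `block_type_entropy` and the choice of base-2 logarithm are illustrative assumptions, not the author's definition.

```python
import math
from collections import Counter


def block_type_entropy(block_types):
    """Shannon entropy (in bits) of the block-type labels found on one page.

    Assumption for illustration: a page whose blocks are spread evenly over
    many types is treated as more "disordered" than a page dominated by one
    type. The thesis may define page disorder differently.
    """
    counts = Counter(block_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of p * log2(1/p) over the observed block types.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical pages, labelled with the four block types named in the abstract.
uniform_page = ["informational"] * 4
mixed_page = ["informational", "navigational", "transactional", "advertising"]
print(block_type_entropy(uniform_page), block_type_entropy(mixed_page))
```

Under this assumption, a page with a higher score would be one where navigational and search blocks are given more weight when predicting the blocks a user attends to.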

    With the rapid development of the Internet, the Web now contains a very large amount of information. Users therefore generally have to search and browse websites many times to get the information they want. Moreover, to provide abundant information or different services, a single web page may contain a large amount of content. A user often needs only a small part of a page, and the excess of useless information makes browsing inconvenient. Past studies have concentrated on finding relevant pages or important page blocks, but they do not cover all the situations users encounter.
    This study predicts the blocks a user is likely to be interested in according to the user's goals. Based on observation of users’ browsing behavior, we found that there are implicit goal transitions within surfing sessions. For example, a user who wants to buy an ideal cellphone will go through three steps: “finding the recommended phone”, “looking for sellers” and “ordering the specified phone”. We also found that the content of the blocks users focus on reflects their immediate surfing goals. We classify each block into one of four block types (informational, navigational, transactional, and advertising) based on the properties of its content. We then use users' implicit surfing action transitions and the block type mapping to build a user surfing behavior model and predict the blocks of interest. The proposed user surfing behavior model is composed of three sub-models: a surfing action transition model, a block type mapping model, and an important block detection model.
    The experimental results show that the important block detection sub-model performs well, that the precision of the surfing action transition sub-model is over 50%, and that the proposed user surfing behavior model effectively helps predict the blocks users are interested in. In the future, we plan to investigate the effectiveness of combining the search goals of query terms with surfing actions, and to improve the surfing action transition model.
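
The combination of a surfing action transition model and a block type mapping model can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes a first-order transition table estimated by counting, and a simple count-based mapping from actions to block types. The action labels (`find_model`, `find_seller`, `order`) are hypothetical names taken from the cellphone example, and all function names are assumptions.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count action -> next-action transitions over observed sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def fit_block_mapping(observations):
    """Count how often each action was observed with each attended block type."""
    counts = defaultdict(Counter)
    for action, block_type in observations:
        counts[action][block_type] += 1
    return counts


def predict_block_type(current_action, transitions, mapping):
    """Return the block type maximizing sum_a P(a | current) * P(type | a).

    Returns None when the current action has no observed successor or no
    successor has a known block-type distribution.
    """
    next_counts = transitions.get(current_action)
    if not next_counts:
        return None
    total_next = sum(next_counts.values())
    scores = Counter()
    for next_action, n in next_counts.items():
        type_counts = mapping.get(next_action)
        if not type_counts:
            continue
        total_types = sum(type_counts.values())
        for block_type, m in type_counts.items():
            scores[block_type] += (n / total_next) * (m / total_types)
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical training data for illustration only.
sessions = [
    ["find_model", "find_seller", "order"],
    ["find_model", "find_seller", "find_seller", "order"],
    ["find_model", "find_model", "find_seller"],
]
observations = [
    ("find_model", "informational"),
    ("find_model", "informational"),
    ("find_seller", "navigational"),
    ("find_seller", "navigational"),
    ("find_seller", "advertising"),
    ("order", "transactional"),
]
transitions = fit_transitions(sessions)
mapping = fit_block_mapping(observations)
print(predict_block_type("find_model", transitions, mapping))
```

In a fuller pipeline, the predicted block type would narrow the candidates, and an important block detection step would then rank the blocks of that type on the page.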

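    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Background and Problem
      1.2 Research Motivation and Observations
        Web Page Block Types
      1.3 Research Method and Contributions
      1.4 Thesis Organization
    Chapter 2 Related Work
      2.1 Finding Important Blocks within Web Pages
      2.2 User Search Goals
    Chapter 3 Analysis of the Positions of Users' Blocks of Interest
      3.1 Observations and Ideas
        User Surfing Behavior Model
        Page Content and Block Entropy
      3.2 System Architecture
      3.3 Feature Extraction and Identification
        3.3.1 Segmentation of Page Block Content
        3.3.2 Identification of Block Types and Surfing Action Types
          Surfing Action
          Block Type
      3.4 User Web Page Surfing Model
        3.4.1 General Model for Predicting User Goals
        3.4.2 Basic Surfing Patterns
        3.4.3 Surfing Action Transition Model
        3.4.4 Block Type Model
        3.4.5 Important Block Model
          Spatial Properties of Blocks
          Content Properties of Blocks
    Chapter 4 Experiments
      4.1 Experimental Data
      4.2 Evaluation Methods
        4.2.1 Evaluation of the Surfing Action Transition Model
        4.2.2 Evaluation of the Important Block Model
        4.2.3 Evaluation of the Overall Model
      4.3 Experimental Results
        4.3.1 Surfing Action Transition Model
        4.3.2 Important Block Model
        4.3.3 Overall Model
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
    References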

    Full-text access: on campus, public from 2012-09-08
    Off campus, public from 2013-09-08