| Graduate student: | 林宣佑 Lin, Xuan-You |
|---|---|
| Thesis title: | 搜尋結果分類及重排類別順序的新方法 A New Method To Find And Rearrange Search Result Clusters |
| Advisor: | 李強 Lee, Chiang |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering |
| Year of publication: | 2011 |
| Academic year of graduation: | 99 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 61 |
| Keywords (Chinese): | 網頁搜尋, 分群, 點擊模型, 使用者行為 |
| Keywords (English): | Web search, clustering, click model, user behavior |
Classifying web search results into categories makes them easier for users to browse, and many approaches for classifying search results efficiently have been proposed. In this thesis, we propose a new search result clustering algorithm, named GFIC, that builds a topic hierarchy for the search results returned in response to a query. We compare GFIC empirically with state-of-the-art algorithms using measures such as $kSSL$ and the $F_\beta$ measure, and the results show that the proposed method is effective and efficient. Beyond clustering, we also propose a method to improve how clusters are laid out in the user interface.
We design a hierarchical click model to analyze click logs and observe users' browsing behavior, and use it to identify the more popular clusters among all the clustered pages. Based on the information obtained from the click model, we adjust the positions of the clusters in the user interface so that users can quickly find the clusters that interest them. Finally, we recruited several hundred online volunteers who frequently browse the web and collected the click logs of their browsing in our system. We use this dataset to train the click model, evaluate it with click model performance measures, and report the experimental results along with some interesting findings.
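The idea of moving popular clusters to more prominent positions can be illustrated with a minimal sketch. It is not the hierarchical click model of the thesis: it simply ranks clusters by a smoothed click-through rate computed from a toy log. The function name `rank_clusters_by_clicks`, the smoothing parameters, and the sample cluster labels are all assumptions made for illustration.

```python
from collections import Counter


def rank_clusters_by_clicks(impressions, clicks, prior_clicks=1.0, prior_views=2.0):
    """Order cluster labels by smoothed click-through rate, highest first.

    impressions: one cluster label per time the cluster was shown.
    clicks: one cluster label per time the cluster was clicked.
    prior_clicks / prior_views: hypothetical smoothing constants that keep
    rarely shown clusters from being ranked on very little evidence.
    """
    shown = Counter(impressions)
    clicked = Counter(clicks)

    def smoothed_ctr(label):
        return (clicked[label] + prior_clicks) / (shown[label] + prior_views)

    # Sort by descending click-through rate; break ties alphabetically
    # so the resulting layout is deterministic.
    return sorted(shown, key=lambda label: (-smoothed_ctr(label), label))


# Toy log (invented data): three clusters for one query, each shown twice.
log_impressions = [
    "jaguar car", "jaguar animal", "jaguar os",
    "jaguar car", "jaguar animal", "jaguar os",
]
log_clicks = ["jaguar animal", "jaguar animal", "jaguar car"]

order = rank_clusters_by_clicks(log_impressions, log_clicks)
print(order)
```

With this toy log, the cluster clicked most often per impression is listed first, which is the layout adjustment the abstract describes. A real click model would additionally account for position bias, since clusters shown higher on the page attract more clicks regardless of interest.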