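| Graduate student: | Chen, Ying-Liang (陳盈良) |
|---|---|
| Thesis title: | Finding Hard Questions by Knowledge Gap Analysis in Question Answer Communities (在問答社群網站中利用知識隔閡分析找出困難問題之研究) |
| Advisor: | Kao, Hung-Yu (高宏宇) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2010 |
| Academic year of graduation: | 98 |
| Language: | English |
| Number of pages: | 81 |
| Keywords (Chinese): | 知識論壇 (knowledge forum), 問題難度 (question difficulty), 知識隔閡 (knowledge gap), 連結分析 (link analysis), 專家的找尋 (expert finding), 社會網路 (social network) |
| Keywords (English): | CQA portal, Difficulty, Knowledge gap, Link analysis, Expert finding, Social network |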
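Yahoo! Kimo Knowledge+ (奇摩知識+) is a typical Web 2.0 knowledge-sharing platform on which thousands of questions are posted and resolved every day. As the volume of data grows and many different kinds of users interact on the platform, quickly finding the knowledge a user wants has become an important research problem. In this thesis we raise the question "How can we tell whether a question is easy or hard?", with the aim of recommending easy and hard questions to users of different levels. Expert finding is an important subproblem here, because an expert's participation in a question affects the judgment of that question's difficulty. Expert finding has been studied extensively, but existing work is not sufficient for identifying hard questions, because it does not consider users' asking and answering habits in Knowledge+. For example, an expert answers not only hard questions but also easy ones, a point that previous studies ignored. We observe a phenomenon in Knowledge+ that we call the knowledge gap, which is closely related to users' activity habits. To perform expert finding and to rank question difficulty, we incorporate this phenomenon into our knowledge-gap-based difficulty rank algorithm (KG-DRank). KG-DRank has two variants: the user-centered LKG-DRank (Local) and the category-centered GKG-DRank (Global). The experimental results show that LKG-DRank is essential to our method and that it outperforms the other baseline methods by 15% to 20%.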
The Community Question Answer (CQA) service is a typical Web 2.0 forum for sharing knowledge among people. Thousands of questions are posted and solved every day. Because of this volume and the variety of users in CQA services, question search and ranking are among the most important research topics for CQA portals. In this paper, we address the problem of detecting whether a question is easy or hard by means of a probability model. In addition, we observe a phenomenon called the knowledge gap, which is related to the habits of users, and we use a knowledge gap diagram to illustrate how large the knowledge gap is in different categories. For this task, we propose an approach called the knowledge-gap-based difficulty rank (KG-DRank) algorithm, which combines the user-user network and the architecture of the CQA service to solve this problem. The experimental results show that our approach performs better than the other baseline approaches and improves the F-measure by 15% to 20%.