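Graduate student: Chen, Ying-Liang (陳盈良)
Thesis title: Finding Hard Questions by Knowledge Gap Analysis in Question Answer Communities (在問答社群網站中利用知識隔閡分析找出困難問題之研究)
Advisor: Kao, Hung-Yu (高宏宇)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar)
Language: English
Number of pages: 81
Keywords (translated from Chinese): knowledge forum, question difficulty, knowledge gap, link analysis, expert finding, social network
Keywords (English): CQA portal, Difficulty, Knowledge gap, Link analysis, Expert finding, Social network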
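  • Yahoo! Kimo Knowledge+ (奇摩知識+) is a typical Web 2.0 knowledge-sharing platform on which thousands of questions are posted and resolved every day. As the volume of data grows and many different kinds of users interact on the platform, quickly retrieving the knowledge a user wants has become an important research problem. In this thesis we raise the question "How can we tell whether a question is easy or hard?", with the aim of recommending easy and hard questions to users of different levels. Expert finding is an important subproblem here, because an expert's participation in a question affects the judgment of how difficult that question is. Expert finding has been studied extensively, but existing methods are not sufficient for identifying hard questions because they do not account for how users habitually ask and answer in Knowledge+. For example, an expert answers not only hard questions but also easy ones, a point that earlier studies overlooked. We observe a phenomenon in Knowledge+ that we call the knowledge gap, which is closely related to users' activity habits. To carry out expert finding and rank question difficulty, we incorporate this phenomenon into our knowledge-gap-based difficulty rank algorithm (KG-DRank). KG-DRank comes in two variants: the user-centered LKG-DRank (Local) and the category-centered GKG-DRank (Global). The experimental results show that LKG-DRank is an essential part of our method and that it performs 15% to 20% better than the baseline methods.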

    The Community Question Answer (CQA) service is a typical Web 2.0 forum for sharing knowledge among people. Thousands of questions are posted and solved every day. Because of this volume and the variety of users in a CQA service, question search and ranking are among the most important research topics for CQA portals. In this paper, we address the problem of detecting whether a question is easy or hard by means of a probability model. In addition, we observe a phenomenon called the knowledge gap, which is related to the habits of users, and we use a knowledge gap diagram to illustrate how large the knowledge gap is in different categories. For this task, we propose the knowledge-gap-based difficulty rank (KG-DRank) algorithm, which combines the user-user network with the architecture of the CQA service. The experimental results show that our approach outperforms the baseline approaches, improving the F-measure by 15% to 20%.
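    The abstract reports its gains in terms of the F-measure for easy and hard question prediction. As a reminder of what that metric measures, the following is a minimal sketch, assuming the balanced F1 form (the record does not state which F-measure variant the thesis uses). The function name `f_measure` and the toy labels are hypothetical and are not taken from the thesis.

```python
def f_measure(predicted, actual, positive="hard"):
    """Return (precision, recall, F1) for one class label.

    Assumption: balanced F1 (harmonic mean of precision and recall).
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy labels invented for illustration only; not data from the thesis.
actual = ["hard", "hard", "easy", "easy", "hard", "easy"]
predicted = ["hard", "easy", "easy", "hard", "hard", "easy"]

hard_scores = f_measure(predicted, actual, positive="hard")
easy_scores = f_measure(predicted, actual, positive="easy")
print("hard:", hard_scores)
print("easy:", easy_scores)
```

    Scoring each class separately mirrors the outline entry "F-measure of easy question and hard question prediction": a method can do well on one class and poorly on the other, which a single pooled score would hide.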

    Outline
    Chinese Abstract (中文摘要)
    ABSTRACT
    TABLE LISTING
    FIGURE LISTING
    1. Introduction (序章)
    2. Related Work (相關研究)
    3. The Knowledge Gap (知識隔閡)
    4. Problem Definition and Approach (問題定義和方法)
    5. Experiments (實驗)
    6. Conclusion (結論)
    1. INTRODUCTION
      1.1 Background and motivation
      1.2 Motivation
      1.3 The introduction of YA
      1.4 The difficulty in this task
      1.5 Method Abstract
      1.6 Application
      1.7 Paper Organization
    2. RELATED WORK
      2.1 Link Analysis
      2.2 Expert finding in social media
      2.3 Question Ranking
      2.4 Question Search
      2.5 Question Recommendation
    3. The knowledge gap
      3.1 The Knowledge Gap in YA
      3.2 The Quantification of Knowledge Gap Diagram
        3.2.1 The preliminary for quantifying the knowledge gap diagram
        3.2.2 The four zones in the knowledge gap diagram
        3.2.3 The quantification of non-expert and expert
        3.2.4 The weight for non-expert and expert
        3.2.5 The L-KGS
        3.2.6 The results of knowledge gap score (KGS)
      3.3 The Expert Finding By Knowledge Gap
    4. Problem definition and Our Approach
      4.1 Problem Definition
      4.2 The Flowchart of Our Approach
      4.3 Expertise Model
      4.4 Scores
      4.5 Difficulty Degree Model
        4.5.1 The part of asking
        4.5.2 The part of answering
      4.6 Reinforcement Model
    5. Experiments
      5.1 Dataset
        5.1.1 The basic statistics analysis
        5.1.2 The asking and answering of users in each category
        5.1.3 The number of answers in each question
        5.1.4 The level of users for each category
        5.1.5 The level of askers in each category
      5.2 Answer set
        5.2.1 The number of answers in a question for answer set
        5.2.2 The length of question
        5.2.3 The length of answer
      5.3 Baseline
      5.4 Evaluation metrics
      5.5 Parameter setting
        5.5.1 The results of parameter α
        5.5.2 The results of parameter β
        5.5.3 The results of parameter t
      5.6 Comparison to baseline methods
        5.6.1 F-measure of easy question and hard question prediction
        5.6.2 ROC curve and AUC
        5.6.3 The analysis of the expertise of users
        5.6.4 The examples of low-level experts
        5.6.5 The examples of the outputs compared with the basic approaches
    6. Conclusion and future works
    7. REFERENCE

    Full text availability: on campus, public from 2011-08-20; off campus, public from 2011-08-20