| Graduate student: | 賴良承 Lai, Liang-Cheng |
|---|---|
| Thesis title: | 在社群問答網站中結合專業度及行動力之問題導向方法 Question Routing by Modeling User Expertise and Activity in cQA services |
| Advisor: | 高宏宇 Kao, Hung-Yu |
| Degree: | 碩士 Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2011 |
| Academic year of graduation: | 99 |
| Language: | English |
| Number of pages: | 73 |
| Keywords (Chinese): | 社群問答網站, 問題導向, 專業度模型建構, 行動力模型建構 |
| Keywords (English): | CQA portal, Question routing, Expertise modeling, Activity modeling |
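In recent years, community question answering sites have accumulated a large amount of question and answer data. Sites such as Yahoo! Answers and Stack Overflow let users ask questions according to their needs and answer questions posted by others. However, as the volume of questions keeps growing, there is no guarantee that a user's question will be answered or resolved quickly by other users. This study addresses the problem of efficient question routing: we route each question to users who are both capable of answering it and able to respond in a timely manner, so that the question is resolved effectively and quickly. We believe that solving this problem improves the performance of community question answering sites and better meets users' needs. The difficulty is that traditional expert-ranking methods cannot be applied directly to this problem. We propose a method that computes a user's expertise and activity from the user's past answering records. For expertise modeling, we first measure expertise from the content of the questions the user has previously answered, and then strengthen the estimate using the interactions among users on the site. For activity modeling, we analyze the records and curves of users' past activity on the site in order to predict their future activity. By modeling both expertise and activity, we can route questions to capable users so that questions receive answers quickly. We run our experimental analysis on data from a real community question answering site and confirm that our method outperforms the other methods. Under the MRR metric, the content-based approach achieves 0.0999 while our method reaches 0.1372, an improvement of about 37%. On average, a new question routed by our method to the top 7 ranked users receives at least one answer.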
Community Question Answering (cQA) sites such as Yahoo! Answers and Stack Overflow have emerged as a new type of community portal that allows users to answer questions asked by other people. The cQA archives have accumulated a huge volume of questions and answers. Because the number of questions keeps increasing, many questions are still waiting to be solved or answered by others. In this thesis, we address the problem of efficient question routing. We present a new approach that combines user expertise and user activity to solve this problem. First, we model a user's expertise from the content of the questions the user has answered in the past, and then we enhance the expertise estimate with social network characteristics of the cQA portal. Second, we model and predict a user's activity by analyzing the distribution of the user's previous answering records. Experiments conducted on real cQA data from Stack Overflow show that our approach performs significantly better than the baseline approaches. In terms of the evaluation metric, Mean Reciprocal Rank (MRR), the content-based approach scores 0.0999 and our method scores 0.1372, a 37.34% improvement over the traditional content-based method. On average, each of the 6,160 test questions gets at least one answer if it is routed to the top 7 users ranked by our approach.
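To make the reported figures easier to check, here is a minimal Python sketch of how MRR and the relative improvement can be computed. It assumes the standard definition of MRR (the reciprocal of the rank of the first ranked user who actually answered the question, averaged over all test questions). The function names and the toy rankings are hypothetical and are not taken from the thesis; only the two MRR values, 0.0999 and 0.1372, come from the abstract.

```python
from typing import AbstractSet, Sequence


def reciprocal_rank(ranked_users: Sequence[str], answerers: AbstractSet[str]) -> float:
    """Return 1/rank of the first ranked user who actually answered, else 0.0."""
    for rank, user in enumerate(ranked_users, start=1):
        if user in answerers:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[str]],
    answerer_sets: Sequence[AbstractSet[str]],
) -> float:
    """Average the reciprocal rank over all test questions."""
    scores = [reciprocal_rank(r, a) for r, a in zip(rankings, answerer_sets)]
    return sum(scores) / len(scores) if scores else 0.0


def relative_improvement(baseline: float, ours: float) -> float:
    """Relative gain of `ours` over `baseline`, as a fraction."""
    return (ours - baseline) / baseline


# Toy data (hypothetical): three questions, each with a ranked list of
# candidate users and the set of users who really answered.
rankings = [["u1", "u2", "u3"], ["u2", "u3", "u1"], ["u3", "u1", "u2"]]
answerers = [{"u2"}, {"u1", "u2"}, {"u9"}]
toy_mrr = mean_reciprocal_rank(rankings, answerers)  # (1/2 + 1 + 0) / 3

# MRR values reported in the abstract.
improvement = relative_improvement(0.0999, 0.1372)
print(f"toy MRR = {toy_mrr:.4f}, improvement = {improvement:.2%}")
```

With the abstract's numbers, (0.1372 - 0.0999) / 0.0999 is about 0.3734, which matches the stated 37.34% improvement.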