Graduate student: Tsai, Ming-Yeh (蔡明曄)
Thesis title: Recommending topics in dialogue (聊天主題推薦系統)
Advisor: Lee, Chiang (李強)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: English
Number of pages: 53
Keywords: topic model, recommendation system, hashtags
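  • In recent years, a variety of online chat systems have been developed and adopted in daily life. However, no existing recommendation system suggests suitable conversation topics to both parties in a chat. This thesis therefore proposes a hot-topic recommendation system to address this problem. Based on the preferences of the user and the user's chat partner, together with recently popular short posts, the system recommends topics for the user and the chat partner to talk about. To this end, the thesis develops a new algorithm based on the well-known Latent Dirichlet Allocation (LDA) algorithm and uses it to analyze the short posts that the user and the chat partner have previously published on Twitter. In this way, the system can recommend topics that are both popular and of interest to the user's chat partner. Finally, experiments demonstrate the efficiency and effectiveness of the proposed algorithm.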

    In recent years, several kinds of online chat systems have been developed. However, no recommendation system exists for generating appropriate topics for users to bring up in dialogue. This paper proposes a hot-topic recommendation system to overcome this problem. The proposed system analyzes the tweets of the user, the user's chat partner, and similar users, as well as hashtags trending on Twitter, to recommend topics. The proposed system is based on the well-known Latent Dirichlet Allocation (LDA) algorithm. We compare the results of the proposed system with those of several commonly employed recommendation systems in a case study. The proposed system outperforms the other algorithms in terms of both efficiency and accuracy.

    Chinese Abstract
    Abstract
    Acknowledgements
    List of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Related Work
      2.1 Collaborative filtering
      2.2 Personal profiling
      2.3 Topic models
      2.4 Hashtags
    Chapter 3 Topic model
      3.1 Introduction to topic model
      3.2 Probabilistic latent semantic indexing (PLSI)
      3.3 Latent Dirichlet allocation (LDA)
    Chapter 4 Dataset Analysis and Problem Statement
      4.1 Introduction to dataset
      4.2 Dataset analysis
      4.3 Problem statement
    Chapter 5 Algorithm
      5.1 Preprocessing
      5.2 Modification of number of hashtags
        5.2.1 Number of times popular hashtags were used (Pop(i, h))
        5.2.2 Number of times hashtags of interest to chat partner were used (Int(h))
        5.2.3 Calculation of chat partner's interest level for hashtags (w_h)
        5.2.4 Finding base value b_h from popular hashtags
        5.2.5 Time attenuation function
        5.2.6 Fully worked example of modification of NUH to NMH
      5.3 Application of LDA algorithm for final topic recommendation results
      5.4 Example of topic recommendation algorithm
    Chapter 6 Performance Evaluation
      6.1 Evaluation metrics
      6.2 Set appropriate parameters for HT-LDA
        6.2.1 Setting an appropriate number of topics
        6.2.2 Setting an appropriate k
      6.3 Baseline comparisons
        6.3.1 Influence of the number of topics
        6.3.2 Influence of k
        6.3.3 Analysis of types of hashtags provided as recommendations
      6.4 Component analysis
    Chapter 7 Conclusions and future work
    References


    Full text availability: on campus, public from 2020-08-24
    Off campus, public from 2020-08-24