| Graduate student: | 謝家洋 Hsieh, Chia-Yang |
|---|---|
| Thesis title: | 推特上基於擴散圖探勘之主題標籤分類 Diffusion Graph Mining for Hashtag Classification in Twitter |
| Advisor: | 高宏宇 Kao, Hung-Yu |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2013 |
| Academic year of graduation: | 101 |
| Language: | English |
| Number of pages: | 41 |
| Keywords (Chinese): | 資訊擴散、主題標籤、推特、分類 |
| Keywords (English): | Information diffusion, Hashtag, Twitter, Classification |
Twitter is a social networking and microblogging service that has become very popular in recent years. Users can include hashtags in a tweet to indicate its topic. Because anyone can create a new hashtag without restriction, Twitter contains a large number of hashtags of many different kinds, and some are so unusual that it is difficult to tell what they mean. Our goal is therefore to classify hashtags into specific topic categories. This helps users not only understand the meaning of a hashtag but also find topics they are interested in. To address this problem, we build an information diffusion graph for each hashtag from the interactions among users. Because hashtags are highly diverse, we explore features from two perspectives: hashtag content and graph topology. We then extract features from the retweet graph and the diffusion graph and use them for hashtag classification. Our results show that the characteristics of the diffusion graph are important and effective for hashtag classification. Moreover, the diffusion graph is better suited than the retweet graph to representing the characteristics of a hashtag.
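The abstract does not spell out how the diffusion graph is constructed or which topology features are used, so the following Python sketch is only an illustration under stated assumptions. It assumes an edge from user `u` to user `v` whenever `v` follows `u` and first used the hashtag after `u` did. The function names, the input layout, and the five features are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def build_diffusion_graph(posts, follows):
    """Build one hashtag's diffusion graph (assumed construction).

    posts:   iterable of (user, timestamp) pairs for tweets using the hashtag.
    follows: dict mapping a user to the set of users they follow.
    Returns (nodes, edges), where edges holds directed (source, target) pairs.
    """
    # Record the earliest time each user used the hashtag.
    first_use = {}
    for user, ts in posts:
        if user not in first_use or ts < first_use[user]:
            first_use[user] = ts

    # Assumption: u -> v if v follows u and u used the hashtag strictly earlier.
    edges = set()
    for v, tv in first_use.items():
        for u in follows.get(v, ()):
            if u in first_use and first_use[u] < tv:
                edges.add((u, v))
    return set(first_use), edges


def topology_features(nodes, edges):
    """Compute a few simple, illustrative graph topology features."""
    n, m = len(nodes), len(edges)
    out_degree = defaultdict(int)
    has_parent = set()
    for u, v in edges:
        out_degree[u] += 1
        has_parent.add(v)
    return {
        "nodes": n,
        "edges": m,
        # Fraction of possible directed edges that are present.
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "max_out_degree": max(out_degree.values(), default=0),
        # Users with no incoming edge, i.e. possible independent adopters.
        "roots": n - len(has_parent),
    }


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    posts = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("a", 5)]
    follows = {"b": {"a"}, "c": {"a", "b"}, "d": {"z"}}
    nodes, edges = build_diffusion_graph(posts, follows)
    print(topology_features(nodes, edges))
```

A feature dictionary like this could be combined with content features and passed to any standard classifier. A retweet graph could be summarized with the same `topology_features` function by supplying explicit retweet edges instead of inferred ones.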