Author: 陳泠伃 (Chen, Ling-Yu)
Thesis title: 運用社群關係分析之主題式微網誌意見探勘方法
Theme-based Microblog Opinion Mining Approach by Social Relationship Analysis
Advisor: 郭耀煌 (Kuo, Yau-Hwang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2012
Academic year of graduation (ROC calendar): 100
Language: English
Number of pages: 62
Keywords (Chinese): 微網誌, 意見探勘, 情緒分類, 社群關係, 文字探勘
Keywords (English): microblog, opinion mining, sentiment classification, social relationship, text mining
  • This thesis proposes a novel theme-based microblog opinion mining approach that uses social relationships and post content to analyze user sentiment. Unlike traditional post-level approaches, we treat the user as the unit of analysis and study sentiment classification on a specific topic. For textual sentiment classification, the opinion orientation of each post is classified with a Naïve Bayes classifier or an SVM tool, and two Chinese sentiment lexical resources, HowNet and NTUSD, assist these supervised machine learning methods. In addition, we construct a social relationship graph to analyze user sentiment on microblogs. Depending on the social media platform, the corresponding type of “user-user relationship” is used to build the edges of an undirected or a directed social relationship graph. We further consider the influence exerted through social interactions between users and posts, called the “user-post relationship”, and we represent the sentimental influence between users by an “active degree”. The aim of this research is to leverage a user's neighbors to address the problem that posts on a specific topic are often too short and ambiguous to classify. The proposed approach has two main advantages: social relationships are easy to obtain from microblogs, and porting the approach to other social media platforms is straightforward. In our experiments, Plurk, a microblog popular in Taiwan, serves as the data source for user-level microblog opinion analysis. Our approach improves opinion mining for different applications.
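
    The idea of letting a user's neighbors compensate for short, ambiguous posts can be illustrated with a minimal Python sketch. It is not the thesis's algorithm (the table of contents points to relaxation labeling for that); it only shows, under simplifying assumptions, how per-post polarities might be averaged into a user score and then blended with a weighted mean of neighbor scores. The function names, the `alpha` blending parameter, the `rounds` count, and the edge weights (standing in loosely for an "active degree") are all hypothetical.

```python
from collections import defaultdict


def user_scores_from_posts(post_labels):
    """Average per-post polarities (+1 or -1) into one score per user.

    post_labels: iterable of (user, polarity) pairs, e.g. the output of
    some text classifier (assumed here, not implemented).
    """
    by_user = defaultdict(list)
    for user, polarity in post_labels:
        by_user[user].append(polarity)
    return {user: sum(vals) / len(vals) for user, vals in by_user.items()}


def smooth_with_neighbors(scores, edges, alpha=0.5, rounds=10):
    """Blend each user's text-based score with neighbor scores.

    scores: dict user -> text-based score in [-1, 1]
    edges:  dict user -> {neighbor: weight}; weights are hypothetical
            stand-ins for interaction strength
    alpha:  share of the final score kept from the user's own posts
    """
    current = dict(scores)
    for _ in range(rounds):
        updated = {}
        for user, own in scores.items():
            # Only neighbors that have a score can contribute.
            nbrs = {n: w for n, w in edges.get(user, {}).items() if n in current}
            total_weight = sum(nbrs.values())
            if total_weight == 0:
                updated[user] = own  # isolated user: keep the text score
                continue
            nbr_mean = sum(current[n] * w for n, w in nbrs.items()) / total_weight
            updated[user] = alpha * own + (1 - alpha) * nbr_mean
        current = updated
    return current
```

    In this sketch a user whose own posts cancel out (score 0) is pulled toward the polarity of the users they interact with most, which is the intuition the abstract describes; how the thesis actually prunes and weights edges is not reproduced here.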

    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Issues and Challenges
      1.3 Contributions
      1.4 Organization
    Chapter 2 Background and Related Work
      2.1 Opinion Mining and Sentiment Analysis
        2.1.1 Textual Sentiment Classification
        2.1.2 Unsupervised Machine Learning for Classification
        2.1.3 Opinion Mining on Microblog
      2.2 Microblog
        2.2.1 Characteristics of Microblog
        2.2.2 Comparison of Facebook and Plurk
    Chapter 3 A Novel Microblog Opinion Mining Approach
      3.1 Problem Formulation
      3.2 Data Collection
    Chapter 4 Textual Sentiment Classification
      4.1 SD Naïve Bayes Model
        4.1.1 Bernoulli and Multinomial Bayes
        4.1.2 Sentiment Dictionary
      4.2 Bag of Word SVM Model
      4.3 PU transformation
    Chapter 5 Social Relationship Analysis
      5.1 Social Relationship Graph
      5.2 Edge Pruning and Weighting
      5.3 Opinion Analysis
    Chapter 6 Experiments and Discussion
      6.1 Procedure and Design
        6.1.1 Data Collection
        6.1.2 Textual Sentiment Classification
        6.1.3 Social Relationships Analysis
      6.2 Discussion and Analysis
        6.2.1 Criterion of Selecting Related Topics
        6.2.2 Social Relationships Analysis
    Chapter 7 Conclusions and Future Work
    References
    Appendix-Relaxation Labeling


    Full-text availability: on campus from 2015-09-10; off campus from 2015-09-10