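Graduate student: | 梁翊群 Liang, Yu-Chon |
---|---|
Thesis title: | 主題模型於情感分析之研究 Sentiment Analysis with Topic Modeling |
Advisor: | 李昇暾 Li, Sheng-Tun |
Degree: | 碩士 Master |
Department: | 管理學院 - 資訊管理研究所 Institute of Information Management |
Year of publication: | 2013 |
Graduation academic year: | 101 (ROC calendar) |
Language: | English |
Number of pages: | 53 |
Keywords (Chinese): | 情感分析 (sentiment analysis), 意見探勘 (opinion mining), 主題模型 (topic model), 正規概念分析 (formal concept analysis) |
Keywords (English): | sentiment analysis, formal concept analysis, topic model, opinion mining |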
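In recent years, with the rapid growth of Web 2.0 and e-commerce, the flow of information has shifted from one-way to two-way. Users can publish their own experiences and opinions on the Internet, and after purchasing online they can also post reviews and share experiences through Web 2.0 tools. These online statements contain a large amount of opinion and emotion, and analyzing customer reviews reveals users' preferences and satisfaction. In business, because users' sentiment can have a major effect on both companies and potential consumers, user reviews need to be analyzed.
Sentiment analysis is a document analysis method that uses related techniques to automatically determine the sentiment orientation of a document and the feelings of its author. Among machine learning approaches to sentiment analysis, supervised learning predominates. Although supervised learning achieves good results, it has some practical drawbacks, in particular that high-quality labeled training data is hard to obtain.
This study builds on topic modeling and combines it with formal concept analysis as a clustering method to establish a new unsupervised approach to sentiment analysis. It first uses a topic model to uncover the latent topics of sentiment words, then builds a concept lattice from the relationships between reviews and topics, and uses that lattice to cluster reviews by sentiment. Compared with existing unsupervised methods, the study constructs an analysis method with better recognition performance and validates it on online review data.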
In recent years, Web 2.0 and e-commerce have flourished, and information that used to flow one way now flows both ways. Users can publish their own experiences and opinions on the Internet, and online consumers can use Web 2.0 tools after a purchase to comment and share their experiences for further discussion. The comments and experiences that users leave are full of opinions, emotions, and other information, and analyzing them makes it possible to gauge consumer satisfaction. In business, consumers' emotions can greatly influence both a company and its potential customers, so there is a need to analyze the reviews that users leave.
Sentiment analysis is a method of analyzing documents that uses related techniques to automatically determine a document's emotional orientation and its author's feelings. Existing sentiment analysis methods are based mainly on machine learning, in particular supervised machine learning. Although supervised machine learning produces good results, in practice it has some drawbacks, especially that high-quality labeled training data is hard to obtain.
This study builds on a topic model combined with formal concept analysis (FCA) as a clustering method to establish a novel unsupervised sentiment analysis method. It starts by using topic modeling to mine the latent sentiment topics of a review corpus, then establishes a concept lattice from the relationships between the reviews and the latent topics, and uses the lattice to conduct sentiment analysis on the reviews. Compared with existing unsupervised methods, this research establishes a model with better sentiment analysis performance, which we verify using review data.
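The step from review-topic relationships to a concept lattice can be illustrated with a small Python sketch. It is not the thesis's implementation: the abstract does not say how the relationships are binarized or how sentiment clusters are read from the lattice, so the `doc_topic` values, the `THRESHOLD` cut-off, and all function names below are assumptions made for illustration. The sketch turns hypothetical topic proportions into a binary formal context and enumerates its formal concepts, which are the nodes of the concept lattice.

```python
from itertools import combinations

# Hypothetical review-by-topic proportions, standing in for the output of a
# topic model. The values are invented for illustration only.
doc_topic = {
    "r1": [0.70, 0.20, 0.10],
    "r2": [0.60, 0.30, 0.10],
    "r3": [0.10, 0.15, 0.75],
    "r4": [0.05, 0.50, 0.45],
}

# Assumed cut-off: a review "has" a topic if its proportion reaches this value.
THRESHOLD = 0.25


def build_context(doc_topic, threshold):
    """Binarize topic proportions into a formal context: review -> set of topics."""
    return {
        doc: frozenset(t for t, p in enumerate(props) if p >= threshold)
        for doc, props in doc_topic.items()
    }


def extent(context, topics):
    """All reviews that have every topic in `topics`."""
    return frozenset(doc for doc, ts in context.items() if topics <= ts)


def intent(context, docs, all_topics):
    """All topics shared by every review in `docs`."""
    shared = all_topics
    for doc in docs:
        shared = shared & context[doc]
    return shared


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every topic subset.

    Exponential in the number of topics, so only suitable for tiny examples.
    Topics that no review has are ignored.
    """
    all_topics = frozenset().union(*context.values())
    concepts = set()
    for size in range(len(all_topics) + 1):
        for subset in combinations(sorted(all_topics), size):
            ext = extent(context, frozenset(subset))
            concepts.add((ext, intent(context, ext, all_topics)))
    return concepts


context = build_context(doc_topic, THRESHOLD)
concepts = formal_concepts(context)
for ext, itt in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```

Each printed pair groups the reviews that share exactly one set of topics, so the concepts act as overlapping clusters ordered by inclusion. Mapping those clusters to sentiment labels would need an additional rule that the abstract does not describe.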