| Graduate student: | 李佩書 Li, Pei-Shu |
|---|---|
| Thesis title: | 針對推特上產品特徵聚合之漸進式語意強化 Progressive Semantic Reinforcement for Product Feature Grouping in Twitter |
| Advisor: | 高宏宇 Kao, Hung-Yu |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering |
| Year of publication: | 2016 |
| Academic year of graduation: | 105 |
| Language: | English |
| Number of pages: | 62 |
| Keywords: | Feature-based Opinion Mining, Product Feature Grouping, Semantic Analysis, Twitter |
In recent years, the spread of smartphones and the popularity of social networking services have led people to post on social networks more often than on traditional blogs, using these services to share their lives and opinions quickly. Social networking content is therefore a valuable source for mining users' opinions about product features. However, these "quick" posts are usually very short and lack complete grammatical and syntactic structure; they are also full of abbreviations, irregular expressions, emoticons, tags, and URLs. In addition, product features are expressed in more diverse ways. As a result, information sparsity makes semantic analysis inaccurate, which in turn makes product feature grouping more difficult.
To address these problems, we propose a progressive semantic reinforcement approach for product feature grouping in Twitter. We add literal characteristics to compensate for semantic characteristics that are imprecise at the outset. During grouping, feedback from the current state of the sub-concept groups is used to integrate related information step by step, reinforcing the semantic characteristics and thereby improving grouping accuracy. Our experimental results show that the approach outperforms variants that omit either of these two components, as well as a variant that uses semantic characteristics alone. These results indicate that adding other characteristics and integrating information progressively helps in understanding semantics when data are sparse.
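The general idea of combining literal and semantic evidence and letting merged groups feed back into later decisions can be illustrated with a minimal sketch. This is not the thesis's algorithm, which the abstract does not specify in detail. It assumes character-bigram Jaccard overlap as the literal characteristic, cosine similarity over bag-of-words context counts as the semantic characteristic, and a greedy agglomerative merge. The names `literal_sim`, `cosine`, `group_features`, `alpha`, and `threshold`, as well as the toy data, are hypothetical.

```python
from collections import Counter
from math import sqrt


def bigrams(s):
    """Set of character bigrams of a lower-cased string."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def literal_sim(a, b):
    """Jaccard overlap of character bigrams (assumed literal feature)."""
    ga, gb = bigrams(a), bigrams(b)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity of two context-word count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def group_features(contexts, alpha=0.5, threshold=0.4):
    """Greedily merge feature expressions into groups.

    contexts maps each feature expression to a Counter of context words.
    alpha weights literal vs. semantic similarity (assumed parameter).
    """
    groups = [({term}, Counter(ctx)) for term, ctx in contexts.items()]
    while len(groups) > 1:
        best, pair = threshold, None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                lit = max(literal_sim(a, b)
                          for a in groups[i][0] for b in groups[j][0])
                sem = cosine(groups[i][1], groups[j][1])
                score = alpha * lit + (1 - alpha) * sem
                if score > best:
                    best, pair = score, (i, j)
        if pair is None:
            break  # no remaining pair is similar enough
        i, j = pair
        # Reinforcement step: the merged group pools the context counts of
        # its members, so later comparisons use a less sparse vector.
        merged = (groups[i][0] | groups[j][0], groups[i][1] + groups[j][1])
        groups = [g for k, g in enumerate(groups) if k not in pair] + [merged]
    return [terms for terms, _ in groups]


# Toy data, invented for illustration only.
demo_contexts = {
    "battery": Counter({"charge": 1, "life": 1, "drain": 1}),
    "battery life": Counter({"life": 1, "drain": 1}),
    "batt": Counter({"drain": 1, "dead": 1}),
    "screen": Counter({"bright": 2, "crack": 1}),
    "display": Counter({"bright": 1, "crack": 1}),
}
demo_groups = group_features(demo_contexts)
```

In this toy run, "screen" and "display" share no bigrams and are grouped through context overlap alone, while "batt" joins the battery group partly through literal overlap with "battery".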