| Author: | 鄭宇軒 Cheng, Yu-Hsuan |
|---|---|
| Thesis Title: | 最大化相互資訊之非監督式商品面向擷取方法 (Unsupervised Aspect Extraction via Mutual Information Maximization) |
| Advisor: | 劉任修 Liu, Ren-Shiou |
| Degree: | Master |
| Department: | Institute of Information Management, College of Management |
| Year of Publication: | 2021 |
| Graduation Academic Year: | 109 (2020–2021) |
| Language: | Chinese |
| Number of Pages: | 76 |
| Keywords (Chinese): | 深度學習, 注意力機制, 非監督學習, 最大化相互資訊 |
| Keywords (English): | Deep Learning, Attention Mechanism, Unsupervised Learning, Mutual Information Maximization |
In recent years, with the maturation of e-commerce and changing consumer habits, more and more people buy goods and services online, and with the rapid growth of social networks, leaving personal opinions on the Internet has become the norm. For businesses, product reviews not only serve as one of the bases for consumers' purchase decisions, but also help vendors understand which product aspects users care about, which in turn guides the improvement of new products. However, the volume of product reviews grows by the day, making it difficult for users to analyze and interpret every review manually. Research on processing large amounts of text and extracting the important information from it has therefore emerged in the fields of natural language processing and text mining.
Suppose a review reads: "This phone is cheap and light, but its performance is poor." The entity is the phone, and price, weight, and performance are "aspect" words. The goal of this research is to design a suitable model that extracts the aspect words users care about from product reviews and presents multiple aspect categories as groups. The model uses an attention mechanism to highlight the words in a review that users focus on. In addition, to reduce the time spent analyzing the dataset in advance, the deep learning model is trained with unsupervised representation learning, and the concept of mutual information maximization is introduced into the training process. Finally, after collecting and analyzing product reviews and training on them, the aspect groups produced by this model achieve better within-group semantic coherence than the baseline models Latent Dirichlet Allocation (LDA) and Attention-Based Aspect Extraction (ABAE), and the words within each aspect group are closer to the aspects that real users care about in the reviews.
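For reference, the mutual information being maximized is the standard information-theoretic quantity, and in practice it is estimated through a variational lower bound; the Donsker-Varadhan bound used by Belghazi et al. (2018) is one common choice. These are textbook definitions restated for the reader, not necessarily the thesis's exact estimator:

```latex
% Mutual information as the KL divergence between the joint
% distribution and the product of the marginals
I(X;Z) = D_{\mathrm{KL}}\!\left( p(x,z) \,\|\, p(x)\,p(z) \right)

% Donsker--Varadhan lower bound, maximized over a critic network T_\theta
I(X;Z) \;\ge\; \sup_{\theta}\;
  \mathbb{E}_{p(x,z)}\!\left[ T_\theta(x,z) \right]
  \;-\; \log \mathbb{E}_{p(x)\,p(z)}\!\left[ e^{T_\theta(x,z)} \right]
```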
With the growth of social networks, people routinely leave personal opinions and reviews on the Internet. For consumers, product reviews help them make purchase decisions. However, with the rapid growth of product reviews, it is difficult for consumers to analyze reviews manually one by one. Therefore, research on processing large amounts of text data and extracting important information has emerged in the fields of natural language processing and text mining.
For example, consider the review "The phone is cheap and light, but its performance is poor." The entity is a mobile phone, and price, weight, and performance are aspects. This research aims to design a suitable model that captures the aspect words users care about in product reviews and presents multiple aspect categories as clusters. The model uses an attention mechanism to highlight the words that users care about in product reviews. In addition, to reduce the manual analysis of the dataset beforehand, we train our model with unsupervised learning, and we introduce the concept of mutual information maximization into the training process. Finally, experimental results on Amazon datasets demonstrate that our model discovers more meaningful and coherent aspects than Latent Dirichlet Allocation (LDA) and Attention-Based Aspect Extraction (ABAE) under two evaluation metrics.
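As a rough illustration of the pipeline described in the abstract, the sketch below combines an ABAE-style attention encoder with an InfoNCE-style mutual information objective. It is a minimal sketch assuming PyTorch; the module name `AspectAttentionEncoder`, the layer sizes, and the choice of the InfoNCE estimator (rather than, say, the Donsker-Varadhan bound above) are illustrative assumptions, not the thesis's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AspectAttentionEncoder(nn.Module):
    """ABAE-style encoder: attention over word embeddings, then a
    reconstruction from a small matrix of learnable aspect embeddings.
    Hypothetical sketch; dimensions and names are illustrative."""

    def __init__(self, emb_dim: int, n_aspects: int):
        super().__init__()
        self.attn = nn.Linear(emb_dim, emb_dim, bias=False)   # scores words against the sentence average
        self.to_aspect = nn.Linear(emb_dim, n_aspects)        # maps sentence vector to aspect weights
        self.aspect_emb = nn.Parameter(torch.randn(n_aspects, emb_dim))  # aspect (topic) matrix

    def forward(self, word_vecs: torch.Tensor):
        # word_vecs: (batch, seq_len, emb_dim), pre-trained word embeddings
        avg = word_vecs.mean(dim=1, keepdim=True)                      # (batch, 1, emb_dim)
        scores = torch.bmm(self.attn(word_vecs), avg.transpose(1, 2))  # (batch, seq_len, 1)
        alpha = F.softmax(scores.squeeze(-1), dim=1)                   # attention over words
        z = torch.bmm(alpha.unsqueeze(1), word_vecs).squeeze(1)        # (batch, emb_dim) sentence vector
        p = F.softmax(self.to_aspect(z), dim=-1)                       # (batch, n_aspects) aspect weights
        r = p @ self.aspect_emb                                        # (batch, emb_dim) aspect reconstruction
        return z, r

def infonce_mi_loss(z: torch.Tensor, r: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE lower bound on I(z; r): each sentence's own reconstruction is
    the positive pair; the other sentences in the batch act as negatives."""
    z = F.normalize(z, dim=-1)
    r = F.normalize(r, dim=-1)
    logits = z @ r.t() / temperature                   # (batch, batch) similarity matrix
    labels = torch.arange(z.size(0), device=z.device)  # positives lie on the diagonal
    return F.cross_entropy(logits, labels)
```

A training step would embed a batch of review sentences with pre-trained word vectors (e.g., word2vec, as in ABAE), call the encoder, and minimize `infonce_mi_loss(z, r)`; after training, the nearest words to each row of `aspect_emb` in embedding space form one aspect group.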
Angelidis, S. and Lapata, M. (2018). Summarizing opinions: Aspect extraction meets sentiment prediction and they are both weakly supervised. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3675–3686. Association for Computational Linguistics.
Bahdanau, D., Cho, K., and Bengio, Y. (2016). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
Belghazi, M. I., Baratin, A., Rajeshwar, S., Ozair, S., Bengio, Y., Courville, A., and Hjelm, D. (2018). Mutual information neural estimation. In Proceedings of the 35th International Conference on Machine Learning (ICML 2018), volume 80 of Proceedings of Machine Learning Research, pages 531–540. PMLR.
Bengio, Y., Courville, A., and Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1798–1828.
Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993–1022.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186. Association for Computational Linguistics.
Donahue, J., Krähenbühl, P., and Darrell, T. (2017). Adversarial feature learning. arXiv preprint arXiv:1605.09782.
Donsker, M. D. and Varadhan, S. R. S. (1983). Asymptotic evaluation of certain Markov process expectations for large time. IV. Communications on Pure and Applied Mathematics, 36(2):183–212.
Glorot, X. and Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pages 249–256. PMLR.
Gong, C., Shi, K., and Niu, Z. (2019). Hierarchical text-label integrated attention network for document classification. In Proceedings of the 2019 3rd High Performance Computing and Cluster Technologies Conference, HPCCT 2019, pages 254–260. Association for Computing Machinery.
Harris, D. and Harris, S. (2012). Digital design and computer architecture (2nd ed.). Morgan Kaufmann.
He, R., Lee, W. S., Ng, H. T., and Dahlmeier, D. (2017). An unsupervised neural attention model for aspect extraction. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 388–397. Association for Computational Linguistics.
Hjelm, D. and Bachman, P. (2020). Representation learning with video deep infomax. arXiv preprint arXiv:2007.13278.
Hjelm, D., Fedorov, A., Lavoie-Marchildon, S., Grewal, K., Bachman, P., Trischler, A., and Bengio, Y. (2019). Learning deep representations by mutual information estimation and maximization. In International Conference on Learning Representations (ICLR 2019).
Hu, M. and Liu, B. (2004). Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’04, pages 168–177. Association for Computing Machinery.
Jin, W., Ho, H., and Srihari, R. (2009). OpinionMiner: A novel machine learning system for web opinion mining and extraction. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’09, pages 1195–1204. Association for Computing Machinery.
Jo, Y. and Oh, A. H. (2011). Aspect and sentiment unification model for online review analysis. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, WSDM ’11, pages 815–824. Association for Computing Machinery.
Kim, Y. (2014). Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1746–1751. Association for Computational Linguistics.
Kohonen, T. (1998). The self-organizing map. Neurocomputing, 21(1):1–6.
Lafferty, J. D., McCallum, A., and Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, pages 282–289. Morgan Kaufmann Publishers Inc.
Li, F., Han, C., Huang, M., Zhu, X., Xia, Y.-J., Zhang, S., and Yu, H. (2010). Structure-aware review mining and summarization. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 653–661. Coling 2010 Organizing Committee.
Li, X., Bing, L., Zhang, W., and Lam, W. (2019). Exploiting BERT for end-to-end aspect-based sentiment analysis. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 34–41. Association for Computational Linguistics.
Linsker, R. (1988). Self-organization in a perceptual network. Computer, 21(3):105–117.
Liu, B. (2012). Sentiment Analysis and Opinion Mining. Morgan & Claypool Publishers.
Liu, P., Joty, S., and Meng, H. (2015). Fine-grained opinion mining with recurrent neural networks and word embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1433–1443. Association for Computational Linguistics.
Liu, Q., Liu, B., Zhang, Y., Kim, D. S., and Gao, Z. (2016). Improving opinion aspect extraction using semantic similarity and aspect associations. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI’16, pages 2986–2992. AAAI Press.
Luo, L., Ao, X., Song, Y., Li, J., Yang, X., He, Q., and Yu, D. (2019). Unsupervised neural aspect extraction with sememes. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, pages 5123–5129. International Joint Conferences on Artificial Intelligence Organization.
Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. In 1st International Conference on Learning Representations (ICLR 2013), Workshop Track Proceedings, pages 1–12.
Minaee, S., Kalchbrenner, N., Cambria, E., Nikzad, N., Chenaghlu, M., and Gao, J. (2020). Deep learning based text classification: A comprehensive review. arXiv preprint arXiv:2004.03705.
Mudinas, A., Zhang, D., and Levene, M. (2019). Market trend prediction using sentiment analysis: Lessons learned and paths forward. arXiv preprint arXiv:1903.05440.
Poria, S., Cambria, E., Ku, L.-W., Gui, C., and Gelbukh, A. (2014). A rule-based approach to aspect extraction from product reviews. In Proceedings of the Second Workshop on Natural Language Processing for Social Media (SocialNLP 2014).
Qiu, G., Liu, B., Bu, J., and Chen, C. (2011). Opinion word expansion and target extraction through double propagation. Computational Linguistics, 37(1):9–27.
Rabiner, L. and Juang, B. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1):4–16.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.
Weinberger, E. (2020). Mutual information and the KL divergence. https://homes.cs.washington.edu/~ewein/blog/2020/08/09/mutual-information/.
Xie, J., Girshick, R., and Farhadi, A. (2016). Unsupervised deep embedding for clustering analysis. In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, ICML’16, pages 478–487. JMLR.org.
Yang, B., Wang, L., Wong, D. F., Chao, L. S., and Tu, Z. (2019). Convolutional self-attention networks. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4040–4045. Association for Computational Linguistics.
Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., and Hovy, E. (2016). Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1480–1489. Association for Computational Linguistics.
Yaqub, U., Sharma, N., Pabreja, R., Chun, S. A., Atluri, V., and Vaidya, J. (2020). Location-based sentiment analyses and visualization of Twitter election data. Digital Government: Research and Practice, 1(2).
Yelp Inc. (2020). Form 10-K 2019. SEC, https://www.sec.gov/ix?doc=/Archives/edgar/data/1345016/000134501620000009/yelp-20191231.htm.
Yue, L., Chen, W., Li, X., Zuo, W., and Yin, M. (2019). A survey of sentiment analysis in social media. Knowledge and Information Systems, 60(2):617–663.
Zhang, L., Wang, S., and Liu, B. (2018). Deep learning for sentiment analysis: A survey. arXiv preprint arXiv:1801.07883.
Zhou, X., Li, C., Bu, J., Yao, C., Shi, K., Yu, Z., and Yu, Z. (2020). Matching text with deep mutual information estimation. arXiv preprint arXiv:2003.11521.