| Graduate student: | 孫義峰 Sun, Yi-Feng |
|---|---|
| Thesis title: | 多情感辭典特徵層級情感分析之評論自動摘要 Using Multi-Lexicons in Feature-Level Sentiment Analysis for Reviews Summarization |
| Advisor: | 王惠嘉 Wang, Hei-Chia |
| Degree: | Master |
| Department: | College of Management - Institute of Information Management |
| Year of publication: | 2015 |
| Graduation academic year: | 103 |
| Language: | Chinese |
| Number of pages: | 54 |
| Chinese keywords: | 文字探勘、產品特徵擷取、情感分析 |
| English keywords: | text mining, product feature extraction, sentiment analysis |
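Travel has become a part of everyday life, and because travelers inevitably need a place to stay, the accommodation industry has flourished accordingly. With information readily available online, consumers planning a trip no longer need to visit a travel agency in person; they can browse information on attractions and hotels from a personal computer at home.
Before deciding which hotel to book, many consumers need hotel information to support the decision, and the most important source is reviews written by previous guests. However, the prevalence of Web 2.0 has made the number of online reviews grow rapidly, so consumers can no longer easily read every review by hand. A system that analyzes reviews automatically can therefore help users understand review content more efficiently.
The main goal of automatic review analysis is to determine whether the opinion a consumer expresses in a review is positive (praise) or negative (criticism), which sentiment analysis techniques can identify. In sentiment analysis, the target that an opinion word describes must be known before the word's actual meaning can be determined; this study calls that target a feature. This study proposes an unsupervised learning method that uses part-of-speech sequence patterns to extract hotel features automatically and adds multi-source-lexicon sentiment analysis to pull out the important parts of each review (that is, the consumer's opinions). The reviews are then consolidated into a summary that gives consumers a way to learn about a hotel and shortens the time they need to choose one.
Finally, experiments were conducted to validate the proposed method. The results show that the method extracts hotel features effectively, reaching an F-measure of .628 and outperforming previous studies on feature extraction.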
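The pipeline described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the tag names (`NOUN`, `ADV`, `COP`, `ADJ`), the single pattern "noun, optional copula or adverbs, adjective", the two toy lexicons, and the rule of averaging scores across lexicons are all assumptions made for illustration, and the input is taken to be already segmented and part-of-speech tagged.

```python
from collections import defaultdict

# Hypothetical toy sentiment lexicons (opinion word -> polarity score).
# Real lexicons and their scoring scales are not given in the abstract.
LEXICON_A = {"clean": 1.0, "dirty": -1.0, "friendly": 1.0}
LEXICON_B = {"clean": 0.8, "noisy": -0.6, "friendly": 0.9, "small": -0.4}
LEXICONS = [LEXICON_A, LEXICON_B]


def extract_pairs(tagged):
    """Match the assumed pattern NOUN (COP|ADV)* ADJ.

    `tagged` is a list of (word, tag) tuples for one sentence.
    Returns a list of (feature, opinion_word) pairs.
    """
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if tag != "NOUN":
            continue
        j = i + 1
        # Skip copulas and adverbs between the feature and its opinion word.
        while j < len(tagged) and tagged[j][1] in ("COP", "ADV"):
            j += 1
        if j < len(tagged) and tagged[j][1] == "ADJ":
            pairs.append((word, tagged[j][0]))
    return pairs


def polarity(opinion):
    """Average the scores of every lexicon that contains the opinion word."""
    scores = [lex[opinion] for lex in LEXICONS if opinion in lex]
    return sum(scores) / len(scores) if scores else 0.0


def summarize(reviews):
    """Return the mean polarity per extracted feature across all reviews."""
    per_feature = defaultdict(list)
    for tagged in reviews:
        for feature, opinion in extract_pairs(tagged):
            per_feature[feature].append(polarity(opinion))
    return {f: sum(s) / len(s) for f, s in per_feature.items()}


# Made-up example sentences, already tagged.
reviews = [
    [("room", "NOUN"), ("is", "COP"), ("very", "ADV"), ("clean", "ADJ")],
    [("staff", "NOUN"), ("friendly", "ADJ")],
    [("room", "NOUN"), ("is", "COP"), ("small", "ADJ")],
]
summary = summarize(reviews)
```

Here `summary` maps each extracted feature (`room`, `staff`) to one aggregate score, which is the kind of per-feature view a review summary could be built from. Averaging across lexicons is only one way to combine multiple sources; voting or weighting by lexicon reliability would be equally plausible.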
Tourism has become a part of everyday life. Before reserving a hotel, customers need information to help them decide, and the most important source is online reviews. Because the number of online reviews is growing dramatically, customers cannot read them all manually, so an automatic review analysis system that summarizes reviews is needed. The main purpose of such a system is to determine whether the opinion expressed in a review is positive or negative; in other words, whether customers who stayed at the hotel liked it. Sentiment analysis methods serve this purpose. In sentiment analysis, the target of an opinion (which we call a "feature") must be recognized, because the polarity of an opinion word can be ambiguous without it. We therefore propose an unsupervised method that uses part-of-speech patterns and multi-lexicon sentiment analysis to summarize reviews, helping customers learn about hotels and make decisions efficiently. Experimental results show that our method outperforms state-of-the-art methods, with an F-measure of .628.
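For readers unfamiliar with the evaluation metric, the sketch below shows how an F-measure for feature extraction is commonly computed from a set of extracted features and a hand-labeled gold set. The abstract does not say which variant was used, so the balanced form (`beta=1`, the harmonic mean of precision and recall) is an assumption, and the feature lists are invented examples rather than thesis data.

```python
def f_measure(extracted, gold, beta=1.0):
    """F-measure of extracted features against a gold-standard set.

    beta=1.0 gives the balanced F1 score (assumed here, not confirmed
    by the abstract).
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)  # share of extractions that are correct
    recall = true_positives / len(gold)          # share of gold features that were found
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Invented example: 3 of 4 extracted features are correct, 3 of 5 gold features are found.
extracted = ["room", "breakfast", "staff", "today"]
gold = ["room", "breakfast", "staff", "location", "price"]
score = f_measure(extracted, gold)  # precision 0.75, recall 0.60
```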
On campus: publicly available from 2020-06-29