| Graduate student: | 施智臏 SHIH, CHIH-PIN |
|---|---|
| Thesis title: | 基於CPED數據的中文情感分析:結合MacBERT嵌入與BiLSTM融合模型 Chinese Sentiment Analysis Based on CPED Data: A Fusion Model of MacBERT Embedding and BiLSTM |
| Advisors: | 楊大和 Yang, Ta-Ho; 王宏鍇 Wang, Hung-Kai |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Manufacturing Information and Systems |
| Year of publication: | 2025 |
| Academic year of graduation: | 113 (ROC calendar) |
| Language: | Chinese |
| Pages: | 68 |
| Keywords: | Scientific Psychology, Text Mining, Big Five Personality, Chinese Emotion Recognition, Deep Learning |
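With the rapid pace of modern life, everyday pressures and workloads keep growing, and mental health has become a global concern. The incidence of emotional disorders such as anxiety and depression rises year by year, sharply increasing the number of people with mental illness. This affects individual quality of life and poses a growing challenge for psychological counseling. Against this background, using technology to help counselors capture and understand clients' emotions more efficiently has become a pressing problem.
In facial emotion recognition today, deep learning is increasingly important, and multimodal emotion recognition systems are becoming a key technique for improving recognition accuracy. Such systems combine several data sources, such as facial expressions, vocal features, text content, and physiological signals, giving emotion recognition a richer basis. However, the heterogeneity and complexity of multimodal data challenge system performance and practicality. In practice, traditional rules of thumb depend on practitioners' subjective judgment and lack an effective and accurate management mechanism. Developing an efficient emotion recognition system that can assist psychological counseling is therefore important.
This study uses the Chinese emotion text dataset CPED and incorporates the Big Five personality traits as features to predict emotion and sentiment states. To improve recognition accuracy, reduce misclassification, and meet real-time processing requirements, it designs a deep learning framework that combines a multilayer perceptron (MLP) with a BiLSTM-based attention mechanism. The framework uses attention to focus on key features and thereby improve emotion recognition performance. Experimental results show that the method significantly outperforms baseline models on multimodal emotion data and demonstrates practical value in accuracy and stability.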
The increasing pressures of modern life have led to a global rise in mental health concerns, with anxiety and depression becoming more prevalent. These issues impact individuals' well-being and pose challenges for psychological counseling. Leveraging technology to enhance emotion recognition has become essential.
Deep learning plays a key role in emotion recognition, particularly in multimodal systems that integrate facial expressions, vocal features, text, and physiological signals. However, the heterogeneity and complexity of multimodal data limit system performance and practicality. Traditional practice relies on practitioners' subjective judgment and lacks an accurate, effective management mechanism, which highlights the need for improved emotion recognition systems.
This study uses the Chinese emotion text dataset CPED and incorporates the Big Five personality traits as features for emotion prediction. It proposes a deep learning framework that combines a multilayer perceptron (MLP) with a BiLSTM-based attention mechanism to improve accuracy, reduce misclassification, and meet real-time processing needs. The attention mechanism focuses on the most informative features to improve recognition performance.
Experimental results show that the proposed method outperforms baseline models on multimodal emotion data, demonstrating strong accuracy and stability. The system has the potential to support psychological counseling by improving emotion detection and informing therapeutic interventions.
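The fusion step named in the abstract (attention over BiLSTM states, combined with Big Five traits and passed to an MLP) can be illustrated with a minimal, framework-free sketch. It is not the thesis implementation. It assumes the per-token hidden states have already been produced by a BiLSTM over MacBERT embeddings, and it uses simple dot-product attention with a query vector. All names (`attention_pool`, `classify`, `init_params`), the layer sizes, the random weights, and the three-class output are hypothetical choices made only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by softmax(h_t . query) and sum the states."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


def linear(x, weight, bias):
    """Dense layer: one output per row of `weight`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_params(state_dim, trait_dim, hidden_dim, num_classes, seed=0):
    """Random toy weights; a real model would learn these (assumption)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "query": [rng.uniform(-0.5, 0.5) for _ in range(state_dim)],
        "w1": matrix(hidden_dim, state_dim + trait_dim),
        "b1": [0.0] * hidden_dim,
        "w2": matrix(num_classes, hidden_dim),
        "b2": [0.0] * num_classes,
    }


def classify(hidden_states, big_five, params):
    """Attention-pool the sequence, append the trait vector, run a 2-layer MLP."""
    context, weights = attention_pool(hidden_states, params["query"])
    fused = context + list(big_five)  # concatenate text and personality features
    hidden = relu(linear(fused, params["w1"], params["b1"]))
    logits = linear(hidden, params["w2"], params["b2"])
    return softmax(logits), weights


# Toy input: 3 tokens with 4-dimensional states (stand-ins for BiLSTM outputs)
states = [[0.1, 0.3, -0.2, 0.5], [0.7, -0.1, 0.0, 0.2], [-0.4, 0.6, 0.3, 0.1]]
# Hypothetical Big Five scores scaled to [0, 1]
traits = [0.8, 0.4, 0.6, 0.3, 0.5]
params = init_params(state_dim=4, trait_dim=5, hidden_dim=8, num_classes=3)
probs, weights = classify(states, traits, params)
print("class probabilities:", probs)
print("attention weights:", weights)
```

The attention weights show which tokens contribute most to the pooled representation, which is the sense in which the mechanism "focuses on key features". Concatenating the trait vector before the MLP is one simple fusion choice among several; the source does not specify how the personality features are combined with the text representation.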