| Graduate student: | 張書瑀 Chang, Jeremy |
|---|---|
| Thesis title: | 動態即插即用理性行為推論框架之情感支持對話系統 A Dynamic Plug-and-Play Rational Speech Act Framework for Emotional Support Dialogue System |
| Advisor: | 吳宗憲 Wu, Chung-Hsien |
| Degree: | 博士 Doctor |
| Department: | 電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| Year of publication: | 2026 |
| Academic year of graduation: | 114 |
| Language: | English |
| Number of pages: | 120 |
| Chinese keywords: | 情感支持對話系統、即插即用語言模型、理性行為推論、情緒因果蘊含 |
| English keywords: | Emotional support dialogue system, Plug-and-Play language model, Rational speech act, Emotion-cause entailment |
情感支持對話系統(Emotional Support Dialogue Systems)是人工智慧於心理健康與情緒輔助領域中極具潛力的研究方向。此類系統不僅需生成語言流暢且語義正確的回應,亦必須準確理解使用者的情緒狀態、適當表達同理心,並於合適時機提供具建設性的支持與引導。現有多數方法仰賴監督式微調或多階段管線式架構(pipeline architecture),雖然能在特定任務下取得良好效能,其回應生成卻普遍缺乏可解釋性與可控性。
即插即用語言模型(Plug-and-Play Language Models, PPLM)提供了一種有別於傳統微調的生成方法,透過在測試階段直接對模型施加控制,使語言模型能朝向特定屬性(例如情緒理解、話題掌握或策略選擇)進行回應生成,而無需重新訓練整個模型。此一架構具備高度彈性和可解釋性,特別適合應用於需同時考量多種屬性的情感支持對話任務。然而,本論文指出,現有即插即用方法在實務應用上仍存在兩項關鍵瓶頸:其一,當多個情緒與策略屬性需同時納入考量時,缺乏有效的協調機制,容易導致回應在同理心表達與策略選擇之間產生不一致甚至衝突;其二,其回應生成延遲過長,因而難以滿足即時互動情境的需求。
為回應上述挑戰,本論文提出一個新穎的整合式框架—動態即插即用理性行為推論框架(Dynamic Plug-and-Play Rational Speech Acts, DPPRSA),旨在為情感支持對話系統建立一個兼具彈性、效率與可解釋性的回應生成機制。DPPRSA 的核心理念在於,回應屬性的控制不應是靜態或固定權重的,而應根據使用者的情緒狀態及其形成原因進行動態調整。為此,本論文引入情緒原因蘊含(Emotion Cause Entailment, ECE)作為動態屬性控制的關鍵依據,透過分析使用者情緒狀態及其因果脈絡,判斷當前互動階段中使用者更需要情緒安撫,或是具體的建議與引導,並據此動態調整不同屬性在生成過程中的影響程度。
此外,本論文進一步將理性行為推論(Rational Speech Acts, RSA)融入該框架之中,將回應生成與屬性控制視為一種具明確目標的機率推論過程。透過 RSA 推理,系統能以較少的步驟就能評估生成回應在不同屬性上的表現,從而顯著降低傳統即插即用方法所帶來的計算成本。實驗結果顯示,所提出的方法在維持甚至提升回應品質的同時,將回應生成時間縮短超過70%,大幅提升其於即時互動情境中的實用性。
DPPRSA 不僅為情感支持對話系統提供了一種具高度延展性與可解釋性的解決方案,也為未來在大型語言模型上進行高階語言控制與人機情感互動研究奠定了重要基礎。
Emotional Support Dialogue Systems represent a promising research direction for artificial intelligence in the domains of mental health and emotional assistance. Beyond generating fluent and semantically accurate responses, such systems must accurately understand users' emotional states, express empathy appropriately, and provide constructive support or guidance at suitable moments. Most existing approaches rely on supervised fine-tuning or multi-stage pipeline architectures. Although these approaches are effective in specific task settings, their response generation generally lacks interpretability and controllability.
Plug-and-Play Language Models (PPLMs) offer an alternative generation paradigm to traditional fine-tuning by enabling direct control over the model at inference time. This allows language models to generate responses conditioned on specific attributes (such as emotion understanding, topic awareness, or strategy selection) without retraining the entire model. Owing to its flexibility and interpretability, this framework is particularly well suited to emotional support dialogue tasks that require simultaneous consideration of multiple attributes. However, this dissertation identifies two key limitations of existing plug-and-play approaches in practical applications. First, when multiple emotion and strategy attributes must be considered concurrently, the lack of an effective coordination mechanism often leads to inconsistent or even conflicting generation between empathy expression and strategy selection. Second, response generation latency remains excessively high, making such methods difficult to deploy in real-time interactive scenarios.
To address these challenges, this dissertation proposes a novel unified framework, Dynamic Plug-and-Play Rational Speech Acts (DPPRSA), designed to establish a response generation mechanism for emotional support dialogue systems that is flexible, efficient, and interpretable. The core principle of DPPRSA is that attribute control should not be static or fixed-weighted, but should instead be adjusted dynamically according to users' emotional states and their underlying causes. To this end, Emotion Cause Entailment (ECE) is introduced as the basis for dynamic attribute control. By analyzing users' emotional states and their causal contexts, the framework determines whether emotional comfort or concrete advice and guidance is more needed at a given interaction stage, and dynamically modulates the influence of different attributes during generation accordingly.
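As a rough illustration of dynamic attribute weighting, the sketch below maps a single emotion-cause entailment score to weights for two attributes. It is not taken from the dissertation: the function name `dynamic_attribute_weights`, the `temperature` parameter, the two attribute names, and above all the direction of the mapping (a clearly entailed cause shifts weight toward strategy, an unclear cause toward empathy) are assumptions made only for this example.

```python
import math
from typing import Dict


def dynamic_attribute_weights(cause_entailment: float,
                              temperature: float = 1.0) -> Dict[str, float]:
    """Turn a hypothetical ECE score in [0, 1] into attribute weights.

    Assumption (not from the source): a high score means the cause of the
    user's emotion is already established, so guidance gets more weight;
    a low score means comfort gets more weight.
    """
    if not 0.0 <= cause_entailment <= 1.0:
        raise ValueError("cause_entailment must lie in [0, 1]")
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")

    # One logit per attribute; a softmax turns them into weights summing to 1.
    logits = {
        "empathy": (1.0 - cause_entailment) / temperature,
        "strategy": cause_entailment / temperature,
    }
    exps = {name: math.exp(value) for name, value in logits.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


if __name__ == "__main__":
    for score in (0.1, 0.5, 0.9):
        print(score, dynamic_attribute_weights(score))
```

A lower `temperature` makes the weights switch more sharply between the two attributes, while a higher one keeps them closer to an even split.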
Furthermore, this dissertation integrates Rational Speech Acts (RSA) inference into the proposed framework, formulating response generation and attribute control as a goal-directed probabilistic inference process. Through RSA-based inference, the system can evaluate how well generated responses satisfy the different attributes in fewer inference steps, thereby significantly reducing the computational cost associated with traditional plug-and-play methods. Experimental results demonstrate that the proposed approach reduces response generation time by more than 70% while maintaining or even improving response quality, greatly enhancing its practicality for real-time interactive applications.
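To make the idea of RSA-style attribute control more concrete, the sketch below applies the generic RSA pragmatic-speaker rule to a small set of candidate responses: each candidate's base log-probability is combined with weighted log-probabilities from a "literal listener" that judges how well the candidate conveys each attribute. This is the textbook RSA form, not the dissertation's algorithm; the function name `rsa_rescore`, the `alpha` rationality parameter, the attribute names, and all numbers are hypothetical. The rescoring needs only one pass over the candidates, whereas PPLM-style control typically relies on iterative gradient updates to hidden activations; whether DPPRSA achieves its reported speed-up in this way is an assumption here.

```python
import math
from typing import Dict, List


def rsa_rescore(base_logprobs: List[float],
                listener_logprobs: Dict[str, List[float]],
                weights: Dict[str, float],
                alpha: float = 1.0) -> List[float]:
    """Pragmatic-speaker distribution over candidate responses.

    S1(u) is proportional to S0(u) * prod_a L0(a | u) ** (alpha * w_a), where
    S0 is the base language model, L0 the literal listener for attribute a,
    and w_a the weight of that attribute.
    """
    scores = []
    for i, base in enumerate(base_logprobs):
        # Weighted sum of listener log-probabilities for candidate i.
        bonus = sum(weights[a] * listener_logprobs[a][i] for a in weights)
        scores.append(base + alpha * bonus)

    # Numerically stable softmax over the candidates.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    candidates = ["That sounds really hard.", "You could try making a schedule."]
    base = [math.log(0.5), math.log(0.5)]
    listener = {
        "empathy": [math.log(0.9), math.log(0.2)],
        "strategy": [math.log(0.1), math.log(0.8)],
    }
    probs = rsa_rescore(base, listener, {"empathy": 0.8, "strategy": 0.2})
    for text, p in zip(candidates, probs):
        print(f"{p:.3f}  {text}")
```

With empathy weighted more heavily the comforting candidate is preferred, and swapping the two weights flips the preference toward the advice-giving one.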
Overall, DPPRSA provides a highly scalable and interpretable solution for emotional support dialogue systems, and establishes an important foundation for future research on high-level language control and human–machine emotional interaction in large language models.