
研究生: 陳昕渝
Chen, Hsin-Yu
論文名稱: 應用圖形思考於情感支持回應生成:融合情緒強化知識擴展的方法
Applying Graph of Thought to Emotional Support Response Generation with Emotion-Enriched Knowledge Expansion
指導教授: 吳宗憲
Wu, Chung-Hsien
學位類別: 碩士
Master
系所名稱: 電機資訊學院 - 資訊工程學系
Department of Computer Science and Information Engineering
論文出版年: 2025
畢業學年度: 113
語文別: 英文
論文頁數: 118
中文關鍵詞: 情感支持、回話策略、圖型思考
外文關鍵詞: Emotional Support Skill, Response Strategies, Graph of Thoughts
    與機器人助理的對話已成為人們日常生活的一部分,對話生成系統也被廣泛應用於各個領域。其中,在心理諮商領域中,除了需要具備流暢自然的對話能力外,更重要的是提供情感支持的能力。透過具備情感理解與同理能力的模型,系統能夠成為更好的傾聽者,進一步理解並同理使用者所面臨的困難與情緒。
    近年來,越來越多研究開始關注情感支持對話的生成能力,然而多數方法僅著重於提升模型回應的語言品質,較少關注回應過程的可解讀性。為了解決此問題,有研究引入「鏈結思考(Chain-of-Thought)」來設計提示框架,模擬心理諮商過程中諮商師的思考路徑,以提高模型回應的透明度與可理解性。
    本研究將重新設計「圖型思考(Graph-of-Thought)」的提示框架,結合圖形化結構以提升模型的思考複雜度,克服傳統鏈結思考在處理多重情境與情緒時的侷限性。由於情感支持對話需同時考量使用者的多種情緒狀態與背景情境,單一線性的思考方式往往無法充分捕捉這些資訊。我們的方法結合多源資訊與圖式推理機制,讓模型得以更全面地理解問題,進而產生更具同理心與針對性的回應,同時保有可解釋性。本研究設計兩個知識擴展模組,擴展情感知識以及關鍵事件的因果關聯,協助模型獲取更豐富、多樣的資訊來源。
    為提升策略應對的靈活性,我們亦導入 Helping Skills Theory 中的三個階段:探索(Exploration)、安慰(Comfort)、建議(Action),並擴充策略使用範圍,從原本八種策略延伸至十四種,進一步強化系統在不同情境下的應對能力。
    在無需額外微調的情況下,我們所提出的架構在三大評估面向 —— 連貫性、資訊性與同理心 —— 上,於 G-Eval 評分框架皆達到最佳表現,並獲得人工偏好評測的正向肯定。相較於基線系統,本系統在連貫性提升 0.73,資訊性提升 1.36,同理心提升 1.23,顯示本系統所生成的回應內容更符合人類對於高品質情緒支持的期待。
    在傳統指標方面,如 ESCoT 論文所指出,該類指標多偏重與標準答案的一致性,未能全面評估對話系統的真實表現。然而,即便在此限制下,本研究在 ROUGE-L 上提升 3.08 分;在 BLEU-1 與 BLEU-2 上為第二高分,僅次於本研究使用之主要模型 LLaMA3.2-3B-Instruct,差距分別為 1.0 與 0.25。雖然本系統在 Distinct-1 與 Distinct-2 指標上的表現略低,顯示多樣性與準確性之間存在取捨,但整體而言仍展現出顯著的性能提升。
    此外,在 Emotional Support Conversation Dataset 上亦展現良好泛化能力,顯示本方法具備穩定且實用的應用潛力。

    Conversations with virtual assistants are now common in daily life, and dialogue systems are increasingly applied in sensitive domains such as psychological counseling. In this setting, beyond fluent language generation, the ability to understand and respond with empathy is essential.
    In recent years, growing attention has been paid to developing models capable of generating emotional support dialogues. However, most existing approaches focus primarily on improving the linguistic quality of responses, with relatively little emphasis on the interpretability of the response generation process. To address this issue, some studies have adopted Chain-of-Thought (CoT) prompting frameworks that simulate the reasoning paths of counselors in real therapy sessions, thereby improving the transparency and interpretability of model outputs.
    Building on this idea, our study proposes a redesigned Graph-of-Thought (GoT) prompting framework, which leverages a graph-based structure to enhance the model's reasoning complexity. This approach overcomes the limitations of linear CoT in handling multiple emotional states and contextual factors. Since emotional support dialogues often involve complex and layered user experiences, linear reasoning alone is insufficient to fully capture these nuances. Our method integrates multi-source knowledge with a graph-based reasoning mechanism, enabling the model to develop a more comprehensive understanding of the user's situation and generate responses that are both empathetic and contextually appropriate, while maintaining interpretability. To further enrich the model's reasoning capabilities, we introduce two knowledge expansion modules: one for emotional knowledge enrichment and the other for identifying causal relationships between key events.
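As a rough illustration of how such a graph-based reasoning pipeline could be organized, the sketch below builds a small thought graph whose nodes correspond to the modules named in this work (keyword extraction, causal reasoning, emotional knowledge expansion, graph summarization). The `llm` callable, node fields, and prompt strings are placeholders for illustration only, not the thesis implementation; the actual prompts live in Appendices A-D.

```python
# Illustrative sketch (not the thesis implementation): a thought graph whose
# nodes hold intermediate reasoning artifacts produced by a hypothetical
# LLM callable, mirroring the modules named above.

from dataclasses import dataclass, field


@dataclass
class ThoughtNode:
    name: str                                    # e.g. "keywords", "causes"
    content: str                                 # text produced at this step
    parents: list = field(default_factory=list)  # upstream ThoughtNodes


def build_thought_graph(dialogue: str, llm) -> ThoughtNode:
    """Assemble a small graph of reasoning steps and merge them.

    `llm` is a stand-in callable (prompt -> text).
    """
    keywords = ThoughtNode("keywords", llm(f"Extract key phrases: {dialogue}"))
    causes = ThoughtNode(
        "causes",
        llm(f"Infer causal links between events: {keywords.content}"),
        parents=[keywords])
    emotions = ThoughtNode(
        "emotion_knowledge",
        llm(f"Expand emotional knowledge for: {keywords.content}"),
        parents=[keywords])
    # Unlike a linear chain, the summary node merges several branches,
    # which is what distinguishes GoT from CoT here.
    summary = ThoughtNode(
        "summary",
        llm(f"Summarize: {causes.content} | {emotions.content}"),
        parents=[causes, emotions])
    return summary
```

The key structural point is the final merge: several independent branches (causal and emotional) feed one summarization node, rather than each step seeing only its single predecessor.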
    Additionally, to improve the flexibility of strategy selection, we incorporate the three stages of Helping Skills Theory—Exploration, Comfort, and Action—and expand the system’s strategic repertoire from the original eight strategies to fourteen. This allows the model to respond more effectively across a wider range of emotional and situational contexts.
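The stage-to-strategy mapping can be pictured as a simple lookup. The grouping below covers only the eight original ESConv strategies of Liu et al.; both the grouping and the selection helper are illustrative assumptions, and the six additional strategies that bring the total to fourteen are defined in the thesis itself and are not reproduced here.

```python
# Hypothetical grouping of the eight ESConv strategies into the three
# Helping Skills stages used by this work; for illustration only.

STAGE_STRATEGIES = {
    "Exploration": ["Question", "Restatement or Paraphrasing"],
    "Comfort": ["Reflection of Feelings", "Self-disclosure",
                "Affirmation and Reassurance"],
    "Action": ["Providing Suggestions", "Information", "Others"],
}


def candidate_strategies(stage: str) -> list:
    """Return the strategies admissible at a given support stage."""
    return STAGE_STRATEGIES.get(stage, [])
```

Restricting strategy selection to the current stage is one way to keep the model's choice space small while still allowing flexible, stage-appropriate responses.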
    Without additional fine-tuning, our proposed framework achieves the best performance in coherence, informativeness, and empathy under the G-Eval evaluation and receives positive results in human preference assessments. Compared to the baseline, it improves coherence by 0.73, informativeness by 1.36, and empathy by 1.23, indicating more human-aligned emotional support responses.
    While traditional metrics focus on reference overlap and may not fully reflect system quality, our method still shows a 3.08-point gain in ROUGE-L and ranks second in BLEU-1 and BLEU-2, behind only LLaMA3.2-3B-Instruct by 1.0 and 0.25 points, respectively. Performance on Distinct-1/2 is slightly lower, reflecting the trade-off between diversity and precision.
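For reference, the Distinct-n metric of Li et al. mentioned above is simple to state: the fraction of n-grams in the generated text that are unique. A minimal sketch:

```python
# Distinct-n: ratio of unique n-grams to total n-grams in generated text;
# higher values indicate more diverse output.

def distinct_n(tokens, n: int) -> float:
    """Return |unique n-grams| / |total n-grams| (0.0 if too few tokens)."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)
```

For example, `distinct_n(["a", "b", "a", "b"], 1)` yields 0.5, since only two of the four unigrams are unique; a lower Distinct-1/2 thus directly reflects more repeated n-grams across responses.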
    Overall, the system also generalizes well on the Emotional Support Conversation Dataset, demonstrating strong stability and practical potential.

    摘要 III
    Abstract V
    致謝 VII
    Content VIII
    List of Tables XI
    List of Figures XIII
    Chapter 1 Introduction 1
      1.1 Background 1
      1.2 Motivation 2
      1.3 Literature Review 5
        1.3.1 Task-Oriented Dialogue Systems 5
        1.3.2 Open-Domain Dialogue Systems 7
        1.3.3 Empathetic Dialogue Systems 9
        1.3.4 Emotional Support Dialogue Systems 11
        1.3.5 Large Language Model 12
        1.3.6 Prompt Engineering 14
      1.4 Problems 16
      1.5 Proposed Methods 18
    Chapter 2 System Framework 20
      2.1 Framework Overview 23
      2.2 Graph of Thoughts (GoT) 25
      2.3 Emotion Analysis 30
        2.3.1 Keyword Extraction 31
        2.3.2 Causal Reasoning 33
        2.3.3 Emotional Knowledge Expansion 36
        2.3.4 Graph Summarization 38
      2.4 System Response Generation 41
        2.4.1 Strategy Chosen 41
        2.4.2 Mix Strategy Response Generation 43
    Chapter 3 Dataset 45
      3.1 Emotional Support Conversation Dataset (ESConv) 46
      3.2 Emotional Support Dialogue with CoT Dataset (ESD-CoT) 52
    Chapter 4 Experiments 59
      4.1 Evaluation Metrics 59
        4.1.1 BLEU 60
        4.1.2 ROUGE-L 61
        4.1.3 Distinct-n 62
        4.1.4 G-Eval 63
        4.1.5 Human Evaluation 72
      4.2 Experiment Result and Discussion 75
        4.2.1 Cross-Model Validation and Capability Assessment 76
        4.2.2 Baseline Comparison and Performance Benchmarking 77
        4.2.3 Cross-Dataset Generalization: ESConv Evaluation 80
        4.2.4 Strategy-Focused GoT Framework Evaluation 81
        4.2.5 Ablation Study and Component Analysis 82
        4.2.6 Scale-Variant GoT Framework Comparison 83
        4.2.7 Human Evaluation 85
        4.2.8 Dialogue Example 87
    Chapter 5 Conclusion and Future Work 88
    Reference 89
    Appendix 93
      A. Prompts for Keyphrase Extraction 93
      B. Prompts for Causal Reasoning 95
      C. Prompts for Emotional Knowledge Expansion 97
      D. Prompts for Graph Summarization 99
      E. Prompts for Strategy Chosen 101
      F. Prompts for Mix Strategy Response Generation 103
      G. Interpretability 105

    [1] C. Sackett, D. Harper, and A. Pavez, "Do We Dare Use Generative AI for Mental Health?," IEEE Spectrum, vol. 61, no. 6, pp. 42-47, 2024, doi: 10.1109/MSPEC.2024.10551790.
    [2] C. E. Hill, Helping Skills: Facilitating Exploration, Insight, and Action, 3rd ed. Washington, DC, US: American Psychological Association, 2009.
    [3] S. Liu et al., "Towards emotional support dialog systems," arXiv preprint arXiv:2106.01144, 2021.
    [4] M. Huang, X. Zhu, and J. Gao, "Challenges in Building Intelligent Open-domain Dialog Systems," ACM Trans. Inf. Syst., vol. 38, no. 3, p. Article 21, 2020, doi: 10.1145/3383123.
    [5] T.-H. Wen et al., "A Network-based End-to-End Trainable Task-oriented Dialogue System," in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, Valencia, Spain, April 2017: Association for Computational Linguistics, pp. 438-449. [Online]. Available: https://aclanthology.org/E17-1042/
    [6] B. Liu, G. Tur, D. Hakkani-Tur, P. Shah, and L. Heck, "End-to-end optimization of task-oriented dialogue model with deep reinforcement learning," arXiv preprint arXiv:1711.10712, 2017.
    [7] H.-D. Xu, X.-L. Mao, P. Yang, F. Sun, and H. Huang, "Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot Autonomous Agent," in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Bangkok, Thailand, August 2024: Association for Computational Linguistics, pp. 2748-2763, doi: 10.18653/v1/2024.acl-long.152.
    [8] N. Bang, J. Lee, and M.-W. Koo, "Task-optimized adapters for an end-to-end task-oriented dialogue system," arXiv preprint arXiv:2305.02468, 2023.
    [9] W. Chung, S. Cahyawijaya, B. Wilie, H. Lovenia, and P. Fung, "InstructTODS: Large language models for end-to-end task-oriented dialogue systems," arXiv preprint arXiv:2310.08885, 2023.
    [10] T. Adewumi, F. Liwicki, and M. Liwicki, "State-of-the-art in Open-domain Conversational AI: A Survey," Information, vol. 13, no. 6, p. 298, 2022.
    [11] Y. Zhang et al., "Dialogpt: Large-scale generative pre-training for conversational response generation," arXiv preprint arXiv:1911.00536, 2019.
    [12] E. Dinan, S. Roller, K. Shuster, A. Fan, M. Auli, and J. Weston, "Wizard of wikipedia: Knowledge-powered conversational agents," arXiv preprint arXiv:1811.01241, 2018.
    [13] L. Ma, M. Li, W.-N. Zhang, J. Li, and T. Liu, "Unstructured text enhanced open-domain dialogue system: A systematic survey," ACM Transactions on Information Systems (TOIS), vol. 40, no. 1, pp. 1-44, 2021.
    [14] I. Serban, A. Sordoni, Y. Bengio, A. Courville, and J. Pineau, "Building end-to-end dialogue systems using generative hierarchical neural network models," in Proceedings of the AAAI conference on artificial intelligence, 2016, vol. 30, no. 1.
    [15] W. Zhang et al., "A static and dynamic attention framework for multi turn dialogue generation," ACM Transactions on Information Systems, vol. 41, no. 1, pp. 1-30, 2023.
    [16] T. Ji, Y. Graham, G. J. Jones, C. Lyu, and Q. Liu, "Achieving reliable human assessment of open-domain dialogue systems," arXiv preprint arXiv:2203.05899, 2022.
    [17] H. Rashkin, E. M. Smith, M. Li, and Y.-L. Boureau, "Towards empathetic open-domain conversation models: A new benchmark and dataset," arXiv preprint arXiv:1811.00207, 2018.
    [18] Z. Lin, A. Madotto, J. Shin, P. Xu, and P. Fung, "Moel: Mixture of empathetic listeners," arXiv preprint arXiv:1908.07687, 2019.
    [19] P. Gao, D. Han, R. Zhou, X. Zhang, and Z. Wang, "CAB: empathetic dialogue generation with cognition, affection and behavior," in International Conference on Database Systems for Advanced Applications, 2023: Springer, pp. 597-606.
    [20] A. S. Raamkumar and Y. Yang, "Empathetic conversational systems: A review of current advances, gaps, and opportunities," IEEE Transactions on Affective Computing, vol. 14, no. 4, pp. 2722-2739, 2022.
    [21] J. Zhou, Z. Chen, B. Wang, and M. Huang, "Facilitating multi-turn emotional support conversation with positive emotion elicitation: A reinforcement learning approach," arXiv preprint arXiv:2307.07994, 2023.
    [22] D. Kang et al., "Can large language models be good emotional supporter? mitigating preference bias on emotional support conversation," arXiv preprint arXiv:2402.13211, 2024.
    [23] T. Zhang, X. Zhang, J. Zhao, L. Zhou, and Q. Jin, "Escot: Towards interpretable emotional support dialogue systems," arXiv preprint arXiv:2406.10960, 2024.
    [24] E. Dinan et al., "The second conversational intelligence challenge (convai2)," in The NeurIPS'18 Competition: From Machine Learning to Intelligent Conversations, 2020: Springer, pp. 187-208.
    [25] T. Brown et al., "Language models are few-shot learners," Advances in neural information processing systems, vol. 33, pp. 1877-1901, 2020.
    [26] S. Sabour, C. Zheng, and M. Huang, "Cem: Commonsense-aware empathetic response generation," in Proceedings of the AAAI Conference on Artificial Intelligence, 2022, vol. 36, no. 10, pp. 11229-11237.
    [27] K. Shuster et al., "Blenderbot 3: a deployed conversational agent that continually learns to responsibly engage," arXiv preprint arXiv:2208.03188, 2022.
    [28] C. Si et al., "Prompting gpt-3 to be reliable," arXiv preprint arXiv:2210.09150, 2022.
    [29] J. Wei et al., "Chain-of-thought prompting elicits reasoning in large language models," Advances in neural information processing systems, vol. 35, pp. 24824-24837, 2022.
    [30] M. Besta et al., "Graph of thoughts: Solving elaborate problems with large language models," in Proceedings of the AAAI Conference on Artificial Intelligence, 2024, vol. 38, no. 16, pp. 17682-17690.
    [31] X. Wang et al., "Self-consistency improves chain of thought reasoning in language models," arXiv preprint arXiv:2203.11171, 2022.
    [32] S. Yao et al., "Tree of thoughts: Deliberate problem solving with large language models," Advances in neural information processing systems, vol. 36, pp. 11809-11822, 2023.
    [33] R. Liu, J. Geng, A. J. Wu, I. Sucholutsky, T. Lombrozo, and T. L. Griffiths, "Mind your step (by step): Chain-of-thought can reduce performance on tasks where thinking makes humans worse," arXiv preprint arXiv:2410.21333, 2024.
    [34] Y. Wen, Z. Wang, and J. Sun, "Mindmap: Knowledge graph prompting sparks graph of thoughts in large language models," arXiv preprint arXiv:2308.09729, 2023.
    [35] R. S. Lazarus, "Psychological stress and the coping process," 1966.
    [36] S. Schmidt, C. Tinti, L. J. Levine, and S. Testa, "Appraisals, emotions and emotion regulation: An integrative approach," Motivation and emotion, vol. 34, pp. 63-72, 2010.
    [37] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, "Bleu: a method for automatic evaluation of machine translation," in Proceedings of the 40th annual meeting of the Association for Computational Linguistics, 2002, pp. 311-318.
    [38] C.-Y. Lin, "Rouge: A package for automatic evaluation of summaries," in Text summarization branches out, 2004, pp. 74-81.
    [39] J. Li, M. Galley, C. Brockett, J. Gao, and B. Dolan, "A diversity-promoting objective function for neural conversation models," arXiv preprint arXiv:1510.03055, 2015.
    [40] Y. Liu, D. Iter, Y. Xu, S. Wang, R. Xu, and C. Zhu, "G-eval: NLG evaluation using gpt-4 with better human alignment," arXiv preprint arXiv:2303.16634, 2023.

    校內:2026-08-07公開
    校外:2026-08-07公開
    電子論文尚未授權公開,紙本請查館藏目錄