| Graduate student: | 薛辰昇 Xue, Chen-Sheng |
|---|---|
| Thesis title: | 融合個人化與資料增強之對話式雙相情感障礙評估方法 A Dialogue-Based Assessment of Bipolar Disorder Incorporating Personalization and Data Augmentation |
| Advisor: | 吳宗憲 Wu, Chung-Hsien |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2025 |
| Academic year of graduation: | 113 |
| Language: | English |
| Number of pages: | 79 |
| Keywords (Chinese): | 雙相障礙、對話生成、資料擴增、個性化情感支持、HY評分 |
| Keywords (English): | Bipolar disorder, Dialogue generation, Data augmentation, Personalized emotional support, HY scoring |
Bipolar disorder, also known as manic-depressive illness, is a complex psychiatric condition characterized by alternating episodes of mania and depression, or by mixed affective states. Clinicians routinely use the Hamilton Depression Rating Scale (HAMD) and the Young Mania Rating Scale (YMRS) to assess depressive and manic states. However, growing demand for evaluations and limited staffing often make continuous, personalized monitoring impractical, which in turn compromises treatment efficacy.
To address this challenge, we propose a bipolar disorder assessment system that integrates dialogue generation with automated scoring. Based on the patient's input, the system retrieves entries from the patient's personal profile store and items from the HY scales (HAMD and YMRS). Our proposed selection mechanism then picks the appropriate profile content or HY item, which is combined with the patient's actual input into a prompt for a GLM-based dialogue generation model. The model produces either a personalized response that fits the patient's background or an HY-related assessment question.
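The retrieve, select, and prompt-assembly flow described above can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not specify the selection mechanism, so a plain bag-of-words cosine similarity stands in for it, and the names `select_context` and `build_prompt`, the prompt wording, and the sample profile entries and scale items are all hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector as a Counter."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_context(user_input, profile_entries, scale_items):
    """Return (kind, text) for the candidate most similar to the input.

    Assumption: similarity-based selection is a stand-in for the
    thesis's selection mechanism, which the abstract does not detail.
    """
    query = bow(user_input)
    candidates = [("profile", e) for e in profile_entries]
    candidates += [("scale_item", s) for s in scale_items]
    return max(candidates, key=lambda c: cosine(query, bow(c[1])))


def build_prompt(user_input, kind, text):
    """Combine the selected context with the patient's input (hypothetical wording)."""
    if kind == "scale_item":
        instruction = "Ask a follow-up question that assesses this scale item."
    else:
        instruction = "Reply supportively, using this background about the patient."
    return f"{instruction}\nContext: {text}\nPatient: {user_input}\nAssistant:"


# Hypothetical data for illustration only.
profile_entries = ["works night shifts at a hospital", "lives with her sister"]
scale_items = [
    "insomnia: difficulty falling asleep",
    "depressed mood: feelings of sadness",
]
user_input = "I have had difficulty falling asleep lately"

kind, text = select_context(user_input, profile_entries, scale_items)
prompt = build_prompt(user_input, kind, text)
```

In this toy case the input overlaps most with the insomnia item, so the assembled prompt asks for an assessment question rather than a supportive reply. A real system would presumably replace the bag-of-words similarity with dense embeddings and a vector index.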
After the patient answers an HY assessment question, the system passes the text to a MacBERT-based scoring model for psychological evaluation. Because the training data come from real hospital assessment records, they are severely scarce and class-imbalanced, so we apply a multi-step data augmentation procedure. After each dialogue, the system uses the GLM model to summarize key events and profile the patient, then updates the patient's personal profile to support subsequent assessment and treatment.
Experimental results show that, compared with the baseline system, the proposed system improves on multiple metrics. The perplexity (PPL) of the response generation model decreased by 118.06, while Distinct-1 and Distinct-2 scores increased by 0.07 and 0.91, respectively. Sentence fluency and information content scores improved by 0.92 and 0.90, respectively. For the HY scale item selection task, the F1 score increased by 0.17. Personal-F1 and Persona-Coverage rose by 0.023 and 0.001, respectively. For the HAMD and YMRS scoring models, training on the augmented dataset yielded a mean absolute error (MAE) of 0.4194, which is lower than the minimal clinically important difference (MCID) of both scales. These findings indicate that the system not only delivers efficient and accurate clinical assessments for individuals with bipolar disorder but also provides ongoing, personalized emotional support during conversations, offering strong support for the psychotherapy process.
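For readers unfamiliar with the automatic metrics reported above, the following sketch shows how Distinct-n, perplexity, and MAE are commonly computed. It uses the standard textbook definitions and whitespace tokenization, which are assumptions here; the thesis's exact tokenization and evaluation scripts are not given in the abstract, and the sample data are invented purely for illustration.

```python
import math


def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over all responses."""
    total = 0
    unique = set()
    for response in responses:
        tokens = response.split()  # assumption: whitespace tokenization
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(grams)
        unique.update(grams)
    return len(unique) / total if total else 0.0


def perplexity(token_log_probs):
    """exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def mean_absolute_error(predicted, reference):
    """Average absolute difference between predicted and reference scores."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Invented toy data, not results from the thesis.
responses = ["i feel fine", "i feel tired"]
d1 = distinct_n(responses, 1)   # 4 unique unigrams out of 6
d2 = distinct_n(responses, 2)   # 3 unique bigrams out of 4
ppl = perplexity([math.log(0.5)] * 4)
mae = mean_absolute_error([2, 1, 3], [2, 2, 1])
```

Higher Distinct-n indicates more varied responses, lower perplexity indicates that the model assigns higher probability to the reference text, and an MAE below a scale's MCID means the average scoring error is smaller than the smallest change considered clinically meaningful.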
On campus: publicly available from 2026-08-31