
Graduate student: 董錦松 (Tung, Chin-Sung)
Thesis title: 以混合人工智慧模型實現自動化腦電圖背景分析與報告生成系統
Implementation of a Hybrid Artificial Intelligence System for Automated EEG Background Analysis and Report Generation
Advisors: 楊中平 (Young, Chung-Ping)
梁勝富 (Liang, Sheng-Fu)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2025
Academic year of graduation: 113
Language: Chinese
Pages: 105
Keywords: Electroencephalography, Artificial Intelligence, Deep Learning, Abnormality Detection, Large Language Models, Posterior Dominant Rhythm, Automated Diagnosis
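  • Background and objective: Analysis of electroencephalogram (EEG) background activity is an important tool in neurological diagnosis, but small healthcare facilities often face diagnostic challenges because they lack specialized personnel and advanced analysis systems. This study aims to develop a hybrid artificial intelligence system that automatically detects EEG background abnormalities and generates structured reports, in order to improve diagnostic efficiency and accuracy.
    Methods: This study integrates deep learning, rule-based algorithms, and large language models. A total of 2,494 EEG recordings were preprocessed, of which 1,530 were used to train the posterior dominant rhythm (PDR) prediction model. The system uses an ensemble deep learning architecture for PDR prediction, combines it with expert rules to detect generalized background slowing and focal abnormalities, and uses Google Gemini 1.5 Pro to generate structured reports. Performance was validated on a custom validation dataset (100 recordings) and the public TUAB dataset (276 recordings), and compared with the performance of neurologists.
    Results: The ensemble deep learning model performed strongly in PDR prediction, with a mean absolute error (MAE) of 0.237 Hz, a root mean square error (RMSE) of 0.359 Hz, a coefficient of determination (r²) of 0.952, and 91.8% accuracy within an error of 0.6 Hz. In abnormality detection, the system significantly outperformed neurologists in detecting generalized background slowing (F1 score 0.93 vs. 0.82, p = 0.02), and reached an F1 score of 0.71 for focal abnormality detection. The 512 reports generated by the large language model reached 100% accuracy, and multi-model verification (Gwet AC1 > 0.97) confirmed the absence of AI hallucinations. In cross-institutional validation the system generalized well, with F1 scores of 0.835 on TUAB and 0.887 on the institutional dataset, a difference that was not statistically significant (p > 0.59). A grid search over 1,296 parameter combinations produced an optimized threshold set for which the lowest F1 score across the two datasets was 0.878.
    Conclusion: The hybrid AI system developed in this study performs well in EEG background analysis and reaches high-precision analysis with only about 1,500 training recordings, nearly a 20-fold reduction in data requirements compared with leading international systems. The system offers good cross-institutional generalization, multilingual report generation, and low hardware requirements, making it a practical solution for resource-limited healthcare settings. Future work could extend it to epilepsy detection, multicenter validation, and real-time monitoring, promoting the clinical adoption of AI-assisted EEG diagnosis.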
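The record cites Gwet's AC1 as the agreement statistic for the multi-model report verification but does not show how it is computed. The following is a minimal sketch for the simplest case only, two raters and binary verdicts, using the standard formula AC1 = (p_a - p_e) / (1 - p_e) with p_e = 2π(1 - π). The function name `gwet_ac1`, the verdict encoding (1 = report judged accurate, 0 = inaccurate), and the sample verdicts are assumptions for illustration; the thesis may use more raters, more categories, or a different implementation.

```python
def gwet_ac1(rater_a, rater_b):
    """Gwet's AC1 for two raters giving binary (0/1) ratings.

    Illustrative sketch only; not the thesis's own verification code.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items on which both raters agree.
    p_a = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Mean proportion of positive (1) ratings across the two raters.
    pi = (sum(rater_a) + sum(rater_b)) / (2 * n)
    # Chance agreement under Gwet's model for two categories.
    p_e = 2 * pi * (1 - pi)
    # p_e is at most 0.5, so the denominator is never zero.
    return (p_a - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical verdicts from two verifier models on 50 reports.
    verifier_1 = [1] * 49 + [0]
    verifier_2 = [1] * 50
    print(f"AC1 = {gwet_ac1(verifier_1, verifier_2):.4f}")
```

AC1 is often preferred over Cohen's kappa when nearly all ratings fall in one category, as happens when almost every report is judged accurate, because kappa can collapse toward zero in that situation even though raw agreement is high.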

    Electroencephalogram (EEG) background activity analysis is crucial for neurological diagnosis, but small healthcare facilities often lack specialized personnel and advanced analysis systems. This study developed a hybrid artificial intelligence system integrating deep learning, rule-based algorithms, and large language models for automated EEG background abnormality detection and structured report generation. A total of 2,494 EEG records were used, 1,530 of them for training the posterior dominant rhythm (PDR) prediction model. The ensemble deep learning architecture achieved a mean absolute error (MAE) of 0.237 Hz, a root mean square error (RMSE) of 0.359 Hz, a coefficient of determination (r²) of 0.952, and 91.8% accuracy within 0.6 Hz. The system significantly outperformed neurologists in generalized background slowing detection (F1 score 0.93 vs. 0.82, p = 0.02) and achieved an F1 score of 0.71 for focal abnormality detection. Large language model-generated reports demonstrated 100% accuracy across 512 test cases with no AI hallucinations. Cross-institutional validation showed robust generalization, with F1 scores of 0.835 and 0.887 for the TUAB and institutional datasets, respectively. The system requires only about 1,500 training samples, a nearly 20-fold reduction compared with leading international systems, while providing multilingual report generation and low hardware requirements for resource-limited healthcare environments.
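The four PDR prediction metrics quoted in the abstract (MAE, RMSE, r², and accuracy within 0.6 Hz) can be illustrated with a short sketch. Everything named here is an assumption for illustration: the function `pdr_metrics`, the inclusive `<=` tolerance test, the choice of 1 - SS_res/SS_tot for r² (the thesis may instead report a squared correlation), and the sample frequencies, which are invented and not taken from the study.

```python
import math


def pdr_metrics(y_true, y_pred, tolerance_hz=0.6):
    """Regression metrics for predicted vs. labeled PDR frequency (Hz).

    Illustrative sketch only; not the thesis's evaluation code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    # Mean absolute error and root mean square error, both in Hz.
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # Coefficient of determination, assumed here to be 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r2 is undefined when all labels are identical")
    r2 = 1 - ss_res / ss_tot
    # Fraction of predictions whose absolute error is within the tolerance.
    within = sum(abs(e) <= tolerance_hz for e in errors) / n
    return {"mae": mae, "rmse": rmse, "r2": r2, "within_tolerance": within}


if __name__ == "__main__":
    # Hypothetical labeled and predicted PDR frequencies in Hz.
    labeled = [8.0, 9.0, 10.0, 11.0]
    predicted = [8.5, 9.0, 9.0, 11.0]
    for name, value in pdr_metrics(labeled, predicted).items():
        print(f"{name}: {value:.3f}")
```

Reporting a tolerance-based accuracy alongside MAE and RMSE is a reasonable design choice for a clinical quantity like PDR, since it states directly how often a prediction falls within a margin a reader would accept.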

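    Abstract
    Extended Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    Symbols
    Chapter 1 Introduction
    1.1 Research Background and Motivation
    1.2 Research Purpose and Objectives
    1.2.1 Developing a High-Performance Posterior Dominant Rhythm Prediction Model
    1.2.2 Implementing an Integrated Algorithm for Unsupervised Artifact Removal and Abnormality Detection
    1.2.3 Implementing Automatic Generation of Structured EEG Reports
    1.2.4 Validating System Accuracy, Stability, and Generalization
    1.3 Research Novelty and Contributions
    Chapter 2 Research Background and Literature Review
    2.1 Clinical Applications of EEG
    2.2 EEG Artifacts and Unsupervised Removal Techniques
    2.3 Development of Automated EEG Analysis
    2.4 EEG Report Generation and Large Language Models
    2.5 Limitations of Existing Research and Positioning of This Study
    Chapter 3 Methods
    3.1 Study Design and Ethical Considerations
    3.2 System Architecture and Workflow
    3.2.1 Stage 1: Data Preprocessing and Artifact Removal (Sections 3.3-3.5)
    3.2.2 Stage 2: PDR Prediction and Abnormality Detection (Sections 3.6-3.8)
    3.2.3 Stage 3: Structured Report Generation (Section 3.10)
    3.3 EEG Data
    3.3.1 Data Acquisition
    3.3.2 EEG Equipment and Recording
    3.4 EEG Data Preprocessing
    3.4.1 EEG File Conversion
    3.4.2 EEG Re-referencing and Segmentation
    3.4.3 EEG Frequency Band Definitions
    3.4.4 Selection of Awake Eyes-Closed EEG Segments
    3.5 EEG Artifact Handling
    3.5.1 Neighbor-HBOS Artifact Detection Method
    3.5.2 Neighboring Electrode Definition and Spatial Relationships
    3.5.3 Artifact Repair Strategy
    3.6 Obtaining a Feature for Interpreting the EEG Background: Posterior Dominant Rhythm
    3.6.1 Dataset
    3.6.2 Labeling
    3.6.3 Feature Extraction
    3.6.4 Model Architecture
    3.6.5 Training Environment and Parameters
    3.6.6 Validation Method
    3.6.7 Performance Metrics
    3.6.8 Statistical Analysis
    3.7 Other EEG Features
    3.7.1 Anterior-Posterior Gradient (AP Gradient)
    3.7.2 Total Power of the Left and Right Hemispheres
    3.7.3 Slow-Band Power Ratio (1.5 to 8 Hz)
    3.7.4 Left-Right α, θ, and δ Band Power Ratios
    3.8 Algorithms for Interpreting EEG Abnormalities
    3.8.1 Generalized Background Slowing (GBS)
    3.8.2 Background Asymmetry
    3.8.3 Focal Slow Wave
    3.9 Accuracy Validation of the Hybrid AI System
    3.9.1 Validation Data Sources and Methods
    3.9.2 Performance Metrics and Statistical Analysis
    3.9.3 Cross-Dataset Threshold Consistency Validation
    3.10 Generating the EEG Background Report
    3.10.1 Input Data Processing
    3.10.2 Prompt Engineering
    3.10.3 Report Generation Workflow
    3.10.4 Analysis of the Effect of Prompt Length on Report Quality
    3.10.5 Validation of Multilingual Output Capability
    3.11 Accuracy Validation of Large Language Model Reports
    3.11.1 Validation Method Design
    3.11.2 Validation Workflow
    3.11.3 Accuracy Assessment
    3.11.4 Validation Prompt Design
    Chapter 4 Results
    4.1 Clinical Background and Basic Data Analysis of Cases
    4.1.1 Diagnostic Characteristics of Cases
    4.1.2 Age Distribution
    4.1.3 Sex Distribution
    4.2 Feature Analysis Results for the Full Dataset
    4.3 Feature Analysis Results for the TUAB Dataset
    4.4 Posterior Dominant Rhythm Prediction Results
    4.4.1 K-fold Cross-validation Results
    4.4.2 Validation Results for Different Dataset Sizes
    4.4.3 Performance Comparison Across Training Datasets (Artifact Repair)
    4.4.4 Performance of Different Model Architectures
    4.4.5 Effect of Artifact Removal on the EEG Dataset
    4.5 Accuracy Comparison Between the Hybrid AI System and Neurologists' Interpretations
    4.6 Performance Validation on the TUAB Dataset
    4.7 Univariate Performance Trend Analysis
    4.7.1 Cross-Institutional Balanced Threshold Screening
    4.7.2 Final Parameter Selection
    4.8 Cross-Institutional Balanced Threshold Validation
    4.9 Accuracy Validation of Large Language Model Report Generation
    4.10 Summary of Results
    Chapter 5 Conclusions and Future Directions
    5.1 Conclusions
    5.1.1 Technical Achievements and Performance
    5.1.2 Innovative Contributions and Breakthroughs
    5.1.3 Threshold Optimization and Cross-Institutional Validation
    5.1.4 System Practicality and Deployability
    5.2 Limitations and Future Directions
    5.2.1 Current Limitations
    5.2.2 Directions for Technical Improvement
    5.2.3 Ethical and Safety Considerations
    5.3 Summary and Outlook
    References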


    Full-text access: on campus, available immediately; off campus, available immediately.