| Graduate student: | 許中瑋 Hsu, Zhong-Wei |
|---|---|
| Thesis title: | 使用老師學生SANet結合半監督學習於聯邦學習架構下的3D肺結節偵測 Federated Learning-Based 3D Lung Nodule Detection Using Teacher-Student SANet with Semi-Supervised Learning |
| Advisor: | 連震杰 Lien, Jenn-Jier |
| Co-advisors: | 張超群 Chang, Chao-Chun; 顏亦廷 Yen, Yi-Ting |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2024 |
| Academic year of graduation: | 112 (ROC calendar) |
| Language: | English |
| Number of pages: | 83 |
| Chinese keywords: | 肺結節偵測、聯邦式學習、半監督式學習、老師-學生架構 |
| English keywords: | Lung Nodule Detection, Federated Learning, Semi-Supervised Learning, Teacher-Student Architecture |
Lung cancer is the leading cause of cancer death worldwide. Low-Dose Computed Tomography (LDCT) screening can effectively reduce the risk of death by detecting early signs of lung cancer and enabling timely treatment. Deep learning models that perform lung nodule detection on CT images can quickly give physicians a preliminary understanding of whether lung nodules are present and what condition they are in. However, sufficiently accurate deep learning models often require large amounts of data, and the data a single hospital can collect is limited. Labeling all of that data is also time-consuming and labor-intensive. In addition, medical images are highly private, and their use and transmission are restricted by regulation, which makes the traditional deep learning approach of pooling data from multiple hospitals in one place difficult.

This thesis trains deep learning models in two parts. The first part uses Semi-Supervised Learning (SSL) to train a model inside a hospital on that hospital's own data: only a small portion of the data is labeled for lung nodules, and the large unlabeled remainder is also used for training. Physicians therefore do not need to label every training scan, which reduces the time and effort of manually annotating each CT image. For evaluation, the F1-score, which combines recall and precision, is used as the metric, so that high recall is pursued while the number of false positives is kept as low as possible.

The second part builds on the deep learning algorithm from the first part and uses federated learning to link three hospitals, National Cheng Kung University Hospital (HCKUH), Chiayi Christian Hospital (CYCH), and Tainan Municipal Hospital (TMH), as collaborators (CO) that train the model. A server at the College of Medicine, National Cheng Kung University, acts as the aggregator (AG), which aggregates the global model and sends the initial model to the three COs. Federated learning does not require COs to exchange training data with one another; they exchange trained model parameters instead, which avoids privacy leakage while achieving results similar to training on a large pooled dataset.

With the proposed method, and using training and testing data provided by National Cheng Kung University Hospital, the model achieved an average recall of 72.2% over nodules of all sizes. When detection focuses on nodules larger than 4 mm in diameter and ignores those smaller than 4 mm, recall reaches 74.2%.
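
The aggregation step described in the abstract, where the AG combines model parameters received from the collaborators without ever seeing their CT data, can be illustrated with a minimal sketch. The abstract does not state which aggregation rule the thesis uses, so the sample-weighted averaging below (in the style of FedAvg) is an assumption. The function name `federated_average`, the flat `Params` representation, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical flat representation of a model: layer name -> parameter values.
Params = Dict[str, List[float]]


def federated_average(updates: List[Params], num_samples: List[int]) -> Params:
    """Sample-weighted average of collaborator parameters (FedAvg-style, assumed)."""
    if not updates or len(updates) != len(num_samples):
        raise ValueError("need one sample count per collaborator update")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name in updates[0]:
        length = len(updates[0][name])
        # Each collaborator contributes in proportion to its local sample count.
        averaged[name] = [
            sum(u[name][i] * n for u, n in zip(updates, num_samples)) / total
            for i in range(length)
        ]
    return averaged


# Three collaborators send parameters only; training data stays at each site.
# Parameter values and sample counts are made up for illustration.
site_updates = [
    {"conv.weight": [1.0, 2.0], "conv.bias": [0.0]},
    {"conv.weight": [3.0, 4.0], "conv.bias": [1.0]},
    {"conv.weight": [5.0, 6.0], "conv.bias": [2.0]},
]
site_sizes = [100, 100, 200]

global_model = federated_average(site_updates, site_sizes)
print(global_model)
```

In this toy run the third site holds half of all samples, so its parameters carry half of the weight in the averaged global model. A real deployment would apply the same idea to full tensors and repeat it every communication round.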
On-campus access: public from 2029-08-22