
Graduate Student: Lai, Ting-Yu (賴亭諭)
Title: Lung Nodule Detection Based on the Residual U-Net with Multi-attention and Multi-scale Feature Fusion (肺結節偵測基於多重注意力機制與多尺度特徵融合之殘差架構U-Net)
Advisor: Hsu, Yi-Yu (徐禕佑)
Degree: Master
Department: 敏求智慧運算學院, MS Degree Program on Intelligent Technology Systems
Year of Publication: 2023
Academic Year of Graduation: 112
Language: Chinese
Pages: 69
Keywords (Chinese): 肺結節、醫學影像處理、電腦輔助診斷系統
Keywords (English): Lung nodules, Medical image processing, Computer-aided diagnosis system

    In recent years, computer-aided diagnosis systems have been applied to various diseases; early detection of lung cancer, in particular, can be supported by image processing and deep learning techniques. The objective of this study is to detect and label the positions of lung nodules in computed tomography (CT) images. We propose a model that both screens for and marks lung nodule positions. The model uses a residual U-Net architecture, incorporating multi-scale feature fusion, spatial and channel attention mechanisms, and a Vision Transformer for image segmentation.
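The multi-scale feature fusion named above can be sketched as follows: feature maps from several network depths are brought to a common spatial resolution and concatenated along the channel axis. This is a minimal NumPy illustration; the shapes, channel counts, and nearest-neighbor upsampling are assumptions for demonstration, not the thesis's exact design.

```python
import numpy as np

def upsample_nn(x: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbor upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse(features: list) -> np.ndarray:
    """Upsample every map to the largest spatial size, then concatenate channels."""
    target_h = max(f.shape[1] for f in features)
    up = [upsample_nn(f, target_h // f.shape[1]) for f in features]
    return np.concatenate(up, axis=0)

# Hypothetical encoder outputs at three depths (channels grow, resolution shrinks).
f1 = np.ones((16, 64, 64))   # shallow, high-resolution features
f2 = np.ones((32, 32, 32))   # mid-level features
f3 = np.ones((64, 16, 16))   # deep, low-resolution features
fused = fuse([f1, f2, f3])
print(fused.shape)           # (112, 64, 64): 16 + 32 + 64 channels at 64x64
```

The fused tensor lets later layers see fine spatial detail and deep semantic features at once, which is the usual motivation for this kind of fusion in segmentation networks.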

    The proposed model is evaluated on the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) dataset, where the Intersection over Union (IoU) reaches 0.82. Furthermore, compared to the control-group model, our model reduces false-negative samples by up to 39.61% and improves the accuracy of judging whether a lung nodule is present to 94.63%.
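For reference, the metrics reported above can be computed as in this minimal NumPy sketch: IoU and Dice over binary segmentation masks, and per-image nodule-presence accuracy derived from a confusion-matrix-style count. The 0/1 mask encoding and array shapes are assumptions for illustration, not the thesis's evaluation code.

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over Union between two binary masks: |A∩B| / |A∪B|."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:               # both masks empty: define IoU as 1
        return 1.0
    return np.logical_and(pred, target).sum() / union

def dice(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient: 2|A∩B| / (|A| + |B|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum())

def presence_accuracy(preds, targets) -> float:
    """Nodule-presence accuracy: fraction of cases where pred and truth agree on any()."""
    correct = sum(p.any() == t.any() for p, t in zip(preds, targets))
    return correct / len(preds)

# Two overlapping 4x4 squares on an 8x8 grid (16 px each, 9 px overlap).
pred = np.zeros((8, 8), dtype=np.uint8); pred[2:6, 2:6] = 1
true = np.zeros((8, 8), dtype=np.uint8); true[3:7, 3:7] = 1
print(round(iou(pred, true), 3))    # 9 / 23 ≈ 0.391
print(round(dice(pred, true), 4))   # 18 / 32 = 0.5625
```

False negatives, as counted in the abstract, would be the cases where `true.any()` is set but the predicted mask is empty.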

    Table of Contents
      Abstract (Chinese)
      Abstract (English)
      Table of Contents
      List of Tables
      List of Figures
      Chapter 1  Introduction
        1-1 Motivation and Objectives
        1-2 Thesis Organization
        1-3 Contributions
      Chapter 2  Background
        2-1 Pulmonary Nodules
        2-2 Multi-slice Computed Tomography (CT)
        2-3 Hounsfield Unit (HU)
      Chapter 3  Related Work
        3-1 Deep Learning
          3-1-1 Artificial Neural Networks (ANN)
          3-1-2 Convolutional Neural Networks (CNN)
          3-1-3 Recurrent Neural Networks (RNN)
        3-2 Semantic Segmentation
        3-3 U-Net
          3-3-1 Encoder
          3-3-2 Decoder
          3-3-3 Skip Connection
        3-4 Residual U-Net
        3-5 Spatial and Channel-wise Squeeze and Excitation (scSE)
        3-6 Vision Transformer (ViT)
        3-7 Multi-scale Feature Fusion
      Chapter 4  Methodology
        4-1 Data Preprocessing
          4-1-1 Image Pixel-Unit Conversion
          4-1-2 Image Cropping
        4-2 Architecture of the Lung Nodule Segmentation Model
        4-3 Residual Unit
        4-4 Multi-scale Feature Fusion
        4-5 Spatial and Channel Attention
          4-5-1 Spatial Squeeze and Channel Excitation (cSE)
          4-5-2 Channel Squeeze and Spatial Excitation (sSE)
        4-6 Vision Transformer (ViT)
        4-7 Loss Function
      Chapter 5  Experimental Results
        5-1 Overview
        5-2 Dataset
        5-3 Evaluation Metrics
          5-3-1 Dice Coefficient
          5-3-2 IoU (Intersection over Union)
          5-3-3 Confusion Matrix
        5-4 Ablation Study on Attention Mechanisms
        5-5 Study on ViT
        5-6 Study on Image Crop Size
        5-7 Comparison with Other Methods
        5-8 Comparison of the Two Proposed Models
      Chapter 6  Conclusion
      References
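The "pixel-unit conversion" preprocessing step listed in the outline typically means rescaling raw CT pixel values to Hounsfield Units (HU) via the DICOM RescaleSlope and RescaleIntercept tags, then windowing for the network input. This sketch assumes those standard tags and a typical lung window; it is not the thesis's exact preprocessing.

```python
import numpy as np

def to_hu(raw: np.ndarray, slope: float = 1.0, intercept: float = -1024.0) -> np.ndarray:
    """Convert raw scanner values to Hounsfield Units: HU = raw * slope + intercept."""
    return raw.astype(np.float32) * slope + intercept

def window(hu: np.ndarray, lo: float = -1000.0, hi: float = 400.0) -> np.ndarray:
    """Clip to an assumed lung window and scale to [0, 1] for the network input."""
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)

raw = np.array([0, 1024, 2424], dtype=np.uint16)
hu = to_hu(raw)          # -> [-1024., 0., 1400.]: air below window, water, dense bone
print(window(hu))
```

On real DICOM files the slope and intercept should be read from each slice's metadata rather than hard-coded, since they vary by scanner.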


    Full-text access: not publicly available on campus or off campus.
    The electronic thesis has not been authorized for public release; for the print copy, please consult the library catalog.