Author: Zhuang, Ting-Rui (莊庭睿)
Thesis title: 3D LDCT Lung Nodule Detection Using Convolution-Mamba Network (使用卷積-曼巴網路於三維低劑量電腦斷層掃描影像進行肺結節偵測)
Advisors: Lien, Jenn-Jier James (連震杰); Guo, Shu-Mei (郭淑美)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Graduate Program of Artificial Intelligence
Year of publication: 2025
Academic year of graduation: 113 (ROC calendar)
Language: English
Number of pages: 94
Keywords: Pulmonary nodule, lung cancer, computed tomography, Mamba, convolution, state space model

    To enhance the efficiency and accuracy of nodule detection in low-dose computed tomography (LDCT) for early lung cancer screening, this study aims to address the bottlenecks of existing deep learning methodologies. While traditional Convolutional Neural Networks (CNNs) struggle to capture 3D global context, Transformer-based models are constrained by the substantial computational costs associated with their quadratic complexity, limiting their application in high-resolution imaging.
    To overcome these challenges, we propose an efficient, one-stage 3D pulmonary nodule detection framework centered on a novel Convolution-Mamba Network. This architecture employs a parallel dual-branch design to synergistically fuse the superior local feature extraction capabilities of convolution with the linear-time complexity advantage of the Mamba model for modeling long-range dependencies. Through channel split and channel shuffle mechanisms, we facilitate deep information exchange between these two complementary feature types. More importantly, to address Mamba's inherent limitation as a 1D sequential model, we designed an innovative Tri-View Mamba mechanism. Inspired by the multi-view reading practice of radiologists, this mechanism performs six-path, parallel, and bidirectional scanning along the three orthogonal planes—axial, coronal, and sagittal—enabling the model to capture complete and isotropic 3D spatial context.
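The six scan paths described above can be illustrated with a minimal sketch. The abstract does not state how voxels are ordered within each view, so the code below assumes that each view flattens the volume slice by slice, with the axis normal to that plane varying slowest, and that the backward path is simply the reversed forward path. The function name `tri_view_scan_orders` and the view-to-axis mapping are hypothetical and are not taken from the thesis.

```python
from itertools import product


def tri_view_scan_orders(depth, height, width):
    """Return six voxel orderings for a depth x height x width volume.

    Each ordering is a list of (z, y, x) tuples. Assumption (not from the
    thesis): a view is flattened slice by slice, with the slice axis as the
    slowest-varying index, and the backward scan is the reversed forward scan.
    """
    shape = (depth, height, width)
    # For each view, list the axes from slowest-varying to fastest-varying.
    views = {
        "axial": (0, 1, 2),     # slices stacked along z
        "coronal": (1, 0, 2),   # slices stacked along y
        "sagittal": (2, 0, 1),  # slices stacked along x
    }
    orders = {}
    for name, perm in views.items():
        forward = []
        for idx in product(*(range(shape[axis]) for axis in perm)):
            voxel = [0, 0, 0]
            for axis, value in zip(perm, idx):
                voxel[axis] = value  # map the permuted index back to (z, y, x)
            forward.append(tuple(voxel))
        orders[name + "_forward"] = forward
        orders[name + "_backward"] = forward[::-1]
    return orders


# Small demonstration on a 2 x 2 x 3 volume.
orders = tri_view_scan_orders(2, 2, 3)
for name, sequence in orders.items():
    print(name, sequence[:3])
```

In a full model, each of the six sequences would presumably be processed by a sequence model such as a Mamba block and the outputs mapped back to voxel positions and merged; that part is omitted here because the abstract does not describe it.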
    Validated on a clinical dataset from National Cheng Kung University Hospital, our model achieved a recall of 83.9% at a clinically common operating point of 2 false positives per scan, with the recall increasing to 92.8% at 8 false positives per scan. Case studies further demonstrate that our method exhibits a superior recognition capability for challenging nodules, particularly those adhering to surrounding tissues, which were previously difficult to identify. In summary, this study successfully extends the advantages of the Mamba architecture to 3D medical image detection tasks and resolves its spatial adaptation challenges through the proposed Tri-View Mamba and Convolution-Mamba modules.
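To make the reported operating points concrete, the following sketch shows one common way to compute recall at a fixed number of false positives per scan. It assumes that every detection has already been labeled as a true or false positive against the ground truth, that each true positive corresponds to a distinct nodule, and that score ties can be ignored; the matching criterion used in the thesis is not given in the abstract. The function name `recall_at_fp_per_scan` and the toy data are hypothetical.

```python
def recall_at_fp_per_scan(detections, num_nodules, num_scans, fp_per_scan):
    """Recall reached before the false-positive budget is exceeded.

    detections: list of (score, is_true_positive) pairs pooled over all scans.
    Assumption: each true positive matches a distinct ground-truth nodule.
    """
    if num_nodules <= 0 or num_scans <= 0:
        raise ValueError("num_nodules and num_scans must be positive")
    max_fp = fp_per_scan * num_scans  # total false positives allowed
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    for _, is_tp in ranked:  # lower the score threshold one detection at a time
        if is_tp:
            tp += 1
        elif fp + 1 > max_fp:
            break  # accepting this detection would exceed the budget
        else:
            fp += 1
    return tp / num_nodules


# Toy data: 2 scans containing 4 nodules in total.
toy = [(0.95, True), (0.90, False), (0.80, True), (0.70, False),
       (0.60, False), (0.50, True), (0.40, False)]
print(recall_at_fp_per_scan(toy, num_nodules=4, num_scans=2, fp_per_scan=1))
print(recall_at_fp_per_scan(toy, num_nodules=4, num_scans=2, fp_per_scan=2))
```

Raising the false-positive budget can only keep or increase the recall, which is consistent with the higher recall reported at 8 false positives per scan than at 2.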

    ABSTRACT (CHINESE)
    ABSTRACT
    ACKNOWLEDGEMENTS
    TABLE OF CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    CHAPTER 1 INTRODUCTION
        1.1 MOTIVATION AND OBJECTIVE
        1.2 LUNG NODULE DIAGNOSIS PROCESS
            1.2.1 The Doctor’s Lung Nodule Diagnosis Process
            1.2.2 Computer-Aided Lung Nodule Diagnosis Process
        1.3 GLOBAL FRAMEWORK
            1.3.1 Data Preprocessing and Cube Set Extraction
            1.3.2 Candidate Lung Nodule Detection using Convolution-Mamba Network
            1.3.3 Post-processing: Lung Nodule Decision
        1.4 RELATED WORKS
            1.4.1 Methods Based on Convolutional Neural Networks
            1.4.2 Methods Based on Transformers
            1.4.3 Methods Based on Mamba
        1.5 CONTRIBUTION
    CHAPTER 2 SYSTEM SETUP AND SPECIFICATION
        2.1 HARDWARE SPECIFICATIONS
        2.2 GUI: THE RELABEL TOOL
    CHAPTER 3 DATA PREPROCESSING AND POST-PROCESSING
        3.1 DATA PREPROCESSING
            3.1.1 Resample Spacing using Cubic Spline Interpolation
            3.1.2 Lobe Segmentation Using 3D Connected Component Labeling
            3.1.3 Scan Cropping
            3.1.4 HU Normalization
        3.2 DATA POST-PROCESSING
            3.2.1 Maps Decoding
            3.2.2 3D Non-Maximum Suppression (NMS)
    CHAPTER 4 3D NODULE DETECTION: DATA PREPROCESSING AND POSTPROCESSING
        4.1 MAMBA [6]
            4.1.1 State Space Model (SSM)
            4.1.2 Discretization Process
            4.1.3 SSM Operational Example
            4.1.4 Selective State Space Model (S6) [6]
        4.2 CONVOLUTION-MAMBA NETWORK
            4.2.1 Procedure of the Convolution-Mamba Network
            4.2.2 Ground Truth Map Creation [18]
            4.2.3 Loss Function
            4.2.4 Conv-Mamba Module
    CHAPTER 5 EXPERIMENT
        5.1 DATA COLLECTION AND METRICS
            5.1.1 Data Collection
            5.1.2 Metrics
        5.2 EXPERIMENTAL RESULT
            5.2.1 Overall Performance Evaluation
            5.2.2 Ablation Studies
        5.3 RESULT ANALYSIS
            5.3.1 Hard False Negative Analysis
            5.3.2 Hard False Positive Analysis
    CHAPTER 6 CONCLUSION AND FUTURE WORK
        6.1 CONCLUSION
        6.2 FUTURE WORK
    REFERENCE

    [1] H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, Q. Tian, and M. Wang, “Swin-Unet: Unet-Like Pure Transformer for Medical Image Segmentation,” in Proceedings of the European Conference on Computer Vision, pp. 205-218, 2022.
    [2] J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, and Y. Zhou, “TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation,” arXiv preprint arXiv:2102.04306, 2021.
    [3] J. Christensen, A. E. Prosper, C. C. Wu, J. Chung, E. Lee, B. Elicker, A. R. Hunsaker, M. Petranovic, K. L. Sandler, and B. Stiles, “ACR Lung-RADS v2022: Assessment Categories and Management Recommendations,” Journal of the American College of Radiology, 21(3), pp. 473-488, 2024.
    [4] M. Dolejší, and J. Kybic, “Automatic Two-Step Detection of Pulmonary Nodules,” in Proceedings of the Medical Imaging 2007: Computer-Aided Diagnosis, pp. 1093-1104, 2007.
    [5] R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440-1448, 2015.
    [6] A. Gu, and T. Dao, “Mamba: Linear-Time Sequence Modeling with Selective State Spaces,” arXiv preprint arXiv:2312.00752, 2023.
    [7] Z. Guo, L. Zhao, J. Yuan, and H. Yu, “MSANet: Multiscale Aggregation Network Integrating Spatial and Channel Information for Lung Nodule Detection,” IEEE Journal of Biomedical and Health Informatics, 26(6), pp. 2547-2558, 2021.
    [8] A. Hatamizadeh, and J. Kautz, “MambaVision: A Hybrid Mamba-Transformer Vision Backbone,” in Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 25261-25270, 2025.
    [9] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
    [10] V. T. Hu, S. A. Baumann, M. Gui, O. Grebenkova, P. Ma, J. Fischer, and B. Ommer, “ZigMa: A DiT-Style Zigzag Mamba Diffusion Model,” in Proceedings of the European Conference on Computer Vision, pp. 148-166, 2024.
    [11] T. Huang, X. Pei, S. You, F. Wang, C. Qian, and C. Xu, “LocalMamba: Visual State Space Model with Windowed Selective Scan,” in Proceedings of the European Conference on Computer Vision, pp. 12-22, 2024.
    [12] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal Loss for Dense Object Detection,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 2980-2988, 2017.
    [13] Y. Liu, Y. Tian, Y. Zhao, H. Yu, L. Xie, Y. Wang, Q. Ye, J. Jiao, and Y. Liu, “VMamba: Visual State Space Model,” Advances in Neural Information Processing Systems, 37, pp. 103031-103063, 2024.
    [14] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012-10022, 2021.
    [15] J. Mei, M.-M. Cheng, G. Xu, L.-R. Wan, and H. Zhang, “SANet: A Slice-Aware Network for Pulmonary Nodule Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(8), pp. 4374-4387, 2021.
    [16] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234-241, 2015.
    [17] A. A. A. Setio, A. Traverso, T. De Bel, M. S. Berens, C. Van Den Bogaard, P. Cerello, H. Chen, Q. Dou, M. E. Fantacci, and B. Geurts, “Validation, Comparison, and Combination of Algorithms for Automatic Detection of Pulmonary Nodules in Computed Tomography Images: The LUNA16 Challenge,” Medical Image Analysis, 42, pp. 1-13, 2017.
    [18] T. Song, J. Chen, X. Luo, Y. Huang, X. Liu, N. Huang, Y. Chen, Z. Ye, H. Sheng, and S. Zhang, “CPM-Net: A 3D Center-Points Matching Network for Pulmonary Nodule Detection in CT Scans,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 550-559, 2020.
    [19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention Is All You Need,” Advances in Neural Information Processing Systems, 30, 2017.
    [20] J. Wang, J. Chen, D. Chen, and J. Wu, “Large Window-Based Mamba Unet for Medical Image Segmentation: Beyond Convolution and Self-Attention,” CoRR, 2024.
    [21] Z. Xing, T. Ye, Y. Yang, G. Liu, and L. Zhu, “SegMamba: Long-Range Sequential Modeling Mamba for 3D Medical Image Segmentation,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 578-588, 2024.
    [22] Z. Xu, T. Li, Y. Liu, Y. Zhan, J. Chen, and T. Lukasiewicz, “PAC-Net: Multi-Pathway FPN with Position Attention Guided Connections and Vertex Distance IoU for 3D Medical Image Detection,” Frontiers in Bioengineering and Biotechnology, 11, pp. 1049555, 2023.
    [23] Y. Yue, and Z. Li, “MedMamba: Vision Mamba for Medical Image Classification,” arXiv preprint arXiv:2403.03849, 2024.
    [24] S. Zheng, L. J. Cornelissen, X. Cui, X. Jing, R. N. Veldhuis, M. Oudkerk, and P. M. van Ooijen, “Deep Convolutional Neural Networks for Multiplanar Lung Nodule Detection: Improvement in Small Nodule Identification,” Medical Physics, 48(2), pp. 733-744, 2021.
    [25] D. Zhou, J. Fang, X. Song, C. Guan, J. Yin, Y. Dai, and R. Yang, “IoU Loss for 2D/3D Object Detection,” in Proceedings of the 2019 International Conference on 3D Vision (3DV), pp. 85-94, 2019.
    [26] L. Zhu, B. Liao, Q. Zhang, X. Wang, W. Liu, and X. Wang, “Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model,” arXiv preprint arXiv:2401.09417, 2024.

    Full-text availability: on campus, immediately available; off campus, immediately available.