
Author: Hsieh, Chang-Ta (謝昌達)
Title: An Effective Deep Learning Model for Polyp Segmentation (一個應用於息肉分割的有效深度學習模型)
Advisor: Tai, Shen-Chuan (戴顯權)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2022
Graduation Academic Year: 110
Language: English
Pages: 66
Chinese Keywords: 醫學影像、息肉分割、深度學習、Transformer、注意力機制
English Keywords: medical image, polyp segmentation, deep learning, Transformer, attention mechanism
    Colonoscopy is a method used to detect colorectal polyps and can help prevent cancerous changes at an early stage. However, polyps vary widely in shape and closely resemble the surrounding mucosal tissue of the bowel, so accurately detecting the location and shape of a polyp takes considerable time and manpower, and misjudgments are common.
    This thesis proposes a new deep learning model architecture. It uses a Transformer and a CNN as dual backbones to learn features at different levels, from low-level to high-level, and integrates these features with a Fusion Block and Cross-Features Attention to improve accuracy. The Fusion Block processes the low-level features, while Cross-Features Attention combines the processed low-level features with the high-level features through an attention mechanism. The training sets are Kvasir-SEG and CVC-ClinicDB; the testing sets are Kvasir-SEG, CVC-ClinicDB, CVC-ColonDB, ETIS, and EndoScene. The experimental results show that the proposed method predicts the locations and shapes of polyps more accurately.

    Colonoscopy is considered an effective technique for detecting colon polyps and can help prevent cancer at an early stage. However, polyps vary widely in shape, and the boundary between a polyp and the surrounding mucosa is often indistinct, so accurately detecting the location and shape of polyps takes considerable time and labor, and misclassification is common.
    This thesis proposes a deep learning model that uses a Transformer and a CNN as the backbones of the model, training different levels of features from low-level to high-level and then integrating these features with a Fusion Block and Cross-Features Attention. The Fusion Block processes the low-level features, while Cross-Features Attention combines the processed low-level features with the high-level features. The training sets are Kvasir-SEG and CVC-ClinicDB, and the testing sets are Kvasir-SEG, CVC-ClinicDB, CVC-ColonDB, ETIS, and EndoScene. The experimental results show that the proposed method produces more accurate segmentation predictions.
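    The abstract names three architectural pieces: a Transformer/CNN dual backbone, a Fusion Block that handles low-level features, and a Cross-Features Attention that combines them with high-level features. The PyTorch sketch below shows one plausible way such modules could be wired together; the module names come from the thesis, but their internals (channel sizes, concatenate-and-refine fusion, sigmoid gating) and the toy shapes are assumptions for illustration, not the author's implementation.

```python
# A minimal sketch, assuming a dual-backbone design: "FusionBlock" and
# "CrossFeaturesAttention" are the thesis's module names, but everything
# inside them here is hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FusionBlock(nn.Module):
    """Fuse low-level CNN features with low-level Transformer features."""

    def __init__(self, cnn_ch: int, trans_ch: int, out_ch: int):
        super().__init__()
        self.proj = nn.Conv2d(cnn_ch + trans_ch, out_ch, kernel_size=1)
        self.refine = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, f_cnn: torch.Tensor, f_trans: torch.Tensor) -> torch.Tensor:
        # Align spatial sizes, concatenate along channels, then refine with a 3x3 conv.
        f_trans = F.interpolate(f_trans, size=f_cnn.shape[2:], mode="bilinear",
                                align_corners=False)
        return self.refine(self.proj(torch.cat([f_cnn, f_trans], dim=1)))


class CrossFeaturesAttention(nn.Module):
    """Let a high-level feature map gate the fused low-level features."""

    def __init__(self, low_ch: int, high_ch: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(high_ch, low_ch, kernel_size=1),
                                  nn.Sigmoid())
        self.head = nn.Conv2d(low_ch, 1, kernel_size=1)

    def forward(self, f_low: torch.Tensor, f_high: torch.Tensor) -> torch.Tensor:
        # Upsample the high-level map and use it as an attention gate on f_low.
        g = F.interpolate(self.gate(f_high), size=f_low.shape[2:],
                          mode="bilinear", align_corners=False)
        return self.head(f_low * g)  # coarse segmentation logits


if __name__ == "__main__":
    f_cnn = torch.randn(1, 64, 88, 88)     # low-level CNN-backbone features
    f_trans = torch.randn(1, 64, 44, 44)   # low-level Transformer-backbone features
    f_high = torch.randn(1, 256, 11, 11)   # high-level features (e.g. a decoder output)
    fused = FusionBlock(64, 64, 64)(f_cnn, f_trans)
    logits = CrossFeaturesAttention(64, 256)(fused, f_high)
    print(logits.shape)  # torch.Size([1, 1, 88, 88])
```

    In the full model, the low-level inputs would presumably come from early stages of the two backbones, the high-level map from the partial decoder listed in Chapter 3, and the resulting outputs would be trained with the deeply supervised BCE, IoU, and Dice losses also listed in the table of contents.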

    Abstract (Chinese) i
    Abstract ii
    Acknowledgments iii
    Contents iv
    List of Tables vii
    List of Figures ix
    Chapter 1 Introduction 1
      1.1 Colorectal Cancer 1
      1.2 Colonoscopy 2
      1.3 Overview 2
    Chapter 2 Background and Related Works 4
      2.1 Traditional Methods 4
      2.2 Deep Learning-based Methods 5
      2.3 Attention Mechanism 7
        2.3.1 Convolutional Block Attention Module 9
        2.3.2 Channel Attention Module 11
        2.3.3 Spatial Attention Module 11
        2.3.4 Coordinate Attention 12
      2.4 Vision Transformer 14
        2.4.1 Multi-head Self-attention 17
        2.4.2 Cross Attention 19
        2.4.3 Pyramid Vision Transformer 19
      2.5 Partial Decoder 23
      2.6 Deep Supervision 24
    Chapter 3 The Proposed Algorithm 26
      3.1 Data Preprocessing 29
        3.1.1 Data Augmentation 29
        3.1.2 Multi-scale Training Strategy 30
      3.2 Proposed Network Architecture 31
        3.2.1 Encoder Block 34
        3.2.2 Fusion Block 36
        3.2.3 Partial Decoder 37
        3.2.4 Cross Feature Attention 38
      3.3 Loss Function 39
        3.3.1 Deep Supervision 39
        3.3.2 Binary Cross Entropy Loss 40
        3.3.3 Intersection over Union Loss 40
        3.3.4 Dice Loss 41
    Chapter 4 Experiment 42
      4.1 Experimental Dataset 42
      4.2 Experimental Setting and Implementation Details 44
      4.3 Experimental Results 45
        4.3.1 Evaluation Metrics 45
        4.3.2 Comparison With State-of-the-Art Methods 47
      4.4 Ablation Experimental Results 53
        4.4.1 Ablation Study of Loss Hyperparameters 53
        4.4.2 Ablation Study of Feature Selection 54
        4.4.3 Ablation Study of the Proposed Network Architecture 56
        4.4.4 Ablation Study of Coordinate Attention 58
    Chapter 5 Conclusion and Future Work 60
      5.1 Conclusion 60
      5.2 Future Work 60
    References 61

