| Graduate Student: | Bui, Thi-Hiep (裴氏俠) |
|---|---|
| Thesis Title: | Lung Nodule Segmentation Using Encoder-Decoder Learning Network with Atrous Separable Convolution |
| Advisor: | Lien, Jenn-Jier James (連震杰) |
| Co-advisor: | Guo, Shu-Mei (郭淑美) |
| Degree: | Master |
| Department: | Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2020 |
| Academic Year of Graduation: | 108 |
| Language: | English |
| Number of Pages: | 99 |
| Keywords: | Lung nodule segmentation, encoder-decoder learning network, atrous separable convolution, deep convolution neural network, deep learning |
Accurate segmentation of lung nodules in computed tomography (CT) plays an important role in the diagnosis and treatment of lung cancer. It can assist doctors in monitoring, analyzing, and evaluating the malignancy of lung nodules. However, due to the variety of lung nodules and the visual similarity between nodules and their background, robust lung nodule segmentation remains a challenging problem. Many deep learning methods for image segmentation and lung nodule segmentation have been proposed in recent years. Among them, DeepLabV3plus achieves good segmentation results. Building on spatial pyramid pooling, it introduces a novel set of parallel convolutions with different atrous rates, called atrous spatial pyramid pooling, which forces the network to focus on hard training examples. In this research, the emphasis is instead placed on the network architecture itself. Two modifications are proposed to improve the architecture and achieve higher precision, recall, and intersection over union (IoU). First, because the number of model parameters affects computation speed, an atrous separable spatial pyramid pooling is proposed and applied to the output feature maps of block 4 of a modified ResNet-101; the resulting multi-scale feature maps can then be used to segment objects at different scales. In this research, the atrous convolutions inside the original atrous spatial pyramid pooling are replaced with atrous separable convolutions. This design still obtains multi-scale feature maps in the network without explicitly resizing the input image, while also reducing the number of parameters. Second, a dual atrous separable spatial pyramid pooling is applied: one more atrous spatial pyramid pooling stage is added to capture richer semantic information from the network. This design helps the network learn better on the irregular or very small nodules in the lung nodule dataset.
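The parameter saving from swapping an atrous convolution for its separable counterpart can be illustrated with a quick count. The sketch below is a rough illustration only: the channel sizes (2048 channels out of a ResNet-101 backbone, 256 channels per pyramid branch) are assumptions typical of DeepLab-style models, not figures taken from the thesis. Since dilation changes the receptive field but not the number of weights, the comparison holds for any atrous rate.

```python
# Weight counts (ignoring biases) for one 3x3 branch of a
# spatial-pyramid-pooling module. Channel sizes are illustrative
# assumptions, not values from the thesis.
c_in, c_out, k = 2048, 256, 3

# Standard 3x3 atrous convolution: one k x k filter per (in, out) channel pair.
standard = c_in * c_out * k * k

# Atrous separable convolution:
#   depthwise part: one k x k filter per input channel
#   pointwise part: 1x1 convolution mixing channels
separable = c_in * k * k + c_in * c_out

print(f"standard : {standard:,} weights")   # 4,718,592
print(f"separable: {separable:,} weights")  # 542,720
print(f"reduction: {standard / separable:.1f}x")
```

Under these assumed sizes the separable branch uses roughly an order of magnitude fewer weights, which is consistent with the abstract's claim that the modification reduces the parameter count.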
According to the experimental results, the two modifications described above genuinely improve both the segmentation results on the small-nodule test dataset and the computation time. In addition, this research presents a patch extraction method that makes the network focus better on a lung nodule's surroundings and reduces noise from the lung background.
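The three evaluation metrics named in the abstract can be computed directly from a predicted binary mask and a ground-truth mask. The sketch below uses toy 4×4 masks invented purely for illustration; it is not the thesis's evaluation code.

```python
# Precision, recall, and IoU for binary segmentation masks.
# 1 = nodule pixel, 0 = background; these masks are toy data.
pred  = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 0]]
truth = [[0, 1, 1, 0],
         [0, 1, 1, 1],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]

tp = fp = fn = 0
for prow, trow in zip(pred, truth):
    for p, t in zip(prow, trow):
        tp += p and t          # predicted nodule, truly nodule
        fp += p and not t      # predicted nodule, actually background
        fn += (not p) and t    # missed nodule pixel

precision = tp / (tp + fp)     # how much of the prediction is correct
recall    = tp / (tp + fn)     # how much of the nodule was found
iou       = tp / (tp + fp + fn)  # intersection over union

print(precision, recall, round(iou, 3))  # → 0.8 0.8 0.667
```

Note that IoU is always bounded above by both precision and recall, which is why the abstract reports all three: a model can trade one off against the other while IoU summarizes the overlap as a whole.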