
Graduate Student: Tai, Shen-Chieh (戴勝捷)
Thesis Title: Volumetric Spatial-Slice Fusion Transformer: Advancing Image Super-Resolution for Medical Imaging
Advisors: Hsu, Chih-Chung (許志仲); Su, Pei-Fang (蘇佩芳)
Degree: Master's
Department: College of Management, Institute of Data Science
Year of Publication: 2024
Academic Year of Graduation: 112 (2023–2024)
Language: English
Pages: 44
Keywords: 3D CT super-resolution, medical imaging, deep learning, transformer architecture, spatial-slice fusion
Usage: 57 views, 5 downloads
  • Abstract (translated from the Chinese): This thesis proposes the Volumetric Spatial-Slice Fusion Transformer (VSSFT), a novel architecture designed specifically for 3D CT super-resolution. VSSFT addresses the challenge of enhancing resolution in medical volumetric data, with a particular focus on improving resolution along the slice direction of CT scans. Inspired by recent advances in hyperspectral super-resolution and transformer-based architectures, the thesis proposes a dual-perspective shallow feature extraction mechanism and a deep feature extraction mechanism, the latter combining the concepts of residual dense groups and dual attention blocks. The key innovation of VSSFT is its spatial-slice attention fusion module, which alternates between the spatial and slice dimensions to capture comprehensive 3D context. This approach effectively handles the anisotropic nature of CT volumes while exploiting both intra-slice spatial dependencies and inter-slice relationships. We evaluated VSSFT on diverse CT datasets from the Medical Segmentation Decathlon, covering the Liver, Colon, and Hepatic Vessel tasks. Our experimental results show that VSSFT achieves significant improvements over existing state-of-the-art methods, with gains of up to 4.57 dB in PSNR and 0.0071 in SSIM on liver CT scans. Ablation studies further validate the effectiveness of our spatial-slice fusion approach. While VSSFT delivers superior performance, it has more parameters than prior methods, which may increase computational demands. Future work will focus on optimizing the model architecture, exploring applications to other medical imaging modalities, investigating the impact on downstream tasks, and extending super-resolution capability to all three dimensions simultaneously. VSSFT represents a significant advance in 3D CT super-resolution, offering improved image quality with the potential to enhance diagnostic capability, improve treatment planning, and possibly reduce the need for high-dose CT scans in clinical practice.
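PSNR, the headline metric above, is straightforward to compute. The helper below is a generic numpy sketch of the standard definition, not code from the thesis; the function name `psnr` and the `data_range` parameter are illustrative choices.

```python
import numpy as np

def psnr(reference, reconstruction, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical volumes
    return 10.0 * np.log10((data_range ** 2) / mse)
```

Because PSNR is logarithmic in the mean squared error, the reported 4.57 dB gain corresponds to reducing MSE by a factor of about 10^0.457 ≈ 2.9 relative to the baseline.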

    This paper introduces the Volumetric Spatial-Slice Fusion Transformer (VSSFT), a novel architecture for 3D CT super-resolution (SR). VSSFT addresses the challenges of enhancing resolution in medical volumetric data, particularly along the slice direction of CT scans. Inspired by recent advances in hyperspectral SR and transformer-based architectures, our method incorporates dual-perspective shallow feature extraction and deep feature extraction that combines Residual Dense Groups with Dual Attention Blocks. VSSFT's key innovation is its spatial-slice attention fusion module, which alternates between the spatial and slice-wise dimensions to capture comprehensive 3D context. This approach effectively handles the anisotropic nature of CT volumes, leveraging both intra-slice spatial dependencies and inter-slice relationships. Evaluated on diverse CT datasets from the Medical Segmentation Decathlon, VSSFT demonstrates significant improvements over existing state-of-the-art methods, with gains of up to 4.57 dB in PSNR and 0.0071 in SSIM for liver CT scans. Ablation studies further validate the effectiveness of our spatial-slice fusion approach. While VSSFT shows superior performance, it has a larger parameter count than previous methods, which may increase computational demands. Future work will focus on optimizing the model architecture and extending SR capability to all three dimensions simultaneously. VSSFT represents a significant advancement in 3D CT SR, offering improved image quality that could enhance diagnostic capabilities, improve treatment planning, and potentially reduce the need for high-dose CT scans in clinical practice.
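The alternating spatial/slice attention described above can be sketched as follows. This is a minimal single-head, projection-free illustration of the alternation pattern, assuming a (depth, height, width, channels) feature volume; `spatial_slice_fusion` and its token reshaping are my assumptions, not the thesis implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # Single-head self-attention with identity Q/K/V projections (illustrative).
    # tokens: (num_tokens, channels)
    d = tokens.shape[-1]
    scores = softmax(tokens @ tokens.T / np.sqrt(d))
    return scores @ tokens

def spatial_slice_fusion(volume, num_blocks=2):
    # volume: (D, H, W, C) anisotropic CT feature map.
    D, H, W, C = volume.shape
    x = volume
    for _ in range(num_blocks):
        # Spatial step: attend over the H*W in-plane tokens of each slice,
        # capturing intra-slice spatial dependencies.
        x = np.stack([self_attention(x[d].reshape(H * W, C)).reshape(H, W, C)
                      for d in range(D)])
        # Slice step: attend over the D slice tokens at each spatial location,
        # capturing inter-slice relationships.
        flat = x.reshape(D, H * W, C)
        x = np.stack([self_attention(flat[:, p, :]) for p in range(H * W)],
                     axis=1).reshape(D, H, W, C)
    return x
```

Alternating the two axes lets every output voxel aggregate context from its full 3D neighborhood without the cost of attention over all D*H*W tokens at once, which is the usual motivation for factorized attention on anisotropic volumes.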

    Table of Contents:
    Chinese Abstract; Abstract; Acknowledgments; Table of Contents; List of Tables; List of Figures
    Chapter 1. Introduction
      1.1. Problem Statement
    Chapter 2. Related Work
      2.1. SISR for Natural Images
        2.1.1. CNN-based architectures
        2.1.2. GAN-based architectures
        2.1.3. Transformer-based architectures
      2.2. SISR for Medical Images
      2.3. SISR for Hyperspectral Images
    Chapter 3. Methodology
      3.1. Dual-Perspective Feature Extraction
      3.2. Deep Feature Extraction
        3.2.1. 3D Context Integration Module
        3.2.2. 3D Feature Refinement Module
      3.3. Volumetric Reconstruction
    Chapter 4. Experiment Results
      4.1. Dataset
      4.2. Implementation Details
        4.2.1. Data Processing and Augmentation
      4.3. Evaluation Metrics
      4.4. Comparison with State-of-the-Art Methods
      4.5. Comparison of Model Architectures
      4.6. Ablation Study
    Chapter 5. Conclusion
    References


    Full Text Availability: on campus 2025-07-25; off campus 2025-07-25