Author: 張子泓 (Chang, Tzu-Hung)
Thesis title: DiffuCE: Expert-Level CBCT Image Enhancement using a Novel Conditional Denoising Diffusion Model with Latent Alignment
Chinese title: 利用基於條件擴散模型與特徵對齊的框架進行專家級錐狀射束電腦斷層影像增強
Advisor: 蔣榮先 (Chiang, Jung-Hsien)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Medical Informatics
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar, i.e. 2023-2024)
Language: English
Number of pages: 83
Keywords: Cone-Beam Computed Tomography, Image Enhancement, Diffusion Models, Conditional Guidance, Latent Alignment

    Cone-beam computed tomography (CBCT) is a type of computed tomography (CT) imaging technology known for its speed and low radiation dose. It is widely used in dental diagnostics and image-guided radiotherapy (IGRT). IGRT involves using medical imaging to precisely locate organs before administering radiation therapy, allowing physicians to accurately target the affected area while avoiding surrounding healthy tissues, thereby achieving optimal treatment outcomes.

    Due to its rapid scanning capabilities, CBCT has become a common method in IGRT and is widely used in clinical treatments. However, CBCT imaging often suffers from streak artifacts caused by various external factors, which degrade image details and hinder accurate interpretation by physicians. Although traditional signal processing algorithms exist, their limited quality improvement has led to the development of more advanced algorithms based on machine learning and deep learning.

    CBCT image enhancement is a task within medical image enhancement, with the mainstream strategy being image-to-image translation. Generative artificial intelligence (AI) techniques, such as variational autoencoders (VAEs) and generative adversarial networks (GANs), use deep convolutional neural network architectures to perform this task. Numerous studies have applied similar quality enhancement techniques using deep convolutional networks to various medical imaging modalities, including CT, magnetic resonance imaging (MRI), and nuclear medicine, achieving promising results.

    However, the models proposed in these studies can often operate only under specific instrument parameters, which limits their generalizability. Data from different medical centers contain varied noise patterns, so these less generalizable methods usually need to be re-adapted and cannot be deployed directly. Users must therefore prepare sufficient training data and computational resources of their own to adapt the model.

    Large pre-trained generative AI models and the accompanying low-cost training strategies have emerged in recent years, outperforming models trained with traditional strategies on various benchmark tasks. Among them, diffusion models stand out in image generation. This study uses precise conditional control so that the model removes artifacts while preserving the essential physiological features of the original images, which strengthens the clinical relevance of the work. It also examines and improves on issues such as data distribution and training cost, further enhancing the model's performance and stability in both theory and practice.
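
As a rough illustration of the denoising-diffusion idea mentioned above, the following minimal sketch shows the standard forward noising step and how a noise estimate is turned back into an estimate of the clean signal. It is not the DiffuCE implementation: the function names, the linear schedule values, and the four-pixel toy "image" are all assumptions made for illustration, and the true noise is used in place of a trained conditional denoiser, which would also take a conditioning input.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed values, for illustration only)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t is the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0); also return the Gaussian noise that was used."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    xt = [scale_signal * x + scale_noise * e for x, e in zip(x0, eps)]
    return xt, eps


def estimate_x0(xt, eps_pred, alpha_bar_t):
    """Invert the forward step given a noise estimate eps_pred."""
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    return [(x - scale_noise * e) / scale_signal for x, e in zip(xt, eps_pred)]


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [0.1, 0.5, -0.3, 0.9]  # toy "image" of four pixel intensities
t = 500
xt, eps = add_noise(x0, alpha_bars[t], rng)

# The oracle noise stands in for the output of a trained (conditional) denoiser.
x0_hat = estimate_x0(xt, eps, alpha_bars[t])
```

With a perfect noise estimate the clean signal is recovered exactly (up to rounding). In practice the estimate comes from a network, and the conditioning signal is what steers it toward the anatomy of the input scan.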

    Experimental results show that the proposed method achieved the best overall performance on the private CT dataset from the Department of Radiology at Dalin Tzu Chi Hospital, and it far surpassed previous GANs on visual perception metrics in particular. On public datasets whose noise patterns differ from those of the Dalin Tzu Chi Hospital CT dataset, the model also outperformed the official baseline algorithms (Stratified and Water), demonstrating a degree of generalization capability. Ablation experiments confirmed the performance gains from the data distribution consistency and pixel-space guidance designs, with the largest improvements in distribution metrics and structural similarity.
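
For readers unfamiliar with structural similarity, the sketch below computes a simplified, single-window (global) form of the SSIM index over two flattened images. It is offered only as an illustration: the abstract does not say how the thesis computes the metric, standard SSIM averages the index over local windows, and the function name `global_ssim` and the toy intensity values are assumptions.

```python
def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over two equally sized, flattened images.

    Uses the usual stabilising constants C1 = (0.01 * L)^2 and C2 = (0.03 * L)^2,
    where L is the dynamic range of the pixel values.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy reference profile and a copy corrupted by an alternating, streak-like offset.
reference = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]
degraded = [v + (0.2 if i % 2 == 0 else -0.2) for i, v in enumerate(reference)]

score_identical = global_ssim(reference, reference)
score_degraded = global_ssim(reference, degraded)
```

An image compared with itself scores 1, and the corrupted copy scores lower, so a higher value indicates that more of the reference structure has been preserved.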

    Chinese Abstract
    Abstract
    Acknowledgements
    Content
    List of Tables
    List of Figures
    1 Introduction
      1-1. Background
      1-2. Motivation
      1-3. Research Objective
      1-4. Thesis Organization
    2 Literature Review
      2-1. Cone-Beam Computed Tomography
      2-2. Low-dose CT image noise reduction
        2-2.1 Statistical Analysis Approaches
        2-2.2 Digital Signal Processing (DSP) Techniques
      2-3. Image-to-Image Translation
        2-3.1 Advanced Image-to-Image Translation Techniques
        2-3.2 Applications in CT Image Reconstruction
      2-4. Diffusion Models
        2-4.1 Overview of Diffusion Models
        2-4.2 Advances and Applications
        2-4.3 Diffusion Models in Medical Image Processing
    3 Preliminary Study
      3-1. Pair Dataset: Artifact Generation Algorithm
      3-2. Diffusion Models
        3-2.1 Theoretical Background
        3-2.2 Efficient Fine-Tuning (LoRA)
        3-2.3 Over-Smooth with L2 Loss in Ordinary CNN-based approaches
        3-2.4 Over-Distorted in the Pre-trained Diffusion Model
        3-2.5 Experiment Results
    4 DiffuCE: Diffusion Model Framework for CBCT Image Enhancement
      4-1. Conditional Diffusion Denoiser (CDD)
        4-1.1 Module Overview
        4-1.2 Conditional Guidance
        4-1.3 Constraints Preprocessing
        4-1.4 Remaining Challenges
      4-2. Domain Bridging Encoder (DBE)
        4-2.1 Module Overview
        4-2.2 Latent Alignment
        4-2.3 Domain Gap Compensation
      4-3. Conditional Refinement Decoder (CRD)
        4-3.1 Module Overview
        4-3.2 Conditional Branch
        4-3.3 Adaptive DualScope Loss (ADL)
      4-4. Overview: DiffuCE
    5 Experiments
      5-1. Datasets
        5-1.1 Private Dataset
        5-1.2 SynthRAD2023 CBCT-to-CT Dataset
      5-2. Implementation
      5-3. Metrics
      5-4. Dataset Evaluation
        5-4.1 Private Dataset
        5-4.2 SynthRAD2023 Dataset
        5-4.3 Experts' Assessment
        5-4.4 Ablation Study
      5-5. Limitation
        5-5.1 Soft Tissue Distortion
        5-5.2 Radiation Dosage Inaccurate
    6 Conclusions and Future Work
      6-1. Conclusions
      6-2. Future Works
    References


    Full-text availability: on campus, open immediately; off campus, open immediately.