
Author: Cheng, Yu-Cheng (鄭宇呈)
Thesis title: An Image Deblurring Algorithm Based on Improved Generative Adversarial Network (一個基於改良式生成對抗網路的影像去模糊演算法)
Advisor: Tai, Shen-Chuan (戴顯權)
Degree: Master
Department: Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2020
Academic year of graduation: 108 (2019-2020)
Language: English
Pages: 75
Keywords: motion blur, image deblurring, generative adversarial network, deep learning
Abstract:
Motion blur is one of the most common factors degrading image quality. It commonly appears in photos taken by hand-held cameras and in low-frame-rate videos containing moving objects. Many computer vision algorithms, such as semantic segmentation and object detection, rely on visual input, so blurry images degrade their performance. This thesis proposes an image deblurring algorithm based on a generative adversarial network (GAN). The network consists of a generator and a discriminator. The generator, built on an encoder-decoder architecture, synthesizes output images that tend toward realism, while the discriminator judges whether the real image is relatively more realistic than the generated one. In addition, a hybrid loss function enables the network to produce high-quality images.
Experimental results show that the proposed approach outperforms other methods in both subjective visual quality, yielding sharper edges and more detailed textures, and objective quality metrics.
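
The hybrid loss mentioned in the abstract combines the adversarial, content, SSIM, perceptual, and dark channel terms listed under Section 3.4 of the table of contents. Below is a minimal PyTorch sketch of how such a combination might be assembled; the loss weights, the helper callables (ssim_fn, vgg_features), and the relativistic form of the adversarial term are illustrative assumptions, not the thesis's actual settings.

    import torch
    import torch.nn.functional as F

    def dark_channel(img, patch_size=15):
        # Dark channel: per-pixel minimum over RGB channels, then a local
        # minimum filter (implemented as max-pooling on the negated map).
        min_c = img.min(dim=1, keepdim=True).values
        pad = patch_size // 2
        return -F.max_pool2d(-min_c, patch_size, stride=1, padding=pad)

    def generator_hybrid_loss(restored, sharp, real_logits, fake_logits,
                              ssim_fn, vgg_features):
        # Relativistic adversarial term (assumed formulation): push the
        # restored image to look "more real" than real images on average.
        adv = F.binary_cross_entropy_with_logits(
            fake_logits - real_logits.mean(), torch.ones_like(fake_logits))
        content = F.l1_loss(restored, sharp)            # pixel-wise content loss
        ssim = 1.0 - ssim_fn(restored, sharp)           # structural similarity loss
        perceptual = F.mse_loss(vgg_features(restored), vgg_features(sharp))
        dark = F.l1_loss(dark_channel(restored), dark_channel(sharp))
        # Placeholder weights; the thesis's actual weighting is not given here.
        return adv + 100.0 * content + 10.0 * ssim + perceptual + 10.0 * dark

During training, this generator loss would be paired with a corresponding discriminator loss that scores real images as relatively more realistic than restored ones.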

Contents
List of Tables
List of Figures
Chapter 1 Introduction
Chapter 2 Background and Related Works
  2.1 Overview of Image Deblurring
  2.2 Neural Network
  2.3 Convolutional Neural Network
  2.4 Generative Adversarial Network
Chapter 3 The Proposed Algorithm
  3.1 Proposed Network Architecture
  3.2 Generator Architecture
  3.3 Discriminator Architecture
  3.4 Loss Function
    3.4.1 Adversarial loss
    3.4.2 Content loss
    3.4.3 SSIM loss
    3.4.4 Perceptual loss
    3.4.5 Dark channel loss
    3.4.6 Total loss function
Chapter 4 Experimental Results
  4.1 Experimental Dataset
  4.2 Implementation Details
  4.3 Experimental Results
  4.4 Ablation Experimental Results
  4.5 Application
Chapter 5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Future work
References


Full-text availability: on campus 2025-07-01; off campus 2025-07-01.