
Graduate Student: 陳德瑋 (Chen, De-Wei)
Thesis Title: 應用變換域訊號的雙路徑網路之圖像超解析度 (Dual Path Network with Transform Domain Signal for Image Super-Resolution)
Advisor: 郭致宏 (Kuo, Chih-Hung)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Graduation Academic Year: 108 (2019–2020)
Language: Chinese
Number of Pages: 89
Chinese Keywords: 超解析度、深度學習、卷積神經網路
English Keywords: super-resolution, deep learning, convolutional neural network
Views: 136; Downloads: 4
    Abstract: Recently, studies on single-image super-resolution using deep convolutional neural networks (DCNNs) have made far greater progress than conventional signal-processing-based methods. However, recent super-resolution models have grown ever deeper and wider, incurring heavy computation and memory costs for only small gains in performance. To address this problem, we propose a Wavelet- and Saak-transform Dual Path Network (WSDPN), which considers not only the low-resolution image but also its transform-domain information, from which it extracts features that benefit image reconstruction. The proposed network exploits the rich information drawn from the transform domain to reconstruct more accurate high-resolution images. In addition, to reap the benefits of both the residual network (ResNet) and the densely connected convolutional network (DenseNet) topologies, we adopt dual-path blocks as the basic building blocks; these blocks allow feature re-use while preserving the ability to keep exploring new features. Drawing on the extensive research into attention mechanisms, we further introduce spatial- and self-attention blocks that re-calibrate features according to the correlations between features at different layers. Because learning the mapping from low-resolution to high-resolution images is a typically ill-posed problem, we impose an additional constraint on the low-resolution data to shrink the space of possible mapping functions, forming a closed loop that provides extra supervision. Experimental results show that the proposed approach outperforms other state-of-the-art methods on widely used evaluation benchmarks.
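    As a rough illustration of the transform-domain input described above, the sketch below computes a one-level 2D Haar wavelet decomposition of a low-resolution image and stacks the four sub-bands as extra input channels for a network. This is a minimal sketch assuming PyWavelets and a grayscale input; it is not the thesis's exact preprocessing pipeline.

        # Minimal sketch (assumption: PyWavelets, grayscale input), not the
        # thesis's exact pipeline: one-level 2D Haar DWT of an LR image, with
        # the four sub-bands (LL, LH, HL, HH) stacked as input channels.
        import numpy as np
        import pywt  # PyWavelets

        def wavelet_input(lr_image):
            """lr_image: 2D array (H, W); returns a (4, H/2, W/2) sub-band stack."""
            LL, (LH, HL, HH) = pywt.dwt2(lr_image, "haar")
            return np.stack([LL, LH, HL, HH], axis=0)

        # Usage: bands = wavelet_input(lr); bands.shape == (4, H // 2, W // 2)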

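    To make the dual-path idea concrete, here is a minimal PyTorch sketch of a block that keeps a residual path (element-wise addition, enabling feature re-use) alongside a dense path (channel concatenation, enabling continued exploration of new features). The channel sizes and two-convolution body are invented for illustration; this is not the exact WSDPN block.

        # Minimal dual-path block sketch in PyTorch (channel sizes assumed).
        import torch
        import torch.nn as nn

        class DualPathBlock(nn.Module):
            def __init__(self, res_ch=64, in_dense=16, dense_inc=16):
                super().__init__()
                self.res_ch = res_ch
                in_ch = res_ch + in_dense
                self.body = nn.Sequential(
                    nn.Conv2d(in_ch, in_ch, 3, padding=1),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(in_ch, res_ch + dense_inc, 3, padding=1),
                )

            def forward(self, x):
                res, dense = x[:, :self.res_ch], x[:, self.res_ch:]
                out = self.body(x)
                # Residual path: element-wise addition re-uses features.
                res = res + out[:, :self.res_ch]
                # Dense path: concatenation keeps adding new features.
                dense = torch.cat([dense, out[:, self.res_ch:]], dim=1)
                return torch.cat([res, dense], dim=1)

        # Usage: y = DualPathBlock()(torch.randn(1, 80, 32, 32))  # -> (1, 96, 32, 32)

    The dense path widens by dense_inc channels per block, so a network stacking these blocks would pass the growing channel count to each successive block.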
    Table of Contents
    Chinese Abstract; Acknowledgments; Table of Contents; List of Tables; List of Figures
    Chapter 1: Introduction
      1-1 Foreword
      1-2 Super-Resolution
        1-2-1 Single Image Super-Resolution
      1-3 Deep Learning
        1-3-1 Artificial Neural Networks
        1-3-2 Deep Neural Networks
        1-3-3 Back-Propagation
        1-3-4 Convolutional Neural Networks
      1-4 Research Motivation
      1-5 Research Contributions
      1-6 Thesis Organization
    Chapter 2: Background and Related Work
      2-1 Deconvolutional Neural Networks
      2-2 Sub-Pixel Convolutional Neural Networks
      2-3 Literature on Super-Resolution Neural Networks
      2-4 Subspace Approximation with Augmented Kernels (Saak Transform)
    Chapter 3: Deep-Learning-Based Super-Resolution Techniques
      3-1 Super-Resolution with Pre-Upsampling
        3-1-1 Super-Resolution Based on Convolutional Neural Networks
        3-1-2 Image Super-Resolution Based on Deep Convolutional Neural Networks
        3-1-3 Image Super-Resolution Based on Deeply-Recursive Convolutional Networks
        3-1-4 Image Super-Resolution Based on Deep Recursive Residual Networks
      3-2 Super-Resolution with Single-Stage Upsampling
        3-2-1 Fast Super-Resolution Based on Convolutional Neural Networks
        3-2-2 Image Super-Resolution Based on Enhanced Deep Residual Networks
        3-2-3 Image Super-Resolution Using Dense Skip Connections
        3-2-4 Image Super-Resolution Based on Dual Path Networks
      3-3 Super-Resolution with Progressive Upsampling
        3-3-1 Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks
        3-3-2 Fully Progressive Upsampling Super-Resolution
      3-4 Super-Resolution with Iterative Up- and Down-Sampling
        3-4-1 Super-Resolution Based on Deep Back-Projection Networks
      3-5 Attention Mechanisms
      3-6 Comparison of Super-Resolution Algorithms
    Chapter 4: Image Super-Resolution with a Dual Path Network Using Transform-Domain Signals
      4-1 Architecture of the Wavelet- and Saak-Transform Dual Path Network (WSDPN)
      4-2 Transform-Domain Data
        4-2-1 Wavelet-Transform Input
        4-2-2 Saak-Transform Input
      4-3 Dual Path Block
      4-4 Feature Fusion Module (FFM)
      4-5 Network Training and Loss Functions
    Chapter 5: Experimental Setup and Analysis
      5-1 Datasets
      5-2 Network Implementation Details
      5-3 Image Quality Assessment
      5-4 Architecture Analysis
        5-4-1 Effect of Transform-Domain Inputs
        5-4-2 Effect of Path Topology on the Dual Path Block
        5-4-3 Effect of the Number of Dual Path Blocks on Reconstruction Performance
        5-4-4 Effect of Closed-Loop Training and Multi-Supervision on Reconstruction Performance
      5-5 Reconstruction Results and Comparisons
      5-6 Network Complexity
    Chapter 6: Conclusion and Future Work
      6-1 Conclusion
      6-2 Future Work
    References
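    Section 3-5 of the outline surveys attention mechanisms, and the abstract introduces spatial- and self-attention blocks. A minimal sketch of spatial attention is shown below: each spatial location is re-weighted by a learned sigmoid mask. The single 1x1-convolution scoring design is an assumption for illustration, not the thesis's exact module.

        # Minimal spatial-attention sketch (design assumed for illustration).
        import torch
        import torch.nn as nn

        class SpatialAttention(nn.Module):
            def __init__(self, channels=64):
                super().__init__()
                # 1x1 conv squeezes features to one attention score per pixel.
                self.score = nn.Conv2d(channels, 1, kernel_size=1)

            def forward(self, x):
                mask = torch.sigmoid(self.score(x))  # (N, 1, H, W) in [0, 1]
                return x * mask                      # re-weight each location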

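    Section 4-5 covers network training and loss functions. The closed-loop constraint described in the abstract can be sketched as a dual regression term: the super-resolved output is mapped back down to low resolution and penalized for deviating from the original input, which narrows the space of admissible mapping functions. The L1 losses, the learned downsampler dual_net, and the weight lam below are assumptions, not the thesis's exact formulation.

        # Minimal closed-loop (dual regression) loss sketch in PyTorch.
        # Assumptions: L1 losses, a learned HR->LR mapper `dual_net`, weight lam.
        import torch.nn.functional as F

        def closed_loop_loss(sr_net, dual_net, lr, hr, lam=0.1):
            sr = sr_net(lr)                      # LR -> HR (primal mapping)
            primal = F.l1_loss(sr, hr)           # ordinary reconstruction loss
            cycle = F.l1_loss(dual_net(sr), lr)  # HR -> LR closes the loop
            return primal + lam * cycle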

    Full-Text Availability: On campus: immediately available; Off campus: immediately available