
Author: Lin, Po-Chen (林柏辰)
Title: Image Super-Resolution via Deep Level Residual Network (植基於深度階級殘差網路之超高解析度影像縮放)
Advisor: Chen, Pei-Yin (陳培殷)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2018
Academic Year of Graduation: 106
Language: English
Number of Pages: 29
Keywords: super-resolution image scaling, neural network, residual learning
    With the prevalence of streaming media, consumers' demands for image quality keep rising, and display resolution has long been a target of competition among panel manufacturers. The ultra-high-resolution images of 4K and 8K displays demand far more of image rendering than today's mainstream 1080p screens, so improving image resolution has become a focus of discussion and research. Under the challenge of ultra-high resolution, however, traditional image processing techniques are showing their limits and can no longer meet the quality requirements of 4K and 8K content. In the current era of artificial intelligence and machine learning, several recent papers have reconstructed images with neural network architectures to achieve super-resolution scaling, all obtaining better reconstruction quality than traditional methods; nevertheless, they still face problems such as insufficient image quality, excessive memory footprint of the network model, and long reconstruction time. This thesis therefore proposes a convolutional neural network model with low memory usage and high reconstruction quality: a low-resolution image is fed into the network and upscaled by factors of two, four, and eight without excessive distortion, yielding good reconstruction quality.
    The deep level residual network proposed in this thesis has several characteristics: 1) two feature extraction branches apply convolutions of different kernel sizes to the image, achieving more comprehensive feature extraction; 2) residual learning is used to deepen the network, increasing its learning capacity and range while keeping training convergent, so that stacking many convolutional layers does not cause training to fail; 3) parameters are shared between the convolutional layers of the network, reducing the parameter count and thus the memory footprint.
    Experimental results show that the proposed method achieves better image scaling and reconstruction quality than the compared methods while using fewer parameters. PSNR and SSIM are used as image quality metrics.
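The abstract adopts PSNR and SSIM as image quality metrics. As a minimal illustration (not the thesis code), PSNR for 8-bit images can be computed as follows; the function name and toy images are assumptions for the example only:

```python
import numpy as np

def psnr(reference, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio between two same-sized images, in dB."""
    diff = reference.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a constant offset of 1 on an 8-bit image gives MSE = 1,
# hence PSNR = 10 * log10(255^2) ≈ 48.13 dB.
a = np.full((8, 8), 100, dtype=np.uint8)
b = a + 1
print(round(psnr(a, b), 2))  # → 48.13
```

Higher PSNR indicates a reconstruction closer to the reference; SSIM complements it by measuring structural similarity rather than pixel-wise error.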

    At a time when streaming media is prevalent, consumers' demands for image quality are increasing, and display resolution has long been a target of competition among panel manufacturers, so improving image resolution is a focus of research. Under the challenge of super-resolution, however, traditional image processing techniques have reached their limits. In the era of artificial intelligence and machine learning, recent papers have successively reconstructed images with neural network architectures to achieve super-resolution image scaling, but problems remain: the image quality is not good enough, the network model occupies too much memory, and reconstruction takes too long. This paper therefore proposes a convolutional neural network model with low memory usage and high image reconstruction quality: a low-resolution input image is upscaled by factors of two, four, and eight by the network without excessive distortion, producing good reconstruction quality.
    The deep level residual network (DLRN) proposed in this paper has three characteristics: 1) two feature extraction branches apply convolutions of different kernel sizes to the image for more comprehensive feature extraction; 2) residual learning deepens the network, increasing its learning capacity; 3) parameter sharing between convolutional layers reduces the parameter count and the memory footprint. Experimental results show that the proposed method obtains better image scaling and reconstruction quality than the compared methods while using fewer parameters. This paper uses PSNR and SSIM as image quality indicators.
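The abstract describes two of the network's ingredients: residual learning (the stacked layers predict only a correction that is added back to their input) and parameter sharing between convolutional layers. The NumPy sketch below illustrates those two ideas only; the kernel, depth, and function names are illustrative assumptions, not the thesis implementation:

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 'same' 2D convolution with zero padding (illustration only)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def shared_residual_block(x, kernel, depth=3):
    """Stack `depth` conv+ReLU layers that all reuse the SAME kernel
    (parameter sharing), then add the input back (residual learning):
    y = x + F(x), so the layers only have to learn the residual."""
    h = x
    for _ in range(depth):
        h = np.maximum(conv2d_same(h, kernel), 0.0)  # conv + ReLU
    return x + h

# Hypothetical 3x3 averaging kernel; a real model would learn these weights.
k = np.full((3, 3), 1.0 / 9.0)
x = np.random.default_rng(0).random((16, 16))
y = shared_residual_block(x, k)
print(y.shape)  # → (16, 16)
```

Because the same kernel is reused at every layer, the parameter count stays constant as depth grows, which is the memory saving the abstract refers to.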

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Contents
    Table
    Figure Captions
    Chapter 1. Introduction
        1.1 Motivation
        1.2 Background
        1.3 Organization
    Chapter 2. Related Work
        2.1 Single-Image Super-Resolution
        2.2 Residual Neural Network
            2.2.1 ResNet
            2.2.2 VDSR
        2.3 Laplacian Pyramid
    Chapter 3. Proposed Method
        3.1 Proposed Network
            3.1.1 Detail Feature Extraction Branch
            3.1.2 Local Information Feature Extraction Branch
            3.1.3 Image Reconstruction Branch
        3.2 Residual Learning
        3.3 Parameter Sharing
        3.4 Multi-scale
    Chapter 4. Experiments and Comparisons
        4.1 Datasets
        4.3 Number of Parameters
        4.4 Comparison of State-of-the-Art Models
    Chapter 5. Conclusion and Future Work
    References
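The outline lists a Multi-scale section (3.4) and a Laplacian Pyramid section (2.3), and the abstract states that the network reaches 2x, 4x, and 8x upscaling. One common way to reach those factors, in the spirit of pyramid-style approaches, is to upscale progressively by 2x per level; the sketch below uses nearest-neighbour upsampling as a stand-in for the learned upsampling a trained network would perform, and all names in it are assumptions:

```python
import numpy as np

def upsample2x(img):
    """Nearest-neighbour 2x upsampling (stand-in for learned upsampling)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def progressive_sr(lr, levels):
    """Reach 2**levels total upscaling via repeated 2x steps; a trained
    pyramid-style model would add a predicted residual at each level."""
    out = lr
    for _ in range(levels):
        out = upsample2x(out)  # + predicted_residual in a real network
    return out

lr = np.arange(16.0).reshape(4, 4)
print(progressive_sr(lr, 3).shape)  # → (32, 32), i.e. 8x in each dimension
```

Progressive 2x steps let one model serve the 2x, 4x, and 8x scales by stopping at the desired level, rather than training a separate network per scale.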


    On-campus access: released 2023-08-07
    Off-campus access: released 2023-08-07