
Graduate Student: Chen, Yi-Ting (陳奕廷)
Thesis Title: Lightweight High-Frequency Enhancing Network for Image Super-Resolution
(輕量化高頻強化網路之圖像超解析度演算法)
Advisor: Kuo, Chih-Hung (郭致宏)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2023
Academic Year of Graduation: 111 (2022–2023)
Language: Chinese
Number of Pages: 55
Keywords: Image super-resolution, Deep learning, Lightweight convolutional neural network, Transformer
  • Abstract (translated from the Chinese): Most convolutional-neural-network-based super-resolution methods achieve better reconstruction performance at the cost of a large number of parameters and heavy computation. Because of storage and compute limitations, these methods cannot be deployed directly on edge devices. To build a competitive lightweight super-resolution architecture, we construct the High-Frequency Enhancing Network (HFEN) from the Central Pixel Difference Convolution (CPDC), Swin Transformer blocks, and the Discrete Wavelet Transform (DWT). The wavelet transform decomposes the input image into subbands of different frequency ranges, and we allocate more parameters to the high-frequency subbands, forcing the network to focus on generating high-frequency features. We also pair convolutions with Swin Transformer blocks to extract both local and non-local features. Compared with previous methods, the proposed approach reduces multiply-accumulate operations by 35% while achieving better reconstructed-image quality.

    Most super-resolution techniques based on convolutional neural networks pursue high performance at the cost of a large number of parameters. These methods cannot be directly implemented on mobile or edge devices due to limited storage and computing capacity. To obtain a lightweight super-resolution model, we integrate the Central Pixel Difference Convolution (CPDC) with the Swin Transformer and the discrete wavelet transform (DWT) to construct the High-Frequency Enhancing Network (HFEN). We apply the DWT to decompose the input image into different frequency bands, making the network focus more on generating high-frequency information. We also combine convolutional operations with the Swin Transformer block to extract features that capture both local and non-local information. Compared with state-of-the-art methods, the proposed approach achieves better performance while reducing computational cost (multiply-accumulate operations) by 35%.
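    The Central Pixel Difference Convolution described above convolves pixel differences taken relative to the window center rather than raw intensities. A minimal NumPy sketch of that idea follows; this is an illustration of the central-difference formulation from the pixel-difference-convolution literature, not the thesis implementation, and the kernel and input below are arbitrary examples. Since y(p0) = Σ w(pn)·(x(p0+pn) − x(p0)), the operator reduces to a plain convolution minus the center pixel scaled by the kernel sum:

```python
import numpy as np

def conv2d_valid(x, w):
    """Plain 'valid'-mode 2-D correlation with a 3x3 kernel."""
    h, wd = x.shape
    out = np.zeros((h - 2, wd - 2))
    for i in range(h - 2):
        for j in range(wd - 2):
            out[i, j] = np.sum(x[i:i + 3, j:j + 3] * w)
    return out

def cpdc(x, w):
    """Central pixel difference convolution:
    y(p0) = sum_pn w(pn) * (x(p0 + pn) - x(p0))
          = conv(x, w)(p0) - x(p0) * sum(w).
    """
    return conv2d_valid(x, w) - x[1:-1, 1:-1] * w.sum()

w = np.ones((3, 3)) / 9.0      # example averaging kernel
flat = np.full((5, 5), 7.0)    # constant (pure low-frequency) input
print(cpdc(flat, w))           # all zeros: low frequencies are suppressed
```

On constant (and, with a symmetric kernel, even linearly varying) inputs the response is zero, so the operator behaves as a high-pass filter, consistent with HFEN's goal of emphasizing high-frequency features.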

    Table of Contents
      Chinese Abstract
      Table of Contents
      List of Figures
      List of Tables
      Chapter 1  Introduction
        1-1 Preface
        1-2 Motivation
        1-3 Contributions
        1-4 Thesis Organization
      Chapter 2  Background
        2-1 Super-Resolution
          2-1-1 Single Image Super-Resolution (SISR)
        2-2 Deep Learning
          2-2-1 Artificial Neural Networks
          2-2-2 Deep Neural Networks
          2-2-3 Back-Propagation
          2-2-4 Convolutional Neural Networks
          2-2-5 Attention Mechanism
      Chapter 3  Review of Deep-Learning Super-Resolution Techniques
        3-1 Classic Image Super-Resolution Algorithms
          3-1-1 CNN-Based Image Super-Resolution
          3-1-2 Enhanced Deep Residual Network-Based Image Super-Resolution
          3-1-3 Channel Attention-Based Image Super-Resolution
        3-2 Lightweight Image Super-Resolution Algorithms
          3-2-1 Fast CNN-Based Super-Resolution
          3-2-2 Channel-Splitting Image Super-Resolution
          3-2-3 Pixel Attention-Based Image Super-Resolution
          3-2-4 Self-Calibrated Feature Fusion Image Super-Resolution
        3-3 Comparison of Related Super-Resolution Methods
      Chapter 4  Lightweight High-Frequency Enhancing Network for Image Super-Resolution
        4-1 High-Frequency Enhancing Network (HFEN) Architecture
        4-2 Hybrid Module (HM)
          4-2-1 Modified Self-Calibrated Block with Pixel Attention (M-SCPA)
          4-2-2 Swin Transformer Block (STB)
        4-3 Reconstruction Module
        4-4 Loss Function
      Chapter 5  Experiments and Analysis
        5-1 Datasets
        5-2 Image Quality Assessment
        5-3 Implementation Details
        5-4 Architecture Analysis
          5-4-1 Effectiveness of Different Convolution Operators for Super-Resolution
          5-4-2 Computational Cost Analysis
          5-4-3 Effectiveness of Each Component of HFEN
          5-4-4 Effect of the Number of Hybrid Modules in Different Branches
          5-4-5 Effectiveness of Different Wavelet-Transform Loss Terms
        5-5 Reconstruction Results and Comparisons
      Chapter 6  Conclusion and Future Work
        6-1 Conclusion
        6-2 Future Work
      References
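    Chapter 4 and the loss-term ablation in Section 5-4-5 build on the wavelet subband split described in the abstract. As a minimal illustration (not the thesis code, and with one common choice of sign and normalization conventions), a one-level 2-D Haar DWT splits an image into one low-frequency subband (LL) and three high-frequency detail subbands (LH, HL, HH), each at half the input resolution:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT: split an image into a low-frequency
    subband (LL) and three high-frequency subbands (LH, HL, HH),
    each half the spatial size of the input."""
    a = img[0::2, 0::2]  # even rows, even cols
    b = img[0::2, 1::2]  # even rows, odd cols
    c = img[1::2, 0::2]  # odd rows,  even cols
    d = img[1::2, 1::2]  # odd rows,  odd cols
    ll = (a + b + c + d) / 2.0  # approximation (low frequency)
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
print(ll.shape)  # each subband is (2, 2)
```

The transform is invertible (e.g. the even-row/even-column samples are recovered as (LL + LH + HL + HH) / 2), so a network can devote separate branches, with different capacities, to the low- and high-frequency subbands and merge them back at reconstruction time.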

    [1] C. Dong, C. C. Loy, K. He, and X. Tang, “Learning a Deep Convolutional Network for Image Super-Resolution,” in European Conference on Computer Vision, pp. 184–199, Springer, 2014.
    [2] B. Lim, S. Son, H. Kim, S. Nah, and K. Mu Lee, “Enhanced Deep Residual Networks for Single Image Super-Resolution,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 136–144, 2017.
    [3] Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, and Y. Fu, “Image Super-Resolution Using Very Deep Residual Channel Attention Networks,” in Proceedings of the European Conference on Computer Vision (ECCV), pp. 286–301, 2018.
    [4] Z. Hui, X. Wang, and X. Gao, “Fast and Accurate Single Image Super-Resolution via Information Distillation Network,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 723–731, 2018.
    [5] H. Zhao, X. Kong, J. He, Y. Qiao, and C. Dong, “Efficient Image Super-Resolution Using Pixel Attention,” in European Conference on Computer Vision, pp. 56–72, Springer, 2020.
    [6] Z. Yu, B. Zhou, J. Wan, P. Wang, H. Chen, X. Liu, S. Z. Li, and G. Zhao, “Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition,” IEEE Transactions on Image Processing, vol. 30, pp. 5626–5640, 2021.
    [7] Z. Yu, C. Zhao, Z. Wang, Y. Qin, Z. Su, X. Li, F. Zhou, and G. Zhao, “Searching Central Difference Convolutional Networks for Face Anti-Spoofing,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5295–5305, 2020.
    [8] Z. Yu, J. Wan, Y. Qin, X. Li, S. Z. Li, and G. Zhao, “NAS-FAS: Static-Dynamic Central Difference Network Search for Face Anti-Spoofing,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 9, pp. 3005–3023, 2020.
    [9] Z. Su, W. Liu, Z. Yu, D. Hu, Q. Liao, Q. Tian, M. Pietikäinen, and L. Liu, “Pixel Difference Networks for Efficient Edge Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5117–5127, 2021.
    [10] J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image Super-Resolution Via Sparse Representation,” IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2861–2873, 2010.
    [11] F. Rosenblatt, “The perceptron: A probabilistic model for information storage and organization in the brain,” Psychological Review, vol. 65, no. 6, p. 386, 1958.
    [12] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, no. 6088, pp. 533–536, 1986.
    [13] G. E. Hinton, “A Practical Guide to Training Restricted Boltzmann Machines,” in Neural networks: Tricks of the trade, pp. 599–619, Springer, 2012.
    [14] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” Advances in Neural Information Processing Systems, vol. 25, pp. 1097–1105, 2012.
    [15] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
    [16] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention Is All You Need,” Advances in neural information processing systems, vol. 30, 2017.
    [17] J. Hu, L. Shen, and G. Sun, “Squeeze-and-Excitation Networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141, 2018.
    [18] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, “CBAM: Convolutional Block Attention Module,” in Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19, 2018.
    [19] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
    [20] W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang, “Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1874–1883, 2016.
    [21] C. Dong, C. C. Loy, and X. Tang, “Accelerating the Super-Resolution Convolutional Neural Network,” in European Conference on Computer Vision, pp. 391–407, Springer, 2016.
    [22] C. Tan, S. Cheng, and L. Wang, “Efficient Image Super-Resolution via Self-Calibrated Feature Fuse,” Sensors, vol. 22, no. 1, p. 329, 2022.
    [23] J. Fang, H. Lin, X. Chen, and K. Zeng, “A Hybrid Network of CNN and Transformer for Lightweight Image Super-Resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1103–1112, 2022.
    [24] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin Transformer: Hierarchical Vision Transformer using Shifted Windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022, 2021.
    [25] W. Zou, L. Chen, Y. Wu, Y. Zhang, Y. Xu, and J. Shao, “Joint wavelet subbands guided network for single image super-resolution,” IEEE Transactions on Multimedia, 2022.
    [26] E. Agustsson and R. Timofte, “NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 126–135, 2017.
    [27] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel, “Low-Complexity Single-Image Super-Resolution based on Nonnegative Neighbor Embedding,” 2012.
    [28] R. Zeyde, M. Elad, and M. Protter, “On Single Image Scale-Up Using Sparse-Representations,” in International Conference on Curves and Surfaces, pp. 711–730, Springer, 2010.
    [29] D. Martin, C. Fowlkes, D. Tal, and J. Malik, “A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics,” in Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, vol. 2, pp. 416–423, IEEE, 2001.
    [30] J.-B. Huang, A. Singh, and N. Ahuja, “Single Image Super-Resolution from Transformed Self-Exemplars,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5197–5206, 2015.
    [31] Y. Matsui, K. Ito, Y. Aramaki, A. Fujimoto, T. Ogawa, T. Yamasaki, and K. Aizawa, “Sketch-based Manga Retrieval using Manga109 Dataset,” Multimedia Tools and Applications, vol. 76, no. 20, pp. 21811–21838, 2017.
    [32] D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” arXiv preprint arXiv:1412.6980, 2014.
    [33] I. Loshchilov and F. Hutter, “SGDR: Stochastic Gradient Descent with Warm Restarts,” arXiv preprint arXiv:1608.03983, 2016.
    [34] J. Kim, J. K. Lee, and K. M. Lee, “Accurate Image Super-Resolution Using Very Deep Convolutional Networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1646–1654, 2016.
    [35] J. Kim, J. K. Lee, and K. M. Lee, “Deeply-Recursive Convolutional Network for Image Super-Resolution,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1637–1645, 2016.
    [36] W.-S. Lai, J.-B. Huang, N. Ahuja, and M.-H. Yang, “Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 624–632, 2017.
    [37] Y. Tai, J. Yang, and X. Liu, “Image Super-Resolution via Deep Recursive Residual Network,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3147–3155, 2017.
    [38] Y. Tai, J. Yang, X. Liu, and C. Xu, “MemNet: A Persistent Memory Network for Image Restoration,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 4539–4547, 2017.
    [39] N. Ahn, B. Kang, and K.-A. Sohn, “Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network,” in Proceedings of the European Conference on Computer Vision (ECCV), pp. 252–268, 2018.
    [40] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al., “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4681–4690, 2017.
    [41] X. Ding, X. Zhang, N. Ma, J. Han, G. Ding, and J. Sun, “RepVGG: Making VGG-style ConvNets Great Again,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13733–13742, 2021.

    Full text available (on campus): 2025-03-16
    Full text available (off campus): 2025-03-16