| Graduate Student: | Chen, Yi-Ting (陳奕廷) |
|---|---|
| Thesis Title: | Lightweight High-Frequency Enhancing Network for Image Super-Resolution (輕量化高頻強化網路之圖像超解析度演算法) |
| Advisor: | Kuo, Chih-Hung (郭致宏) |
| Degree: | Master |
| Department: | Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Publication Year: | 2023 |
| Graduation Academic Year: | 111 (ROC calendar) |
| Language: | Chinese |
| Pages: | 55 |
| Keywords: | Image super-resolution, deep learning, lightweight convolutional neural network, Transformer |
Abstract:

Most super-resolution methods based on convolutional neural networks pursue better reconstruction performance at the cost of large parameter counts and heavy computation. Because of limited storage and computing capacity, such methods cannot be deployed directly on mobile or edge devices. To build a competitive lightweight super-resolution architecture, we combine Central Pixel Difference Convolution (CPDC), Swin Transformer blocks, and the Discrete Wavelet Transform (DWT) to construct the High-Frequency Enhancing Network (HFEN). The DWT decomposes the input image into sub-bands of different frequency ranges, and we allocate more parameters to the high-frequency sub-bands, forcing the network to concentrate on generating high-frequency features. We also pair convolutional operations with Swin Transformer blocks to extract both local and non-local features. Compared with previous methods, the proposed approach reduces multiply-accumulate operations by 35% while achieving better reconstruction quality.
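As a rough illustration of the two building blocks named above, here is a minimal PyTorch sketch, not the thesis's actual code: a one-level Haar DWT that splits an image into one low-frequency and three high-frequency sub-bands, and a central pixel difference convolution in the standard formulation from the pixel-difference-convolution literature. The class names, channel widths, and the `theta` weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HaarDWT(nn.Module):
    """One-level 2-D Haar DWT: splits an image into LL, LH, HL, HH
    sub-bands, each at half the spatial resolution."""

    def forward(self, x):
        # Even/odd row-column sampling implements the 2x2 Haar filter bank.
        x00 = x[:, :, 0::2, 0::2]
        x01 = x[:, :, 0::2, 1::2]
        x10 = x[:, :, 1::2, 0::2]
        x11 = x[:, :, 1::2, 1::2]
        ll = (x00 + x01 + x10 + x11) / 2   # low-frequency approximation
        lh = (-x00 - x01 + x10 + x11) / 2  # horizontal detail
        hl = (-x00 + x01 - x10 + x11) / 2  # vertical detail
        hh = (x00 - x01 - x10 + x11) / 2   # diagonal detail
        return ll, lh, hl, hh


class CPDC(nn.Module):
    """Central pixel difference convolution: a vanilla convolution whose
    output is corrected by a theta-weighted central-difference term, making
    the layer sensitive to local intensity changes (high-frequency
    structure)."""

    def __init__(self, in_ch, out_ch, kernel_size=3, theta=0.7):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=kernel_size // 2)
        self.theta = theta  # illustrative default; 0 recovers a plain conv

    def forward(self, x):
        out = self.conv(x)
        # Summing each kernel spatially gives an equivalent 1x1 kernel;
        # subtracting its response at the center pixel turns the layer into
        # sum_p w(p) * (x(p) - theta * x(center)).
        kernel_sum = self.conv.weight.sum(dim=(2, 3), keepdim=True)
        out_center = F.conv2d(x, kernel_sum)
        return out - self.theta * out_center


# Usage: decompose, then feed the concatenated detail sub-bands to a
# gradient-sensitive CPDC layer.
ll, lh, hl, hh = HaarDWT()(torch.randn(1, 3, 64, 64))
high = torch.cat([lh, hl, hh], dim=1)  # 9 channels of high-frequency detail
feat = CPDC(9, 64)(high)
print(feat.shape)  # torch.Size([1, 64, 32, 32])
```

Under this decomposition the three detail sub-bands can be given a wider channel budget than the LL band, which is one way to realize the abstract's idea of spending more parameters on high frequencies.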