| Graduate student: | 姚恆琳 Yao, Heng-Lin |
|---|---|
| Thesis title: | 離散小波轉換取樣模型於低解析度影像復原之應用 Discrete Wavelet Transform Sampling for low resolution Image Restoration |
| Advisor: | 陳介力 Chen, Chieh-Li |
| Degree: | Master |
| Department: | College of Engineering - Department of Aeronautics & Astronautics |
| Year of publication: | 2024 |
| Academic year of graduation: | 112 |
| Language: | Chinese |
| Number of pages: | 59 |
| Keywords (Chinese): | 深度學習, 超解析度, 離散小波轉換, 輕巧 |
| Keywords (English): | Deep Learning, Super Resolution, Discrete Wavelet Transform, Lightweight |
| Usage: | Views: 44, Downloads: 0 |
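Drones play an important role in modern life, and their strategic importance in warfare is steadily growing. To let drones carry out different strategic missions more effectively, better image quality is one means of improvement alongside improvements to the drone's own maneuverability, and an AI-assisted camera is one way to raise image quality: a Super Resolution deep learning model increases the image resolution of distant targets and thereby improves the recognition success rate and accuracy for targets at the same distance. This thesis adopts Supervised Learning for this task. U-Net serves as the base architecture of the deep learning network, and the Discrete Wavelet Transform and its inverse replace conventional down-sampling and up-sampling. The convolutional layers in the encoder and decoder are replaced with Residual Groups composed of Residual Channel Attention Blocks, in which Batch Normalization is replaced with Weight Normalization and a Dropout layer is added. In addition, a Selective Kernel Feature Fusion module is added as an attention gate. The model is compared with other network models on existing super-resolution test datasets in terms of resolution quality, while the total number of trainable parameters is also considered so as to build a lightweight image super-resolution model. Finally, image recognition experiments are conducted with the designed neural network to verify its performance, so that unmanned vehicles can recognize target objects earlier, at greater distances.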
On the battlefield, drones need clear images to carry out tasks such as target detection. Clear images are hard to obtain when targets are far away, so image super-resolution is needed to keep detection results reliable. Because battlefield imagery must be both clear and available in real time, drones must also restore degraded images quickly, enabling faster target identification and tactical planning. This thesis adopts a supervised learning approach to address this problem. We employ U-Net as the base architecture of our network and enhance its encoder and decoder with the Discrete Wavelet Transform, Residual Channel Attention Blocks, Selective Kernel Feature Fusion, Weight Normalization, and Dropout. The model is evaluated on a super-resolution dataset and benchmarked against other network models, with the total number of trainable parameters taken into account in order to build a lightweight, real-time image super-resolution model. Finally, the designed network is applied to image recognition and tested in a practical scenario to validate its performance. With the proposed network, drones can obtain clearer images at the same distance, detect adversaries earlier, and thus formulate countermeasures in advance.
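The idea of using a wavelet transform in place of conventional down-sampling and up-sampling can be illustrated with a small sketch. The abstract does not say which wavelet the thesis uses, so the Haar wavelet is assumed here purely for illustration, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. The sketch shows why such a pair is attractive: the forward transform halves the spatial size while keeping all information in four sub-bands, and the inverse restores the input exactly, unlike pooling or strided convolution.

```python
def haar_dwt2(img):
    """Single-level 2D Haar DWT of an H x W image (H and W must be even).

    Returns four H/2 x W/2 sub-bands (ll, lh, hl, hh). ll is the low-pass
    approximation; the other three hold detail coefficients. Sub-band
    naming conventions differ between references.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(h // 2):
        for j in range(w // 2):
            # Each 2x2 block [[a, b], [c, d]] maps to one coefficient per sub-band.
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # local average (scaled)
            lh[i][j] = (a - b + c - d) / 2  # difference between columns
            hl[i][j] = (a + b - c - d) / 2  # difference between rows
            hh[i][j] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-resolution image."""
    h2, w2 = len(ll), len(ll[0])
    img = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for i in range(h2):
        for j in range(w2):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return img


# Usage: a 4x4 image becomes four 2x2 sub-bands and is recovered exactly.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
```

In a network of the kind described, the four sub-bands would typically be stacked along the channel axis, so a feature map with C channels becomes one with 4C channels at half the resolution; that stacking is an assumption about common practice, not a detail taken from the thesis.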
On campus: open access from 2027-08-15