
Author: Wei, Shen-Ting (魏慎廷)
Title: A Natural Image Inpainting Algorithm by Global and Local Generative Adversarial Networks (一個利用全域與區域生成對抗網路的自然影像修補演算法)
Advisor: Tai, Shen-Chuan (戴顯權)
Degree: Master
Department: Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2018
Academic Year: 106 (ROC calendar)
Language: English
Pages: 73
Chinese keywords: 影像修補、卷積網路、自編碼、生成對抗網路、非監督式學習
English keywords: image inpainting, convolutional neural network, autoencoder, generative adversarial network, unsupervised learning
    Traditional image inpainting methods achieve good results when repairing small regions. For large and complex regions, however, the results are often unsatisfactory, because too much image information has been lost and these methods cannot generate objects that do not already exist in the image. In recent years, thanks to the strong learning capacity of deep learning and the ability of generative adversarial networks to synthesize images, large-region inpainting has made good progress, yet the restored details and textures remain poor.
    The model proposed in this thesis focuses on improving the reconstruction of details. The main architecture is an unsupervised learning framework consisting of one convolutional network and two generative adversarial networks. First, regions are cropped from the dataset images to form the training set, and the cropped content serves as the ground truth for inpainting. The training set is then fed into the convolutional network to obtain the inpainting result, and a local adversarial network and a global adversarial network assist in updating the weights and improving the result. Experimental results show that images restored by this method contain more detail and texture.
    In addition, image inpainting with deep learning usually requires training on large datasets, which consumes considerable training time and demands high-performance hardware. This thesis therefore also investigates whether good inpainting results can be achieved with a small dataset and limited hardware resources.
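The cropping step described in the abstract, where a region is cut out of each image and the removed content becomes the ground truth the network must reconstruct, can be sketched as follows. This is an illustrative sketch, not the thesis code; the function name, hole size, and masking-with-zeros convention are assumptions.

```python
import numpy as np

def make_training_pair(image, hole_size=32, rng=np.random.default_rng()):
    """Cut a random square region out of `image` (H x W x C, floats in [0, 1]).

    Returns the masked input, a boolean mask of the removed region,
    and the removed content, which serves as the inpainting ground truth.
    """
    h, w = image.shape[:2]
    top = int(rng.integers(0, h - hole_size + 1))
    left = int(rng.integers(0, w - hole_size + 1))
    # The cropped content is kept as the reconstruction target.
    target = image[top:top + hole_size, left:left + hole_size].copy()
    # The same region is zeroed out in the network input.
    masked = image.copy()
    masked[top:top + hole_size, left:left + hole_size] = 0.0
    mask = np.zeros((h, w), dtype=bool)
    mask[top:top + hole_size, left:left + hole_size] = True
    return masked, mask, target

img = np.random.rand(64, 64, 3)
masked, mask, target = make_training_pair(img)
```

The convolutional network then receives `masked` and is trained to reproduce `target` inside the masked region.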

    The traditional methods of image inpainting achieve good results when recovering small blocks. However, for large and complex regions the results are often unsatisfactory, because too much image information has been lost and these methods cannot create objects that do not already exist in the image. Recently, in light of the powerful learning ability of deep learning and the image-generation capability of the generative adversarial network (GAN), large-block image inpainting has made good progress, but the recovered details and textures remain poor.
    The model presented in this thesis is mainly aimed at improving the inpainting of details. The main structure is an unsupervised learning framework consisting of one convolutional network and two discriminators. First, regions are cropped from the dataset images to form the training set, and the cropped content serves as the ground truth for inpainting. The training set is then fed into the convolutional network to obtain the inpainting result, while a local discriminator and a global discriminator serve as auxiliary networks that update the weights and improve the result. Experimental results show that this method completes the image with more details and textures.
    In addition, image inpainting with deep learning often requires training on large datasets, which consumes substantial training time and high-performance hardware. This thesis also discusses whether good inpainting results can be achieved with a smaller dataset and limited hardware resources.
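The training objective implied by the abstract, a reconstruction loss on the completed region plus adversarial terms from the local and global discriminators, can be sketched as below. This is a minimal illustration, not the thesis's exact formulation: the λ weights, the L2 reconstruction term, and the non-saturating adversarial form are assumptions, since the record does not give the precise loss.

```python
import numpy as np

def reconstruction_loss(pred, target):
    # Mean squared error between the completed region and the ground truth.
    return float(np.mean((pred - target) ** 2))

def adversarial_loss(d_score):
    # Non-saturating generator loss -log D(G(x)); d_score is the
    # discriminator's probability (in (0, 1)) that the completion is real.
    return float(-np.log(d_score + 1e-12))

def total_loss(pred, target, d_local, d_global,
               lam_local=0.001, lam_global=0.001):
    # The local term judges only the filled region; the global term judges
    # the whole completed image. The lambda weights are illustrative.
    return (reconstruction_loss(pred, target)
            + lam_local * adversarial_loss(d_local)
            + lam_global * adversarial_loss(d_global))
```

In each training step the generator's weights are updated to decrease this total, while the two discriminators are updated separately to distinguish real patches and images from completed ones.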

    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Background and Related Works
      2.1 Traditional Approaches
        2.1.1 Patch-based Approach
        2.1.2 Data-driven Approach
      2.2 Machine Learning
        2.2.1 Principal Component Analysis
      2.3 Deep Learning
        2.3.1 Neural Network
        2.3.2 Convolutional Neural Network
      2.4 Related Works
        2.4.1 Autoencoder
        2.4.2 Generative Adversarial Networks
          2.4.2.1 Deep Convolutional Generative Adversarial Networks
          2.4.2.2 Wasserstein GAN
          2.4.2.3 Improved Training of Wasserstein GAN
        2.4.3 Context Encoders
        2.4.4 Image Inpainting for Irregular Holes Using Partial Convolutions
    Chapter 3 The Proposed Algorithm
      3.1 Reconstruction Network
      3.2 Local Discriminator and Global Discriminator
      3.3.1 Reconstruction Loss
      3.3.2 Adversarial Loss
      3.4 Training
    Chapter 4 Experimental Results
      4.1 Dataset
      4.2 Experimental Setting
      4.3 Experimental Result of Image Inpainting
    Chapter 5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    Reference


    On campus: available from 2023-08-31
    Off campus: not available
    The electronic thesis has not been authorized for public release; for the print copy, please consult the library catalog.