| Graduate Student: | 姜竣嚴 Chiang, Chun-Yen |
|---|---|
| Thesis Title: | 基於生成對抗網路之具有自動偵測隨機破壞的圖像修補模型 / Automatic Detection of Random Holes for Image Inpainting based on Generative Adversarial Networks |
| Advisor: | 王明習 Wang, Ming-Shi |
| Degree: | Master |
| Department: | College of Engineering, Department of Engineering Science |
| Year of Publication: | 2019 |
| Academic Year of Graduation: | 107 |
| Language: | Chinese |
| Pages: | 61 |
| Keywords (Chinese): | 圖像修補、卷積神經網路、圖像分割、生成對抗網路 |
| Keywords (English): | Inpainting, Convolutional Neural Networks (CNN), Image Segmentation, Generative Adversarial Networks |
Image inpainting is mainly used to repair missing regions of an image or to remove unwanted objects from it. Traditional methods reconstruct the corrupted regions from the texture and line structures of the image itself, so that the repaired image is indistinguishable to the human eye from an untouched one. In recent years, deep learning techniques have made remarkable progress in image restoration. This thesis proposes an end-to-end model built on generative adversarial networks that differs from the GAN architectures used in previous inpainting studies: those methods can produce high-quality results but take the mask of the corrupted region as an additional input, and in practical applications such a mask is often hard to obtain. The proposed architecture therefore adds a network that detects the corrupted regions automatically and then completes them in a manner consistent with the image semantics, so users need no extra pre-processing of the corrupted regions, while the inpainting quality is also improved. Moreover, unlike other architectures, the proposed one does not lose inpainting performance as the network deepens. Experimental results show that it matches other studies on small corrupted regions and delivers higher-quality results than they do on large, contiguous corruption. In addition, the model can also perform image extension, iteratively generating content beyond the image boundary.
Image inpainting is mainly used to repair the missing parts of an image or to remove unwanted objects from it. In the past, common inpainting methods repaired or reconstructed the corrupted area from the texture and line structure within the image itself, so that a human viewer could not tell whether the image had been repaired. In recent years, deep learning methods have made significant progress in image restoration. In this study, an end-to-end model built on generative adversarial networks is proposed for image inpainting. It differs from the GAN architectures used in previous inpainting work: higher-quality inpainting models are usually conditioned on both the corrupted image and the mask of the corrupted area, but in some situations the mask is not easy to obtain. Hence the architecture proposed in this study adds the ability to detect the mask automatically, and it also improves the quality of the result. With the mask detected automatically, the corrupted areas are completed in accordance with the image semantics, so the user does not need to perform additional pre-processing of the corrupted area when using the model. Moreover, compared with other architectures, the proposed architecture does not lose inpainting performance as the network layers deepen. The experimental results show that for small corrupted areas the method performs as well as other studies, while for large, continuous corrupted areas it yields higher-quality results. In addition, the model can also be used to perform image outpainting.
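The end-to-end flow the abstract describes (automatic mask detection followed by semantics-aware completion and compositing) can be sketched as follows. This is an illustrative NumPy mock-up, not the thesis code: `detect_mask` and `generate` are hypothetical stand-ins for the trained detection network and GAN generator, reduced here to trivial rules so the pipeline structure is visible.

```python
import numpy as np

def detect_mask(image):
    """Stand-in for the mask-detection network: flag pixels that are
    zero in every channel as 'corrupted'. The real network predicts
    this mask from learned features, so no mask input is required."""
    return (image.sum(axis=-1, keepdims=True) == 0).astype(image.dtype)

def generate(image, mask):
    """Stand-in for the GAN generator: fill holes with the mean of the
    known pixels. The real generator synthesizes content that is
    semantically consistent with the surrounding image."""
    known = image[np.broadcast_to(mask == 0, image.shape)]
    fill = known.mean() if known.size else 0.0
    return np.full_like(image, fill)

def inpaint(image):
    """End-to-end use: the user supplies only the corrupted image.
    The mask is detected automatically, the generator proposes content,
    and compositing keeps uncorrupted pixels unchanged."""
    mask = detect_mask(image)
    completed = generate(image, mask)
    return mask * completed + (1 - mask) * image
```

The final compositing line is the key design point: only pixels inside the detected mask are replaced by generated content, so the network cannot degrade regions that were never corrupted.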
On-campus access: publicly available from 2021-08-01.