
Author: Chang, Yung-Yu (張詠裕)
Thesis Title: Multi-Domain Image-to-Image Translations based on Generative Adversarial Networks (以循環生成對抗網路實現不同場景間影像轉換)
Advisor: Wang, Ming-Shi (王明習)
Degree: Master
Department: Department of Engineering Science, College of Engineering
Year of Publication: 2018
Graduation Academic Year: 106
Language: Chinese
Number of Pages: 60
Keywords (Chinese): 影像還原 (image restoration), 除霧 (dehazing), 去雜訊 (denoising), 卷積神經網路 (convolutional neural networks), 生成式對抗網路 (generative adversarial networks)
Keywords (English): Deblurring, Dehazing, Denoising, Convolutional Neural Networks, Generative Adversarial Networks
In recent years, deep learning has made breakthrough progress, drawing wave after wave of researchers into the field, and its applications now span a remarkable range of domains. In image processing, however, most of the architectures proposed so far are designed for a single task and are trained on paired datasets of degraded images and their ground-truth references. Their results are impressive, but the dependence on ground-truth references makes it difficult to update the datasets later. At the same time, growing safety awareness has raised the demand for computer-vision assistance systems, most visibly surveillance cameras, which can record incidents and even support recognition and detection from captured frames, effectively deterring and preventing crime. Yet poor visibility caused by haze, noise from high-sensitivity night shooting, and blur from fast-moving objects can leave footage that is hard to recognize by eye, or even leads to misidentification. To address these problems, this thesis builds, on the basis of generative adversarial networks, a multi-function cycle-consistent GAN architecture for unsupervised image domain translation. It defines "degraded", "noisy", "hazy", and "restored" image domains and trains the generator modules of the architecture so that the neural network learns the mappings between multiple image domains, converting images that most viewers would consider blurry, noisy, or hazy into clear, recognizable ones, thereby improving the scalability of unsupervised image domain translation.
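For reference, a cycle-consistent translation architecture of this kind builds on the adversarial objective of [1] combined with the cycle-consistency objective of [8]. The formulation below follows those cited papers for a single pair of domains X and Y, with generators G: X→Y and F: Y→X, discriminators D_X and D_Y, and a weighting factor λ; how the thesis weights and extends these terms across its multiple domains is not restated here.

```latex
\mathcal{L}_{\mathrm{GAN}}(G, D_Y, X, Y) =
  \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\!\left[\log D_Y(y)\right]
  + \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\log\bigl(1 - D_Y(G(x))\bigr)\right]

\mathcal{L}_{\mathrm{cyc}}(G, F) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\lVert F(G(x)) - x \rVert_1\right]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\!\left[\lVert G(F(y)) - y \rVert_1\right]

\mathcal{L}(G, F, D_X, D_Y) =
  \mathcal{L}_{\mathrm{GAN}}(G, D_Y, X, Y)
  + \mathcal{L}_{\mathrm{GAN}}(F, D_X, Y, X)
  + \lambda\,\mathcal{L}_{\mathrm{cyc}}(G, F)
```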

In recent years, domain translation has seen breakthrough progress in deep learning. However, most of the approaches proposed so far are dedicated to a single task and are trained on paired datasets. Their results are significant, but the architectures lack scalability and updating the paired data later is difficult. The demand for computer vision assistance systems is increasing, and some environments require more than one task to be handled. In this thesis, we propose a multi-domain image translation model with two kinds of flexibility: the depth of the architecture can be designed according to expectations, and the number of domains can be designed according to the number of tasks. We demonstrate the effectiveness of the approach on dehazing, deblurring, and denoising tasks.
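Purely as an illustrative sketch (not the thesis's actual implementation), the following PyTorch-style code shows how one training step of an unpaired, two-domain cycle-consistent translation setup of the kind described above can be organized: adversarial losses push translated images toward the target domain, while an L1 cycle-consistency term pulls a translated-then-back-translated image toward its original. The tiny network sizes, the λ = 10 cycle weight, and the least-squares adversarial loss are assumptions made here for brevity.

```python
import torch
import torch.nn as nn

# Tiny stand-in generator/discriminator; the thesis uses deeper
# encoder-decoder generators with residual blocks (assumption here).
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(16, 1, 4, stride=2, padding=1))  # patch-wise realness scores

    def forward(self, x):
        return self.net(x)

G_xy, G_yx = Generator(), Generator()        # degraded -> clean, clean -> degraded
D_x, D_y = Discriminator(), Discriminator()  # one discriminator per domain
lam = 10.0                                    # cycle-consistency weight (assumed)
adv_loss, cyc_loss = nn.MSELoss(), nn.L1Loss()
opt_g = torch.optim.Adam(list(G_xy.parameters()) + list(G_yx.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(list(D_x.parameters()) + list(D_y.parameters()), lr=2e-4)

def train_step(real_x, real_y):
    """One unpaired step: real_x is a batch from the degraded domain,
    real_y from the clean domain; no pixel-wise correspondence is needed."""
    # Generators: fool the discriminators and preserve content via cycles.
    fake_y, fake_x = G_xy(real_x), G_yx(real_y)
    pred_fy, pred_fx = D_y(fake_y), D_x(fake_x)
    loss_g = (adv_loss(pred_fy, torch.ones_like(pred_fy))
              + adv_loss(pred_fx, torch.ones_like(pred_fx))
              + lam * (cyc_loss(G_yx(fake_y), real_x) + cyc_loss(G_xy(fake_x), real_y)))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # Discriminators: separate real samples from generated ones.
    pred_ry, pred_rx = D_y(real_y), D_x(real_x)
    pred_fy_d, pred_fx_d = D_y(fake_y.detach()), D_x(fake_x.detach())
    loss_d = (adv_loss(pred_ry, torch.ones_like(pred_ry))
              + adv_loss(pred_fy_d, torch.zeros_like(pred_fy_d))
              + adv_loss(pred_rx, torch.ones_like(pred_rx))
              + adv_loss(pred_fx_d, torch.zeros_like(pred_fx_d)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    return loss_g.item(), loss_d.item()

# Example usage with random tensors standing in for image batches in [-1, 1]:
x = torch.rand(2, 3, 64, 64) * 2 - 1
y = torch.rand(2, 3, 64, 64) * 2 - 1
print(train_step(x, y))
```

In a multi-domain setting such as the one outlined in the abstract, the same pattern extends beyond two domains by adding a generator pair (or sharing encoder weights across generators, as the thesis's Figure 3-3 suggests) and one discriminator per additional domain.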

Table of Contents
Abstract II
Acknowledgements XII
Table of Contents XIII
List of Tables XV
List of Figures XVI
Chapter 1 Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 3
1.3 Thesis Organization 3
Chapter 2 Related Background 4
2.1 The RGB Color Model 4
2.2 Neural Networks Modeling Human Vision 6
2.2.1 Neural Networks: Fully Connected Layers (FC) 6
2.2.2 Convolutional Neural Networks 8
2.3 Generative Adversarial Networks (GAN) [1] 14
2.4 Deep Convolutional Generative Adversarial Networks [2] 17
2.5 Cycle-Consistent Generative Adversarial Networks [8] 20
2.6 Translation among Multiple Domains: StarGAN 24
Chapter 3 Methodology 26
3.1 Overall Architecture 26
3.2 Generator Architecture 31
3.3 Discriminator Architecture 36
Chapter 4 Experimental Results 38
4.1 Training and Experimental Environment 38
4.2 Results 40
4.2.1 Datasets 41
4.2.2 Experimental Results 42
4.3 Discussion 55
Chapter 5 Conclusions and Future Work 56
5.1 Conclusions 56
5.2 Future Work 56
References 57

List of Tables
Table 4-1 Development environment 39
Table 4-2 Quality comparison of landscape images after each processing method 42
Table 4-3 Quality comparison of street-scene images after each processing method 42
Table 4-4 Quality comparison of car images after each processing method 45
Table 4-5 Quality comparison of pedestrian images after each processing method 45
Table 4-6 Quality comparison of city high-rise images after each processing method 48
Table 4-7 Quality comparison of portrait-in-landscape images after each processing method 48
Table 4-8 Quality comparison of degraded pedestrian images after each processing method 52
Table 4-9 Quality comparison of hazy pedestrian images (not from the dataset, no ground truth) after each processing method 52

List of Figures
Figure 2-1 The RGB color model 5
Figure 2-2 Schematic of a neural network modeling human vision 7
Figure 2-3 Neural network architecture 7
Figure 2-4 Local connectivity modeling human vision 10
Figure 2-5 Weight sharing 10
Figure 2-6 Multiple convolution kernels 10
Figure 2-7 Convolutional layer 11
Figure 2-8 tanh 13
Figure 2-9 Sigmoid function 13
Figure 2-10 ReLU 13
Figure 2-11 Comparison of activation functions 13
Figure 2-12 GAN architecture 14
Figure 2-13 GAN expressed in pseudocode [1] 15
Figure 2-14 GAN training distributions [1] 16
Figure 2-15 GAN results [1] 17
Figure 2-16 Deep convolutional generator module [2] 18
Figure 2-17 DCGAN results [2] 18
Figure 2-18 Leaky ReLU 19
Figure 2-19 Paired dataset [8] 21
Figure 2-20 Unpaired dataset [8] 21
Figure 2-21 Translating images from source domain X to target domain Y [8] 22
Figure 2-22 Results of translating an image and translating it back [8] 22
Figure 2-23 Translating images from source domain Y to target domain X [8] 23
Figure 2-24 Comparison of results [8] 23
Figure 2-25 CycleGAN results [8] 23
Figure 2-26 Comparison of ordinary domain translation and StarGAN 25
Figure 2-27 StarGAN results on CelebA 25
Figure 3-1 Data distribution schematic of the cycle-consistent GAN 27
Figure 3-2 Generator of the cycle-consistent GAN split into encoder and decoder 28
Figure 3-3 Encoder-side weight-sharing architecture of the cycle-consistent GAN 29
Figure 3-4 Multi-domain translation architecture of the cycle-consistent GAN 30
Figure 3-5 Comparison of the old and new network architectures 31
Figure 3-6 Residual block 32
Figure 3-7 Generator architecture (encoder) 34
Figure 3-8 Generator architecture (decoder) 35
Figure 3-9 Discriminator architecture 37
Figure 4-1 Comparison on landscape images 43
Figure 4-2 Comparison on street-scene images 44
Figure 4-3 Degradation and restoration comparison on car images 46
Figure 4-4 Degradation and restoration comparison on pedestrian images 47
Figure 4-5 Dehazing comparison on city high-rise images 49
Figure 4-6 Dehazing comparison on portrait-in-landscape images 50
Figure 4-7 YOLO-based comparison of degraded and restored pedestrian images 53
Figure 4-8 YOLO-based dehazing comparison on pedestrian images 54

[1] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets", Advances in Neural Information Processing Systems (NIPS), 8-13 December, Montréal, Canada, pp. 2672-2680, 2014.
[2] A. Radford, L. Metz, and S. Chintala, "Unsupervised representation learning with deep convolutional generative adversarial networks", International Conference on Learning Representations (ICLR), 2-4 May, San Juan, Puerto Rico, 2016.
[3] M.-Y. Liu and O. Tuzel, "Coupled generative adversarial networks", Advances in Neural Information Processing Systems (NIPS), 5-10 December, Barcelona, Spain, pp. 469-477, 2016.
[4] A. Shrivastava, T. Pfister, O. Tuzel, J. Susskind, W. Wang, and R. Webb, "Learning from simulated and unsupervised images through adversarial training", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July, Honolulu, Hawaii, pp. 2107-2116, 2017.
[5] J. Donahue, P. Krähenbühl, and T. Darrell, "Adversarial feature learning", International Conference on Learning Representations (ICLR), 24-26 April, Toulon, France, 2017.
[6] V. Dumoulin, I. Belghazi, B. Poole, A. Lamb, M. Arjovsky, O. Mastropietro, and A. Courville, "Adversarially Learned Inference", arXiv preprint arXiv:1606.00704, 2016.
[7] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-Image Translation with Conditional Adversarial Networks", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July, Honolulu, Hawaii, pp. 5967-5976, 2017.
[8] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", The IEEE International Conference on Computer Vision (ICCV), 22-29 October, Venice, Italy, pp. 2223-2232, 2017.
[9] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks", Science, Vol. 313, pp. 504-507, 2006.
[10] M. Arjovsky, S. Chintala, and L. Bottou, "Wasserstein Generative Adversarial Networks", Proceedings of the International Conference on Machine Learning (ICML), 6-11 August, Sydney, Australia, pp. 214-223, 2017.
[11] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, "Improved Training of Wasserstein GANs", Advances in Neural Information Processing Systems (NIPS), 4-9 December, Long Beach, California, 2017.
[12] Y. Choi, M. Choi, M. Kim, J.-W. Ha, S. Kim, and J. Choo, "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation", arXiv preprint arXiv:1711.09020, 2017.
[13] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation", Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), 5-9 October, Munich, Germany, pp. 234-241, 2015.
[14] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 26 June-1 July, Las Vegas, Nevada, pp. 770-778, 2016.
[15] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July, Honolulu, Hawaii, pp. 2261-2269, 2017.
[16] J. Sun, W. Cao, Z. Xu, and J. Ponce, "Learning a convolutional neural network for non-uniform motion blur removal", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 7-12 June, Boston, Massachusetts, pp. 769-777, 2015.
[17] L. Xu, J. S. Ren, C. Liu, and J. Jia, "Deep convolutional neural network for image deconvolution", Advances in Neural Information Processing Systems (NIPS), 8-13 December, Montréal, Canada, pp. 1790-1798, 2014.
[18] S. Nah, T. H. Kim, and K. M. Lee, "Deep multi-scale convolutional neural network for dynamic scene deblurring", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July, Honolulu, Hawaii, 2017.
[19] M. Noroozi, P. Chandramouli, and P. Favaro, "Motion Deblurring in the Wild", German Conference on Pattern Recognition (GCPR), 13-15 September, Basel, Switzerland, pp. 65-77, 2017.
[20] A. Chakrabarti, "A neural approach to blind motion deblurring", Proceedings of the European Conference on Computer Vision (ECCV), 8-16 October, Amsterdam, The Netherlands, pp. 221-235, 2016.
[21] G. Boracchi and A. Foi, "Modeling the performance of image restoration from motion blur", IEEE Transactions on Image Processing, Vol. 21, No. 8, pp. 3502-3517, 2012.
[22] O. Kupyn, V. Budzan, M. Mykhailych, D. Mishkin, and J. Matas, "DeblurGAN: Blind motion deblurring using conditional adversarial networks", arXiv preprint arXiv:1711.07064, 2017.
[23] B. Li, W. Ren, D. Fu, D. Tao, D. Feng, W. Zeng, and Z. Wang, "RESIDE: A Benchmark for Single Image Dehazing", arXiv preprint arXiv:1712.04143, 2017.
[24] M. Brown and S. Süsstrunk, "Multi-spectral SIFT for scene category recognition", Computer Vision and Pattern Recognition (CVPR), 20-25 June, Colorado Springs, Colorado, pp. 177-184, 2011.
[25] J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger", Computer Vision and Pattern Recognition (CVPR), 21-26 July, Honolulu, Hawaii, pp. 6517-6525, 2017.
[26] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement", arXiv preprint arXiv:1804.02767, 2018.
[27] B. Cai, X. Xu, K. Jia, C. Qing, and D. Tao, "DehazeNet: An end-to-end system for single image haze removal", IEEE Transactions on Image Processing, Vol. 25, No. 11, pp. 5187-5198, 2016.

Full-text availability: On campus: immediately available; Off campus: immediately available