| Student: | 温育政 (Wen, Yu-Cheng) |
|---|---|
| Thesis Title: | 人臉任意卡通風格化之深度學習神經網路 / Deep Learning Neural Networks for Arbitrary Style Face Cartoonization |
| Advisor: | 楊家輝 (Yang, Jar-Ferr) |
| Degree: | Master |
| Department: | Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2021 |
| Graduation Academic Year: | 109 (ROC calendar) |
| Language: | English |
| Pages: | 42 |
| Keywords: | deep learning, style transfer, generative adversarial network, multi-scale path transformation, feature weighter, style extractor |
In this thesis, a neural network for cartoon-style transformation of human faces is proposed. The system adopts the cycle-consistent generative adversarial network (CycleGAN) as its main architecture, which consists of four modules: two generators and two discriminators. The CycleGAN structure overcomes the lack of paired training data in style transfer. A key concept in our system, the multi-scale path, gives the transformation process greater resolution flexibility by supplying information at different scales during training. Another important module is the feature weighter, which makes the system focus on facial regions that require intensive processing, such as the eyes and nose. For the style extractor module, we replace a deep-learning-based extractor with a direct calculation; this greatly reduces the number of training parameters while preserving the quality of the generated images, which should make the system easier to deploy on resource-limited platforms. In our experiments, evaluated by human visual comparison, the proposed system achieves good results: its outputs meet the acceptance requirements of most viewers while saving resources.
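To make the unpaired-data point concrete, the following is a minimal PyTorch sketch of the cycle-consistency signal that CycleGAN (Zhu et al., 2017) adds on top of the two discriminators' adversarial losses. The generator bodies and the names `G_ab`, `G_ba`, and `lambda_cyc` are illustrative assumptions, not the thesis's actual architecture.

```python
# Minimal sketch of CycleGAN's cycle-consistency loss (Zhu et al., 2017).
# TinyGenerator is a placeholder; real systems use ResNet/U-Net generators.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGenerator(nn.Module):
    """Placeholder image-to-image generator."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

G_ab = TinyGenerator()  # photo -> cartoon
G_ba = TinyGenerator()  # cartoon -> photo

def cycle_loss(real_a, real_b, lambda_cyc=10.0):
    """Unpaired training signal: every image must survive a round trip.
    The adversarial terms from the two discriminators would be added to
    this in a full training step."""
    rec_a = G_ba(G_ab(real_a))  # photo -> cartoon -> photo
    rec_b = G_ab(G_ba(real_b))  # cartoon -> photo -> cartoon
    return lambda_cyc * (F.l1_loss(rec_a, real_a) + F.l1_loss(rec_b, real_b))

loss = cycle_loss(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
```

Because the round-trip loss compares each image with its own reconstruction, no paired photo/cartoon examples are needed, which is exactly the property the abstract relies on.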
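The abstract describes the feature weighter only as steering the network toward regions that need intensive processing (eyes, nose, and so on). A common way to realize that behavior is a learned spatial attention mask; the sketch below assumes that interpretation, and the layer shapes are illustrative rather than the thesis's design.

```python
# Hedged sketch of a "feature weighter" as spatial attention: a 1x1 conv
# scores every spatial position, and the sigmoid weights rescale features
# so that high-scoring regions (e.g., eyes, nose) dominate later layers.
import torch
import torch.nn as nn

class FeatureWeighter(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        w = torch.sigmoid(self.score(feat))  # (N, 1, H, W), values in [0, 1]
        return feat * w                      # broadcast over channels

weighted = FeatureWeighter(32)(torch.randn(1, 32, 64, 64))
```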
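Finally, the style extractor replaces a learned network with a direct calculation. The thesis does not spell the formula out here; one standard parameter-free choice is the per-channel mean and standard deviation used by adaptive instance normalization (AdaIN, Huang & Belongie, 2017), which the sketch below assumes.

```python
# Assumed "direct calculation" for style: per-channel feature statistics,
# as in AdaIN. There are no trainable weights, so swapping this in for a
# learned style encoder removes that encoder's parameters entirely.
import torch

def style_stats(feat: torch.Tensor, eps: float = 1e-5):
    """Per-channel (mean, std) of a feature map, each shaped (N, C, 1, 1)."""
    n, c = feat.shape[:2]
    flat = feat.reshape(n, c, -1)
    mean = flat.mean(dim=2, keepdim=True).unsqueeze(-1)
    std = (flat.var(dim=2, keepdim=True) + eps).sqrt().unsqueeze(-1)
    return mean, std

def adain(content: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
    """Re-normalize content features to carry the style's statistics."""
    c_mean, c_std = style_stats(content)
    s_mean, s_std = style_stats(style)
    return s_std * (content - c_mean) / c_std + s_mean

stylized = adain(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```

A parameter-free extractor like this is consistent with the abstract's claim that the direct calculation cuts the parameter count while leaving image quality to the generators.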
Full text available on campus from 2026-07-21.