| Graduate Student: | Shieh, Ming-Han (謝明翰) |
|---|---|
| Thesis Title: | Design and Implementation of a New Generative Adversarial Neural Network System (新型生成對抗神經網路系統之設計與實現) |
| Advisor: | Jou, Jer-Min (周哲民) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of Publication: | 2020 |
| Academic Year of Graduation: | 108 |
| Language: | Chinese |
| Number of Pages: | 98 |
| Chinese Keywords: | 生成對抗神經網路、機器學習、軟硬體共同設計、硬體加速器、SoPC |
| English Keywords: | GAN, Machine Learning, Algorithm-Hardware Co-Design Methodology, Hardware Accelerator, System on Programmable Chip (SoPC) |
The GAN (Generative Adversarial Network) has become one of the most popular unsupervised learning methods. It consists of two core elements, a generator and a discriminator, which are built from strided convolutions (S-Conv) and fractionally-strided convolutions (FS-Conv) and operate both competitively and cooperatively. Compared with traditional supervised learning, GAN training has more computing modes because two elements must be trained; it requires multiple forward and backward passes and carries high computational complexity. GAN's unsupervised learning also invokes more complex, non-traditional convolutions. Although these operations can still run on traditional DNN accelerators, the differences in dataflow lead to underutilization of the compute resources. We designed a new GAN system using a systematic hierarchical control/dataflow approach and an algorithm-hardware co-design methodology. We applied two levels of design space exploration (the GAN level and the CONV level) to improve computing performance, and we propose a unified architecture with time reuse and load balancing onto which the GAN learning operations are mapped efficiently. We implemented the design as an SoPC architecture on the DE2i-150 FPGA development kit. At a working frequency of 50 MHz, the experimental results for the GAN show a throughput of 0.94 GFOPS and a latency of 22,508.195 ms for training on one image and one noise sample.
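As a rough illustration of the workload the abstract describes, the sketch below builds a tiny generator/discriminator pair and runs one adversarial training step in PyTorch. It follows the common DCGAN convention, with fractionally-strided (transposed) convolutions in the generator and strided convolutions in the discriminator; the layer sizes, 16×16 image size, and optimizer settings are illustrative assumptions, not the thesis's hardware design.

```python
# Minimal GAN sketch (assumed layer sizes, not the thesis implementation).
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            # FS-Conv (transposed convolution) upsamples the noise vector: 1x1 -> 4x4 -> 8x8 -> 16x16.
            nn.ConvTranspose2d(z_dim, 128, 4, 1, 0), nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 1, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # S-Conv (strided convolution) downsamples the image: 16x16 -> 8x8 -> 4x4 -> 1x1.
            nn.Conv2d(1, 64, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, True),
            nn.Conv2d(128, 1, 4, 1, 0), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).view(-1)

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = torch.randn(8, 1, 16, 16)      # stand-in for one batch of real images
noise = torch.randn(8, 100, 1, 1)     # one batch of noise vectors

# Discriminator update: forward/backward on real and on generated samples.
opt_d.zero_grad()
loss_d = bce(D(real), torch.ones(8)) + bce(D(G(noise).detach()), torch.zeros(8))
loss_d.backward()
opt_d.step()

# Generator update: another forward pass through both networks, then backward.
opt_g.zero_grad()
loss_g = bce(D(G(noise)), torch.ones(8))
loss_g.backward()
opt_g.step()
```

Counting the passes in this single step (two discriminator forward passes, one generator pass, one generator-through-discriminator pass, plus the corresponding backward passes over both convolution types) makes concrete why GAN training has more computing modes than inference-only workloads, which is the workload the thesis's unified architecture is designed to map efficiently.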
On-campus access: available from 2024-02-25.