
Graduate Student: 謝明翰 (Shieh, Ming-Han)
Thesis Title: 新型生成對抗神經網路系統之設計與實現 (Design and Implementation of a New Generative Adversarial Neural Network System)
Advisor: 周哲民 (Jou, Jer-Min)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Graduation Academic Year: 108
Language: Chinese
Number of Pages: 98
Chinese Keywords: Generative Adversarial Neural Network, Machine Learning, Hardware/Software Co-Design, Hardware Accelerator, SoPC
English Keywords: GAN, Machine Learning, Algorithm-Hardware Co-Design Methodology, Hardware Accelerator, System on Programmable Chip (SoPC)
    The Generative Adversarial Network (GAN) has become one of the most popular unsupervised learning methods. It consists of two core elements, a generator and a discriminator, which are built from strided convolutions (S-Conv) and fractional-strided convolutions (FS-Conv) and which operate competitively and cooperatively. Compared with traditional supervised learning, GAN training contains two trained elements and therefore has more computing modes; it requires multiple forward and backward passes and has high computational complexity. GAN's unsupervised learning also invokes more complex, non-traditional convolutions. Although these operations can still run on traditional DNN accelerators, the differences in dataflow lead to underutilization of the compute resources. We designed a new GAN system using a systematic hierarchical control/data-flow approach and an algorithm-hardware co-design methodology. We applied two levels of design space exploration (a GAN level and a CONV level) to improve computing performance, and we proposed a unified architecture with time-reuse and load-balancing designs into which the GAN learning operations are mapped efficiently. We implemented the design as an SoPC architecture on the DE2i-150 FPGA development kit. At a working frequency of 50 MHz, the GAN achieved a throughput of 0.94 GFOPS and a latency of 22508.195 ms for training one image and one noise sample.
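    As a concrete illustration of the operations described above, the following is a minimal PyTorch sketch, not taken from the thesis, of the two training elements: a DCGAN-style generator built from fractional-strided (transposed) convolutions, a discriminator built from strided convolutions, and one adversarial update showing the multiple forward and backward passes needed to train on one image and one noise sample. All class names, layer sizes, and hyperparameters here are illustrative assumptions.

```python
# Minimal GAN sketch (illustrative only; not the thesis implementation).
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a noise vector to a 32x32 image with fractional-strided (transposed) convolutions."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 128, 4, 1, 0), nn.BatchNorm2d(128), nn.ReLU(True),  # 1x1 -> 4x4
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),      # 4x4 -> 8x8
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),       # 8x8 -> 16x16
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh(),                                # 16x16 -> 32x32
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Scores a 32x32 image as real or fake with strided convolutions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.LeakyReLU(0.2, True),                            # 32 -> 16
            nn.Conv2d(32, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.LeakyReLU(0.2, True),       # 16 -> 8
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, True),     # 8 -> 4
            nn.Conv2d(128, 1, 4, 1, 0),                                                    # 4 -> 1 logit
        )

    def forward(self, x):
        return self.net(x).view(-1)

def train_step(G, D, opt_g, opt_d, real, z, loss_fn=nn.BCEWithLogitsLoss()):
    """One adversarial update: discriminator passes on real and fake data, then a generator pass."""
    # Discriminator update: forward/backward on one real image and one generated image.
    opt_d.zero_grad()
    real_loss = loss_fn(D(real), torch.ones(real.size(0)))
    fake_loss = loss_fn(D(G(z).detach()), torch.zeros(z.size(0)))
    d_loss = real_loss + fake_loss
    d_loss.backward()
    opt_d.step()
    # Generator update: forward through G and D, backward into G only.
    opt_g.zero_grad()
    g_loss = loss_fn(D(G(z)), torch.ones(z.size(0)))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

if __name__ == "__main__":
    G, D = Generator(), Discriminator()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
    real = torch.randn(1, 1, 32, 32)   # stand-in for one training image
    z = torch.randn(1, 100, 1, 1)      # one noise sample
    print(train_step(G, D, opt_g, opt_d, real, z))
```

    A single call to train_step runs several forward and backward passes (two through the discriminator, one through the generator and discriminator together), which is the workload pattern the accelerator design targets.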

    Table of Contents:
    Abstract (in Chinese)
    Design and Implementation of a New Generative Adversarial Neural Network System (Extended Summary)
        Summary
        Our Proposed Design
        Experiments
        Conclusion
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1  Introduction
        1.1  Research Background
        1.2  Research Motivation and Objectives
        1.3  Thesis Organization
    Chapter 2  Background and Related Work
        2.1  Generative Adversarial Network (GAN)
        2.2  Deep Convolutional GAN
        2.3  Wasserstein GAN
        2.4  Review of GAN Accelerator Literature
    Chapter 3  GAN Design Space Exploration
        3.1  Integrated Design of Forward and Backward Propagation for the GAN
        3.2  Integrated Analysis and Design of a Unified Computing Architecture for the GAN
    Chapter 4  GAN System Design
        4.1  GAN IP Hardware Accelerator Design
        4.2  SoPC System Design
    Chapter 5  Experimental Environment and Results Analysis
        5.1  Development Platform
        5.2  Building the GAN with PyTorch
        5.3  Implementing the GAN Algorithm in MATLAB
        5.4  C-Based Execution-Time Simulation of GAN Operations
        5.5  SystemVerilog Implementation of the GAN IP, FPGA Verification, and Results
        5.6  Chip Front-End Design and Results
    Chapter 6  Conclusion and Future Work
    References


    On campus: available to the public from 2024-02-25
    Off campus: not available to the public
    The electronic thesis has not been authorized for public release; please consult the library catalog for the print copy.