
Author: Sun, Yu-Wen (孫郁雯)
Title: An FPGA Implementation of the DCGAN Image Generator Using the MNIST Dataset
Advisors: Chen, Chin-Hsing (陳進興); Chang, Chih-Wen (張志文)
Degree: Master
Department: Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2025
Graduation Academic Year: 113
Language: Chinese
Pages: 86
Keywords: Field Programmable Logic Gate Array (FPGA), Generative Models, Deep Convolutional Generative Adversarial Network (DCGAN), MNIST, RS232

    With the development of technology, artificial intelligence (AI) has become an indispensable part of daily life. In recent years, generative models (GM), deep learning, and edge AI have gained significant popularity. Generative models are already widely used in image generation, audio and video synthesis, style transfer, and other fields, making them a critically important technology. This thesis adopts one of their variants, the deep convolutional generative adversarial network (DCGAN), and implements its generator on a field programmable logic gate array (FPGA) to achieve low-power, high-efficiency image generation, advancing the edge-AI goal of deploying AI models directly on devices.
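The DCGAN generator described above builds its images with transposed (fractionally strided) convolutions. As a minimal illustration, the following pure-Python sketch performs a stride-2 transposed convolution without padding, using hypothetical input and kernel values (not taken from the thesis):

```python
def transposed_conv2d(x, k, stride=2):
    """Stride-s transposed convolution of a 2D input x with kernel k
    (no padding): each input pixel 'stamps' a scaled copy of the kernel
    into the larger output map."""
    n, m = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    oh, ow = (n - 1) * stride + kh, (m - 1) * stride + kw
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(n):
        for j in range(m):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += x[i][j] * k[a][b]
    return out

# A 2x2 input upsampled to a 5x5 output with a 3x3 kernel.
y = transposed_conv2d([[1, 2], [3, 4]], [[1, 0, 1], [0, 1, 0], [1, 0, 1]])
```

Each input pixel contributes a scaled kernel to an overlapping region of the output, which is how successive generator layers grow a small latent feature map toward the 28×28 MNIST resolution.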
    The parameters of each layer of the DCGAN generator, pre-trained on the grayscale MNIST dataset, are transferred from a host computer through an RS232 serial interface to the FPGA and stored in M9K block memory and external SRAM. The FPGA takes random Gaussian noise as input and, with an SRAM controller handling parameter reads and writes, carries out the generator computation. Finally, the generated grayscale image of a random digit is sent back to the computer over the RS232 interface and visualized on screen.
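The weight-transfer step can be sketched in a few lines. The abstract does not state the on-chip number format, so this example assumes a signed 16-bit Q8.8 fixed-point encoding and little-endian byte order, both purely hypothetical choices; it only illustrates how float parameters might be packed into the byte stream sent over the RS232 link:

```python
import struct

def to_fixed_q8_8(w):
    """Quantize a float weight to signed 16-bit Q8.8 fixed point
    (8 fractional bits), saturating at the int16 range."""
    v = int(round(w * 256))
    return max(-32768, min(32767, v))

def serialize_weights(weights):
    """Pack float weights into a little-endian byte stream, one 16-bit
    word per weight, ready for transmission over a serial link."""
    return b"".join(struct.pack("<h", to_fixed_q8_8(w)) for w in weights)

payload = serialize_weights([0.5, -1.25])  # 0.5 -> 128, -1.25 -> -320
```

On the FPGA side, a UART receiver would reassemble each pair of bytes into a 16-bit word before writing it into M9K block memory or SRAM.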
    Compared with a conventional generative adversarial network (GAN), the proposed system incorporates convolutional neural network (CNN) layers to improve the quality and stability of the generated images. Validated against the predicted images produced by the software model, the hardware achieves an average PSNR of 20.52 dB and an SSIM of 0.7637, while using 26% of the FPGA logic elements, 84% of the block memory, and 2 MB of external SRAM. The results show that DCGAN computation is dominated by memory demand, with logic elements accounting for comparatively little of the budget. Because each SRAM access on the FPGA transfers only one data word, generation speed is memory-bound; nevertheless, the hardware runs approximately 2× faster than the software counterpart.
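The PSNR metric used for the validation above is straightforward to reproduce; here is a minimal sketch over 8-bit grayscale images given as flat pixel lists (SSIM, which additionally weighs luminance, contrast, and structure, is omitted for brevity). The pixel values are hypothetical, not from the thesis:

```python
import math

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-size 8-bit
    grayscale images given as flat lists of pixel values."""
    assert ref and len(ref) == len(test), "images must match in size"
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

# Hypothetical 2x2 images: software reference vs. hardware output.
sw = [118, 66, 199, 35]
hw = [120, 64, 200, 33]
score = psnr(sw, hw)  # ~43 dB for this tiny example
```

In the thesis, this comparison, averaged over the generated digit images, yields the reported 20.52 dB against the software model's output.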

    Contents
    Abstract (Chinese)
    Abstract
    Acknowledgment (Chinese)
    Acknowledgment
    Contents
    List of Tables
    List of Figures
    Chapter 1  Introduction
      1.1  Motivation and Contribution
      1.2  Thesis Outline
    Chapter 2  Background and Related Works
      2.1  Deep Convolutional Generative Adversarial Network (DCGAN)
      2.2  Compute Unified Device Architecture (CUDA)
      2.3  MATLAB Environment
      2.4  Field Programmable Logic Gate Array (FPGA)
      2.5  RS232 Serial Communication Protocol
    Chapter 3  Software Implementation of the DCGAN Model
      3.1  Model Architecture Design
        3.1.1  Transposed Convolution
        3.1.2  Batch Normalization
        3.1.3  ReLU
        3.1.4  Tanh
      3.2  Dataset and Preprocessing
      3.3  Training Process and Hyperparameter Configuration
      3.4  GPU Training Environment and Performance
      3.5  Training Results and Image Generation Examples
      3.6  Model Export and ONNX Conversion
    Chapter 4  Hardware Implementation of the DCGAN Generator
      4.1  System Design Overview
      4.2  Weight Conversion and Storage Mechanism
      4.3  Functional Module Design and Optimization
        4.3.1  TOP Module Architecture
        4.3.2  DCGAN Core Module
        4.3.3  SRAM Control and Memory Allocation
        4.3.4  FIFO
        4.3.5  UART
    Chapter 5  Experimental Results
      5.1  Experimental Environment and Setup
      5.2  Image Quality Evaluation
      5.3  Image Output Examples
      5.4  Execution Time Comparison
      5.5  FPGA Resource Utilization
      5.6  Hardware Demonstration on FPGA Board
    Chapter 6  Conclusion and Future Work
      6.1  Conclusion
      6.2  Future Work
    References


    Full-text availability: on campus, available immediately; off campus, available from 2028-08-22.
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.