| Author: | Sun, Yu-Wen (孫郁雯) |
|---|---|
| Title: | An FPGA Implementation of the DCGAN Image Generator Using the MNIST Dataset |
| Advisors: | Chen, Chin-Hsing (陳進興); Chang, Chih-Wen (張志文) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of Publication: | 2025 |
| Graduation Academic Year: | 113 (ROC calendar) |
| Language: | Chinese |
| Number of Pages: | 86 |
| Keywords: | Field Programmable Gate Array (FPGA), Generative Models, Deep Convolutional Generative Adversarial Network (DCGAN), MNIST, RS232 |
With the rapid development of technology, artificial intelligence (AI) has become an indispensable part of daily life; in recent years, generative models (GM), deep learning, and edge AI have attracted particular attention. Generative models are already widely used in image generation, audio and video synthesis, style transfer, and other fields, making them a critically important technology. This thesis adopts one GM variant, the deep convolutional generative adversarial network (DCGAN), and implements its generator on a field-programmable gate array (FPGA) to achieve low-power, high-efficiency image generation, advancing the edge-AI goal of deploying AI models directly on end devices.
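A DCGAN generator upsamples a noise vector to an image through transposed convolutions, and the layer sizes can be sanity-checked with the output-size formula from Dumoulin and Visin [5]. The chain below (7×7 to 14×14 to 28×28, with 4×4 kernels, stride 2, padding 1) is a common MNIST DCGAN configuration used here for illustration, not necessarily the exact architecture of this thesis:

```python
def deconv_out(i: int, k: int, s: int, p: int) -> int:
    """Spatial output size of a transposed convolution: o = (i - 1) * s - 2 * p + k."""
    return (i - 1) * s - 2 * p + k

# Illustrative MNIST-style upsampling chain (assumed, not taken from the thesis):
# a 7x7 feature map is doubled twice to reach the 28x28 MNIST resolution.
size = 7
for _ in range(2):
    size = deconv_out(size, k=4, s=2, p=1)
# size is now 28, the MNIST image width/height.
```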
The parameters of each layer of a DCGAN generator, pre-trained on the grayscale MNIST dataset, are transferred from a host PC to the FPGA over an RS232 serial interface and stored in M9K block memory and external SRAM. The generator takes random Gaussian noise as input, and an SRAM controller handles parameter reads and writes during computation. The FPGA then produces a grayscale image of a random digit, which is returned to the PC over RS232 and displayed on screen.
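The RS232 transfers on both ends follow standard UART framing (cf. the UART design in [16]). A minimal sketch of 8N1 framing (1 start bit, 8 data bits sent LSB-first, no parity, 1 stop bit) is shown below; the helper names are hypothetical, not taken from the thesis:

```python
def frame_byte(b: int) -> list[int]:
    """Return the 10-bit line sequence for one byte on an idle-high 8N1 UART:
    start bit (0), eight data bits LSB-first, stop bit (1)."""
    assert 0 <= b <= 0xFF
    data = [(b >> i) & 1 for i in range(8)]  # LSB first
    return [0] + data + [1]

def frame_stream(payload: bytes) -> list[int]:
    """Concatenate the frames for every byte of a payload (e.g. layer parameters)."""
    bits = []
    for b in payload:
        bits.extend(frame_byte(b))
    return bits
```

With 1 start and 1 stop bit per byte, each transferred parameter byte costs 10 bit-times on the wire, which is one reason a large parameter set takes noticeable time to load over a serial link.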
Compared with a conventional generative adversarial network (GAN), the proposed system incorporates convolutional neural network (CNN) layers to improve the quality and stability of the generated images. Validated against the images produced by the software model, the hardware output achieves an average PSNR of 20.52 dB and SSIM of 0.7637, while using 26% of the FPGA's logic elements, 84% of its block memory, and 2 MB of external SRAM. The results show that DCGAN computation is memory-bound: it demands substantial memory resources but relatively few logic elements. Because the FPGA can transfer only one word per SRAM access, generation speed is constrained; even so, the hardware implementation runs roughly 2× faster than the software counterpart, demonstrating competitive throughput with reasonable resource utilization.
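The two reported metrics can be computed as sketched below in plain Python. Note that `ssim_global` is a simplified single-window SSIM over the whole image; the standard SSIM (and presumably the figure reported in the thesis) averages the same statistic over local windows, so this sketch only illustrates the formula:

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def ssim_global(x, y, peak=255.0):
    """Single-window SSIM: luminance/contrast/structure comparison over the full image."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2  # standard stabilizing constants
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx * mx + my * my + c1) * (vx + vy + c2))
```

Identical images give infinite PSNR and an SSIM of 1.0; a black image against a white one gives 0 dB PSNR, which frames the reported 20.52 dB as a moderate pixel-level match.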
[1] P. Andreev, A. Fritzler, and D. Vetrov, “Quantization of Generative Adversarial Networks for Efficient Inference: a Methodological Study,” arXiv preprint arXiv:2108.13996v1, Aug. 31, 2021.
[2] Y. Cheng, D. Wang, P. Zhou, and T. Zhang, “Model compression and hardware acceleration for neural networks: A comprehensive survey,” Proceedings of the IEEE, vol. 108, no. 4, pp. 485–532, Apr. 2020. doi: 10.1109/JPROC.2020.2976475
[3] P. Chu, FPGA Prototyping by Verilog Examples: Xilinx Spartan-3 Version, Wiley, 2008.
[4] M. Doumet, M. Stan, M. Hall, and V. Betz, “HPIPE: High Throughput CNN Inference on FPGAs with High-Bandwidth Memory,” arXiv preprint arXiv:2408.09209v1, Aug. 17, 2024.
[5] V. Dumoulin and F. Visin, “A guide to convolution arithmetic for deep learning,” arXiv preprint arXiv:1603.07285, 2016.
[6] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” Advances in Neural Information Processing Systems 27 (NeurIPS), pp. 2672–2680, 2014.
[7] S. Hauck and A. DeHon, Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation, Morgan Kaufmann, 2007.
[8] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” Proc. Int’l Conf. on Machine Learning (ICML), pp. 448–456, 2015.
[9] Y. LeCun, C. Cortes and C. Burges, “MNIST handwritten digit database,” [Online]. Available: http://yann.lecun.com/exdb/mnist/
[10] V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” Proc. 27th Int’l Conf. on Machine Learning (ICML), pp. 807–814, 2010.
[11] NVIDIA Corporation, “CUDA C Programming Guide,” Version 12.0, 2023.
[12] The ONNX Community, “Open Neural Network Exchange (ONNX),” [Online]. Available: https://onnx.ai/
[13] A. Paszke et al., “PyTorch: An imperative style, high-performance deep learning library,” Advances in Neural Information Processing Systems (NeurIPS), pp. 8026–8037, 2019.
[14] A. Radford, L. Metz and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv preprint arXiv:1511.06434, 2015.
[15] L. Roeder, “Netron: Visualizer for neural network, deep learning and machine learning models,” [Online]. Available: https://github.com/lutzroeder/netron
[16] P. Shingare, “Design and Implementation of UART Using VHDL on FPGA,” International Journal of Management, Information Technology and Engineering, vol. 2, no. 5, pp. 57–62, May 2014.
[17] Terasic Technologies, “DE2-115 User Manual,” v.1.3.0, 2022.
[18] R. Wei, S. Xu, Q. Guo, and M. Li, “FPQVAR: Floating Point Quantization for Visual Autoregressive Model with FPGA Hardware Co-design,” arXiv preprint arXiv:2505.16335, May 2025.
[19] Y. Yu, “Implementing the generator of DCGAN on FPGA,” Bachelor’s Thesis, Metropolia University of Applied Sciences, 2018.
On-campus access: available immediately.