
Student: 蕭人豪 (Hsiao, Jen-Hao)
Thesis title: 利用串列輸入與預測門閥降低類神經網路卷積層能量消耗之硬體實現
Hardware Implementation of Convolutional Neural Network to Reduce Energy Consumption Using Serial Input and Prediction Threshold
Advisor: 卿文龍 (Chin, Wen-Long)
Degree: Master
Department: Department of Engineering Science, College of Engineering
Year of publication: 2020
Academic year of graduation: 108 (2019-2020)
Language: Chinese
Number of pages: 58
Chinese keywords: 深度學習 (deep learning)、卷積神經網路 (convolutional neural network)、串列輸入 (serial input)、硬體加速 (hardware acceleration)、能量效率 (energy efficiency)
English keywords: convolutional neural network, deep learning, energy efficiency, hardware accelerator, bit-serial, serial input
Abstract (translated from the Chinese):
The image-recognition capability of convolutional neural networks (CNNs) has surpassed that of humans, and CNNs have also achieved good results in other computer-vision applications, so they have become a very popular topic in deep learning. Moreover, with the recent popularity of portable devices and the rise of the Internet of Things and 5G networks, the demand for hardware implementations that accelerate CNN computation has gradually increased. However, as network models grow deeper, the computational cost keeps rising, which is bad news for devices that are prone to energy constraints. How to realize a low-power CNN hardware accelerator has therefore become a focal issue.
Starting from the original parameters of the AlexNet network, this thesis completes a fixed-point number design through software simulation and simulates a serial-input convolution algorithm combined with the idea of a prediction threshold, in order to reduce the amount of computation. The algorithm is also implemented as a hardware circuit and compared with the conventional parallel-input approach to examine the differences in energy-saving efficiency, throughput, and gate count.

Abstract (English):
Image recognition based on convolutional neural networks (CNNs) has surpassed human performance, and CNNs have also achieved good results in other computer-vision applications. Therefore, CNNs have become a very popular topic in the field of deep learning. Owing to the recent popularity of wearable devices and the rise of the Internet of Things (IoT) and 5G networks, the demand for hardware implementations that accelerate CNN computation has gradually increased. However, as network structures become deeper and deeper, the computational cost keeps increasing. This is problematic for devices with energy limitations. Therefore, how to efficiently deploy deep convolutional neural networks on such low-power devices has become an essential issue.
This thesis uses the original parameters of AlexNet to complete a fixed-point number design through software simulation and to simulate the serial-input algorithm with a prediction threshold, with the goal of reducing the amount of computation. In addition, we implement the corresponding digital circuit and compare it with the traditional parallel-input method to discuss the trade-offs among energy-saving efficiency, throughput, and gate count.
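To make the serial-input idea described in the abstract concrete, the following is a minimal Python sketch of a bit-serial dot product with an early-exit prediction threshold. It is an illustration only, based on the general description above: the constants BITS, N, and THRESHOLD, the function name serial_conv_relu, and the MSB-first early-exit rule are assumptions for this sketch, not the thesis's actual fixed-point format or optimized threshold values.

BITS = 8        # activation bit width (illustrative assumption)
N = 3           # number of most-significant bits processed before predicting (illustrative)
THRESHOLD = 0   # prediction threshold on the partial sum (illustrative)

def serial_conv_relu(activations, weights):
    """Bit-serial dot product of one receptive field, followed by ReLU.

    Activations are non-negative fixed-point integers fed MSB-first; weights are
    signed integers. After the first N bits, if the partial sum is at or below
    THRESHOLD, the remaining bits are skipped and the output is predicted to be 0.
    """
    partial = 0
    for bit in range(BITS - 1, -1, -1):              # process bit positions MSB -> LSB
        for a, w in zip(activations, weights):
            if (a >> bit) & 1:                       # add the weight, scaled, when this activation bit is set
                partial += w << bit
        if bit == BITS - N and partial <= THRESHOLD:
            return 0                                 # early exit: predict that ReLU will zero the output
    return max(partial, 0)                           # ReLU on the exact dot product

# Example with one flattened 3x3 receptive field (arbitrary values)
acts = [200, 0, 130, 64, 0, 0, 180, 32, 96]
wts  = [1, -2, 3, 0, 1, -1, 2, 0, -3]
print(serial_conv_relu(acts, wts))                   # prints 662 (no early exit for this input)

Whenever the partial sum after the first N most-significant bits already suggests that ReLU will zero the output, the low-order bits are skipped; this is the kind of computation saving the serial-input scheme targets.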

Table of contents:
Abstract (摘要)
Hardware Implementation of Convolutional Neural Network to Reduce Energy Consumption Using Serial Input and Prediction Threshold
I. INTRODUCTION
II. THE ARCHITECTURE OF CONVOLUTIONAL NEURAL NETWORK
III. PROPOSED SERIAL-INPUT ALGORITHM
IV. SOFTWARE SIMULATION RESULTS
V. HARDWARE MODULE
VI. HARDWARE SIMULATION RESULTS
VII. CONCLUSIONS AND FUTURE WORKS
VIII. REFERENCES
Acknowledgements (致謝)
Table of Contents (目錄)
List of Figures (圖目錄)
List of Tables (表目錄)
Chapter 1: Introduction
  1.1 Preface
  1.2 Research Motivation
  1.3 Literature Review
  1.4 Thesis Organization
Chapter 2: Convolutional Neural Networks
  2.1 Introduction to the CNN Architecture
    2.1.1 Convolutional Layer
    2.1.2 Activation Function
    2.1.3 Pooling Layer
    2.1.4 Fully Connected Layer
  2.2 Common CNN Models
    2.2.1 LeNet
    2.2.2 AlexNet
    2.2.3 VGG
    2.2.4 GoogLeNet
    2.2.5 ResNet
    2.2.6 Summary
Chapter 3: Serial-Input Algorithm Design for the Convolutional Layer
  3.1 Algorithm Concept
  3.2 Serial-Input Convolution Algorithm
  3.3 Prediction Threshold
  3.4 Threshold Optimization Algorithm
  3.5 Software Simulation Results and Discussion
    3.5.1 Determining N and the Threshold
    3.5.2 Performance Simulation
Chapter 4: Hardware Module
  4.1 Hardware Function Description
  4.2 System Block Diagram and Interface
  4.3 Datapath Architecture
  4.4 Control Signal Unit
Chapter 5: Comparison and Analysis of Hardware Simulation Results
  5.1 Overview of the Experimental Baseline
  5.2 Comparison and Analysis of Simulation Results
  5.3 Discussion of Hardware Drawbacks
Chapter 6: Conclusions and Future Work
References

    [1] J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, T. Darrell, and K. Saenko, “Long-term recurrent convolutional networks for visual recognition and description,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 2625-2634.
    [2] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back, “Face recognition: A convolutional neural networks approach,” IEEE Trans. Neural Netw., 1997, pp. 98-113.
    [3] C. Feichtenhofer, A. Pinz, and A. Zisserman, “Convolutional two-stream network fusion for video action recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 1933-1941.
    [4] C. Li, Y. Hou, P. Wang, and W. Li, “Joint distance maps based action recognition with convolutional neural networks,” IEEE Signal Process. Lett., 2017, pp. 624-628.
    [5] H. Nam, and B. Han, “Learning multi-domain convolutional neural networks for visual tracking,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 4293-4302.
    [6] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 770-778.
    [7] A. Krizhevsky, I. Sutskever, and G. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1106-1114.
    [8] M. Courbariaux, Y. Bengio, and J.-P. David, “BinaryConnect: Training deep neural networks with binary weights during propagations,” in Proc. NIPS, 2015, pp. 3123-3131.
    [9] F. Li and B. Liu, “Ternary weight networks,” in Proc. NIPS Workshop Efficient Methods Deep Neural Netw., 2016.
    [10] C. Zhu, S. Han, H. Mao, and W. J. Dally, “Trained ternary quantization,” in Proc. ICLR, 2017.
    [11] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, “XNOR-Net: ImageNet classification using binary convolutional neural networks,” in Proc. ECCV, 2016, pp. 525-542.
    [12] M. Courbariaux and Y. Bengio, “Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or −1.” [Online]. Available: https://arxiv.org/abs/1602.02830
    [13] I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio, “Quantized neural networks: Training neural networks with low precision weights and activations.” [Online]. Available: https://arxiv.org/abs/1609.07061
    [14] S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen, and Y. Zou, “DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients.” [Online]. Available: https://arxiv.org/abs/1606.06160
    [15] Z. Cai, X. He, J. Sun, and N. Vasconcelos, “Deep learning with low precision by halfwave Gaussian quantization,” in Proc. CVPR, 2017.
    [16] E. H. Lee, D. Miyashita, E. Chai, B. Murmann, and S. S. Wong, “LogNet: Energy-efficient neural networks using logarithmic computations,” in Proc. ICASSP, 2017, pp. 5900-5904.
    [17] S. Han, H. Mao, and W. J. Dally, “Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding,” in Proc. ICLR, 2016.
    [18] V. Sze, Y. H. Chen, T. J. Yang, and J. S. Emer, “Efficient processing of deep neural networks: A tutorial and survey,” Proc. IEEE, 2017, pp. 2295-2329.
    [19] P. Judd, J. Albericio, T. Hetherington, T. Aamodt, N. E. Jerger, R. Urtasun, and A. Moshovos, “Reduced-precision strategies for bounded memory in deep neural nets.” [Online]. Available: https://arxiv.org/abs/1511.05236
    [20] M. Horowitz, “Computing’s energy problem (and what we can do about it),” in Proc. IEEE ISSCC Dig. Tech. Papers, 2014, pp. 10-14.
    [21] P. Judd, J. Albericio, T. Hetherington, T. M. Aamodt, and A. Moshovos, “Stripes: Bit-serial deep neural network computing,” in Proc. MICRO, 2016, pp. 1-12.
    [22] L. Hsu, C. Chiu, K. Lin, H. Chou, and Y. Pu, “ESSA: An energy-aware bit-serial streaming deep convolutional neural network accelerator,” J. Syst. Archit., 2020.
    [23] 黃啟翔, “降低類神經網路卷積層計算複雜度之硬體設計” (Hardware design for reducing the computational complexity of convolutional layers in neural networks), Master's thesis, Institute of Engineering Science, National Cheng Kung University, Tainan, Taiwan, 2020.
    [24] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, 1998, pp. 2278-2324.
    [25] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. ICLR, 2015.
    [26] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proc. CVPR, 2015, pp. 1-9.
    [27] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. CVPR, 2016, pp. 770-778.
    [28] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, “ImageNet large scale visual recognition challenge,” Int. J. Comput. Vis., 2015, pp. 211-252.
