Graduate student: | 蕭人豪 Hsiao, Jen-Hao |
---|---|
Thesis title: | 利用串列輸入與預測門閥降低類神經網路卷積層能量消耗之硬體實現 Hardware Implementation of Convolutional Neural Network to Reduce Energy Consumption Using Serial Input and Prediction Threshold |
Advisor: | 卿文龍 Chin, Wen-Long |
Degree: | Master |
Department: | College of Engineering - Department of Engineering Science |
Year of publication: | 2020 |
Academic year of graduation: | 108 |
Language: | Chinese |
Number of pages: | 58 |
Keywords (translated from Chinese): | deep learning, convolutional neural network, serial input, hardware acceleration, energy efficiency |
Keywords (English): | convolutional neural network, deep learning, energy efficiency, hardware accelerator, bit-serial, serial input |
Image recognition based on convolutional neural networks (CNNs) has surpassed human performance and has achieved good results in other computer vision applications, which has made CNNs a very popular topic in the field of deep learning. With the recent popularity of portable devices and the rise of the Internet of Things (IoT) and 5G networks, the demand for hardware implementations that accelerate CNN computation has gradually increased. However, as network models become deeper, the computation cost keeps rising, which is a problem for devices with limited energy budgets. How to implement a low-power CNN hardware accelerator has therefore become a pressing issue.
This thesis takes the original parameters of AlexNet and, through software simulation, completes a fixed-point number design. It simulates a serial input convolution algorithm combined with a prediction threshold in order to reduce the amount of computation. The algorithm is also implemented as a hardware circuit and compared with the traditional parallel input method to examine the differences in energy savings, throughput, and gate count.
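The abstract does not spell out the serial input algorithm or the prediction rule, so the following Python sketch is only one plausible reading, not the thesis's actual design. It assumes unsigned fixed-point activations fed one bit-plane at a time from the most significant bit, signed integer weights, and a ReLU after the dot product. The names `to_fixed`, `serial_dot_relu`, `check_bits`, and `threshold` are hypothetical and chosen for illustration.

```python
def to_fixed(x, frac_bits=4, total_bits=8):
    """Quantize a non-negative real number to an unsigned fixed-point integer.

    Assumed format: total_bits wide with frac_bits fractional bits, saturating.
    """
    scaled = int(round(x * (1 << frac_bits)))
    return max(0, min(scaled, (1 << total_bits) - 1))


def serial_dot_relu(acts, weights, total_bits=8, check_bits=3, threshold=0):
    """MSB-first bit-serial dot product followed by ReLU.

    acts:    unsigned fixed-point integers, each total_bits wide.
    weights: signed integers.
    Returns (output, number_of_bit_planes_processed).
    """
    partial = 0
    for step, bit in enumerate(range(total_bits - 1, -1, -1), start=1):
        # Sum the weights whose activation has this bit set,
        # then scale the sum by the positional weight of the bit.
        plane = sum(w for a, w in zip(acts, weights) if (a >> bit) & 1)
        partial += plane << bit
        # Hypothetical prediction rule (an assumption, not taken from the thesis):
        # after check_bits bit-planes, a partial sum below the threshold is
        # taken as a prediction that ReLU will output zero, so the remaining
        # bit-planes are skipped.
        if step == check_bits and partial < threshold:
            return 0, step
    return max(partial, 0), total_bits


if __name__ == "__main__":
    # Strongly negative result: predicted after 2 of 4 bit-planes.
    print(serial_dot_relu([12, 1], [-3, 2], total_bits=4, check_bits=2))
    # Positive result: all 4 bit-planes are processed.
    print(serial_dot_relu([12, 1], [3, 2], total_bits=4, check_bits=2))
```

In this sketch the early exit is a heuristic: a dot product whose low-order bits would have pushed the sum back above zero is mispredicted as zero, so `check_bits` and `threshold` trade accuracy against the number of skipped bit-planes. A parallel input design would instead consume all bits of every activation in a single step, which is the baseline the abstract compares against.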