| Author: | 黃聖傑 (Huang, Sheng-Jie) |
|---|---|
| Title: | 多神經網路動態執行加速器之設計 (Dynamic Execution Accelerator Design for Multi-NNs) |
| Advisor: | 周哲民 (Jou, Jer-Min) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of Publication: | 2022 |
| Academic Year: | 110 |
| Language: | Chinese |
| Pages: | 54 |
| Keywords: | Multi-Neural Networks, Machine Learning, Dynamic Instruction Scheduling, Accelerator |
Multi-neural-network execution has become one of the most important design directions for neural network accelerators. For example, a cloud service provider may design a single heterogeneous artificial intelligence (AI) accelerator that provides a neural network training environment to a large number of users, thereby improving cost-effectiveness. An ideal AI accelerator should therefore be able to execute a variety of neural networks while making full use of the hardware resources on the accelerator. However, there may be essential differences among neural network models, mainly because different models inherently have different parameters and computation patterns.
In this thesis, we propose the design of a multi-neural-network dynamic execution accelerator, which converts the operations of each layer of a neural network model into instructions and then uses those instructions as the accelerator's execution basis. In hardware, we implement a dynamic instruction-execution controller in which the priority among instructions is determined with reference to memory resources, thereby mitigating the memory bandwidth problem caused by the large data transfers of neural networks. Finally, experiments show that our multi-neural-network dynamic execution accelerator can lower different neural network models into instructions and realize dynamic scheduling that determines the priority order of instruction execution.
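The scheduling idea in the abstract can be illustrated with a minimal sketch: each layer of each network is lowered to an "instruction", and the dynamic controller issues ready instructions in an order keyed by their memory demand (one plausible reading of "priority with reference to memory resources"). All class and instruction names below are hypothetical illustrations, not definitions from the thesis.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical model of the controller's priority rule: among ready
# instructions, issue the one with the smallest memory footprint first,
# so large transfers do not monopolize memory bandwidth. This is an
# assumed policy for illustration only.

@dataclass(order=True)
class Instruction:
    mem_bytes: int                    # memory traffic this instruction needs
    name: str = field(compare=False)  # layer identifier (illustrative)

def schedule(ready: list[Instruction]) -> list[str]:
    """Issue ready instructions in ascending memory-footprint order."""
    heapq.heapify(ready)  # min-heap keyed on mem_bytes
    order = []
    while ready:
        order.append(heapq.heappop(ready).name)
    return order

# Instructions from two different networks share one ready pool.
ready = [
    Instruction(4096, "netA.conv1"),
    Instruction(512,  "netB.conv3"),
    Instruction(2048, "netA.fc5"),
]
print(schedule(ready))  # ['netB.conv3', 'netA.fc5', 'netA.conv1']
```

The single shared ready pool is what lets instructions from multiple networks interleave on one accelerator, which matches the multi-NN execution goal stated above.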
On-campus access: available from 2027-08-10.