
Graduate Student: Huang, Sheng-Jie (黃聖傑)
Thesis Title: Dynamic Execution Accelerator Design for Multi-NNs (多神經網路動態執行加速器之設計)
Advisor: Jou, Jer-Min (周哲民)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2022
Graduation Academic Year: 110
Language: Chinese
Number of Pages: 54
Chinese Keywords: 多神經網路、機器學習、動態指令排程、加速器
English Keywords: Multi-Neural Networks, Machine Learning, Dynamic Instruction Scheduling, Accelerator
    Multi-neural-network execution has become one of the most important design directions for today's neural network accelerators. For example, a cloud service provider may design a single heterogeneous artificial intelligence (AI) accelerator that provides a neural network training environment to a large number of users, thereby improving cost-effectiveness. An ideal AI accelerator should therefore be able to execute multiple neural networks while fully utilizing the hardware resources on the accelerator. However, there can be fundamental differences between neural network models, mainly because different models have different parameters and computation patterns.
    In this thesis, we propose the design of a dynamic execution accelerator for multiple neural networks. It converts the operations of each layer of a neural network model into instructions, and these instructions then serve as the basis for the accelerator's execution. In the hardware design we also implement a dynamic instruction execution controller, in which the priority among instructions is determined with reference to memory resources, thereby mitigating the memory bandwidth problem caused by the large amount of data transferred by neural networks. Finally, experiments show that our multi-neural-network dynamic execution accelerator can convert different neural network models into instructions and realize dynamic scheduling to decide the execution priority of those instructions.
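    The record does not reproduce the thesis's actual instruction format, so the following is only a minimal sketch of the general idea of lowering each layer of a model into one accelerator instruction; LayerInstr, compile_model, and all field names are hypothetical illustrations, not the thesis's instruction set.

# Illustrative sketch only: the thesis's real instruction format is not given in this
# record. LayerInstr, compile_model, and the field names below are hypothetical.
from dataclasses import dataclass

@dataclass
class LayerInstr:
    op: str        # e.g. "CONV", "POOL", "FC"
    in_addr: int   # on-chip buffer address of the layer's input feature map
    out_addr: int  # on-chip buffer address of the layer's output feature map
    params: dict   # layer parameters (kernel size, stride, ...)

def compile_model(layers):
    """Lower each layer of a simple network description into one instruction,
    using a naive bump allocation of buffer addresses purely for illustration."""
    instrs, addr = [], 0
    for layer in layers:
        instrs.append(LayerInstr(op=layer["type"],
                                 in_addr=addr,
                                 out_addr=addr + layer["in_size"],
                                 params=layer))
        addr += layer["in_size"]
    return instrs

# A LeNet-5-like layer list lowered to an instruction stream
lenet5 = [
    {"type": "CONV", "in_size": 32 * 32,     "kernel": 5, "stride": 1},
    {"type": "POOL", "in_size": 28 * 28 * 6, "kernel": 2, "stride": 2},
    {"type": "FC",   "in_size": 400,         "out": 120},
]
for instr in compile_model(lenet5):
    print(instr.op, instr.in_addr, instr.out_addr)

    In a scheme of this kind it is the instruction stream, rather than a hard-wired layer sequence, that the accelerator executes, which is what lets several different networks share one device.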

    Multi-neural-network execution has become one of the most important design directions for neural network accelerators. For example, a cloud service provider can design a single heterogeneous artificial intelligence (AI) accelerator that provides a neural network training environment to a large number of users, thereby improving cost-effectiveness. An ideal AI accelerator should therefore be able to execute a variety of neural networks while making full use of the hardware resources on the accelerator. However, there may be essential differences between neural network models, mainly because different models have different parameters and operation modes.
    In this paper, we propose the design of a multi-neural-network dynamic execution accelerator, which converts the operations of each layer in a neural network model into instructions and then uses those instructions as the execution basis of the accelerator. At the same time, we implement a dynamic instruction execution controller in the hardware design, in which the priority among instructions is based on memory resources, thereby reducing the memory bandwidth problem caused by the transmission of large amounts of data in neural networks. Finally, experiments show that our multi-neural-network dynamic execution accelerator can convert different neural network models into instructions and realizes dynamic scheduling to determine the execution priority of the instructions.
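    The abstract states only that instruction priority is determined with reference to memory resources; as a hedged sketch of what such a memory-aware dynamic scheduler might look like when several networks share one accelerator, the Python fragment below always dispatches the ready instruction that needs the least additional off-chip data (the schedule function, the on_chip set, and the instruction fields are assumptions, not the thesis's actual design).

# Illustrative sketch only: the thesis describes its priority rule just as being
# "based on memory resources"; schedule, on_chip, and the fields are hypothetical.
import heapq

def schedule(instr_queues, on_chip):
    """Among the head instructions of several networks' queues, dynamically pick
    the one whose operands need the least extra off-chip traffic."""
    order = []
    while any(instr_queues):
        ready = []
        for net_id, queue in enumerate(instr_queues):
            if not queue:
                continue
            instr = queue[0]
            # Priority key: bytes still to be fetched from off-chip memory
            missing = sum(sz for buf, sz in instr["operands"] if buf not in on_chip)
            heapq.heappush(ready, (missing, net_id))
        _, net_id = heapq.heappop(ready)
        instr = instr_queues[net_id].pop(0)
        on_chip.update(buf for buf, _ in instr["operands"])  # operands now resident
        on_chip.add(instr["output"])
        order.append((net_id, instr["name"]))
    return order

# Two networks sharing one accelerator: instructions whose data is already
# on-chip are dispatched first.
q0 = [{"name": "conv1", "operands": [("ifm0", 4096)], "output": "ofm0"},
      {"name": "conv2", "operands": [("ofm0", 2048)], "output": "ofm1"}]
q1 = [{"name": "fc1", "operands": [("ifm1", 8192)], "output": "ofm2"}]
print(schedule([q0, q1], on_chip={"ifm0"}))

    Favouring instructions whose operands are already resident on-chip is one simple way to cut off-chip traffic, which matches the bandwidth motivation given in the abstract.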

    Abstract
    SUMMARY
    OUR PROPOSED DESIGN
    EXPERIMENTS
    CONCLUSION
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1  Introduction
    1.1  Research Background
    1.2  Research Motivation and Purpose
    1.3  Thesis Organization
    Chapter 2  Background Knowledge and Related Work
    2.1  Common Neural Networks
    2.1.1  Convolutional Neural Network (CNN)
    2.1.2  Recurrent Neural Network (RNN)
    2.1.3  Long Short-Term Memory (LSTM)
    2.2  Neural Network Accelerator Optimization Methods
    2.2.1  Data Reuse
    2.2.2  Data Parallelism
    2.2.3  Model Parallelism
    2.3  Coarse-Grained Dynamic Execution
    Chapter 3  Analysis of an Accelerator Based on a Dynamic Instruction Execution Control Unit
    3.1  Analysis of the Dynamic Instruction Execution Control Unit
    3.1.1  Analysis of Converting Neural Networks into Instructions
    3.1.2  Analysis of Dynamic On-Chip Memory Selection Control
    3.2  Dataflow and Pipelining Analysis of the Convolution Architecture
    3.2.1  Data Reuse and Computational Parallelism
    3.2.2  Input and Output Dataflows
    Chapter 4  Design of an Accelerator Based on a Dynamic Instruction Execution Control Unit
    4.1  Hardware Design of the Hierarchical Pipeline Controller
    4.1.1  First-Level Pipeline Controller
    4.1.2  Second-Level Pipeline Controller
    4.2  Design of the Dynamic Instruction Execution Controller
    4.3  Design of the Convolution Architecture
    4.3.1  Allocation and Scheduling Design of Convolution Operations
    4.3.2  16-32-16-bit Mixed-Precision Systolic Array
    Chapter 5  Experimental Environment and Results
    5.1  Development Environment
    5.2  Building the LeNet-5 and VGG-16 Network Architectures in Python
    5.3  Experimental Results of the Multi-Neural-Network Dynamic Execution Accelerator
    5.3.1  Instruction-Based Execution of LeNet-5
    5.3.2  Instruction-Based Execution of VGG-16
    5.3.3  Synthesis with Design Compiler
    Chapter 6  Conclusion and Future Work
    References

    On campus: to be released 2027-08-10
    Off campus: to be released 2027-08-10
    The electronic thesis has not yet been authorized for public release; for the printed copy, please consult the library catalog.