
Author: Shen, Yu-Tong (沈育同)
Thesis Title: A High-Performance VLSI Implementation of Local Binary Convolutional Neural Network (高效能局部二值化卷積神經網路之超大型積體電路實現)
Advisor: Lin, Ing-Chao (林英超)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2019
Graduation Academic Year: 108
Language: English
Pages: 36
Keywords: convolutional neural network, local binary convolutional neural network, artificial intelligence (AI), IC design, sparsity
In LBCNN, the original convolutional layer is decomposed into two smaller convolutional sublayers: the first sublayer is a convolutional layer with sparse ternary weights, and the second is a 1x1 convolutional layer. Through this decomposition, LBCNN achieves lower computational complexity and memory usage than a conventional CNN.

To further improve performance, hardware circuits for LBCNN have been proposed. Since the first sublayer is a ternary-weighted convolutional layer, its convolutions require only additions, so the previously proposed circuit processes multiple additions in a single cycle to raise overall performance. However, it still requires a large number of memory accesses, leaving substantial room for improvement.

In this thesis, we propose a high-performance LBCNN hardware architecture. It exploits the extreme sparsity of LBCNN to skip most unnecessary computations and reduce memory usage, and its circuit architecture is designed to reduce the number of memory accesses. Experimental results show that, compared with the previously proposed architecture, our design reduces area by 20%, addition operations by 93%, multiplication operations by 47%, and the total number of clock cycles by 86%.

In order to simplify convolutional neural networks (CNNs), the local binary convolutional neural network (LBCNN) has been proposed.
In LBCNN, a convolutional layer is divided into two sublayers, Sublayer 1 and Sublayer 2. Sublayer 1 is a sparse ternary-weighted convolutional layer, and Sublayer 2 is a 1x1 convolutional layer. By using the two sublayers, LBCNN has lower computational complexity and lower memory usage than a conventional CNN.
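As a concrete (if simplified) picture of this decomposition, the following is a minimal NumPy sketch of one LBCNN layer. The filter shapes, the assumed 90% sparsity, the ReLU between the sublayers, and all function names are illustrative assumptions, not details taken from this thesis.

```python
# Minimal sketch of one LBCNN layer: a sparse, ternary-weighted convolution
# (Sublayer 1) followed by a learnable 1x1 convolution (Sublayer 2).
import numpy as np

def make_sparse_ternary_filters(num_filters, in_ch, k, sparsity=0.9, seed=0):
    """Random filters with weights in {-1, 0, +1}; 'sparsity' is the
    fraction of zero weights (an assumed value, not from the thesis)."""
    rng = np.random.default_rng(seed)
    p_nz = (1.0 - sparsity) / 2.0
    w = rng.choice([-1.0, 0.0, 1.0], size=(num_filters, in_ch, k, k),
                   p=[p_nz, sparsity, p_nz])
    return w.astype(np.float32)

def conv2d_valid(x, w):
    """Naive 'valid' convolution (cross-correlation, as in DL frameworks).
    x: (C, H, W), w: (F, C, k, k) -> (F, H-k+1, W-k+1)."""
    f, c, k, _ = w.shape
    _, h, wd = x.shape
    out = np.zeros((f, h - k + 1, wd - k + 1), dtype=np.float32)
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + k, j:j + k]  # (C, k, k) input window
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def lbcnn_layer(x, ternary_w, pointwise_w):
    # Sublayer 1: ternary weights -> only additions/subtractions in hardware
    mid = conv2d_valid(x, ternary_w)
    mid = np.maximum(mid, 0.0)  # nonlinearity between sublayers (assumed ReLU)
    # Sublayer 2: 1x1 convolution = per-pixel linear mix of Sublayer 1 maps
    f, h, w = mid.shape
    return (pointwise_w @ mid.reshape(f, h * w)).reshape(-1, h, w)

x = np.random.default_rng(1).random((3, 8, 8), dtype=np.float32)
y = lbcnn_layer(x, make_sparse_ternary_filters(16, 3, 3),
                np.ones((8, 16), dtype=np.float32) / 16)  # (8, 6, 6) output
```

In the original LBCNN formulation, the ternary filters are fixed at initialization and only the 1x1 weights are trained, which is where the parameter and memory savings over a standard convolutional layer come from.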
To further improve performance, an accelerator circuit designed for LBCNN has been proposed. Because Sublayer 1 of LBCNN is ternary-weighted, its convolutions require only addition operations, and the accelerator performs several additions per cycle to improve overall performance. However, it still requires a significant number of memory accesses, which leaves considerable room for improvement.

In this work, we propose an accelerator architecture that takes advantage of the sparsity in LBCNN, and we design a layer accelerator to reduce computational complexity and memory accesses. Experimental results show that, along with a 20% area reduction, addition operations are reduced by 93%, multiplication operations are reduced by 47%, and clock cycles are reduced by 86%.
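A small sketch can make the source of these savings concrete: because every Sublayer 1 weight is -1, 0, or +1, each output value is just a signed sum of selected inputs, so preprocessing the weights into (index, sign) pairs removes the multiplications and skips the zero entries entirely. The pair encoding below is a hypothetical illustration; the actual format produced by the thesis's weight preprocessing module is not specified here.

```python
# Sketch of multiplier-free, zero-skipping evaluation of a ternary dot
# product, the kind of saving a sparsity-aware LBCNN accelerator exploits.
# The (index, sign) encoding is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(1)
w = rng.choice([-1.0, 0.0, 1.0], size=27, p=[0.05, 0.9, 0.05]).astype(np.float32)
x = rng.random(27, dtype=np.float32)  # one flattened 3x3x3 input window

def preprocess_ternary(w):
    """Keep only the nonzero weights, as (index, sign) pairs."""
    idx = np.flatnonzero(w)
    return idx, np.sign(w[idx]).astype(np.int8)

def ternary_dot(x, idx, sign):
    """Dot product without multiplications: one addition or subtraction
    per *nonzero* weight, instead of len(x) multiply-accumulates."""
    acc = np.float32(0.0)
    for i, s in zip(idx, sign):
        acc = acc + x[i] if s > 0 else acc - x[i]
    return acc

idx, sign = preprocess_ternary(w)
assert np.isclose(ternary_dot(x, idx, sign), np.dot(x, w))
print(f"{len(idx)} of {len(w)} weights are nonzero")  # ~90% of work skipped
```

At roughly 90% sparsity the loop touches only about a tenth of the inputs and never multiplies, mirroring in software the reductions in additions, multiplications, and memory traffic that the proposed hardware achieves.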

Chinese Abstract ..... i
Abstract ..... ii
Table of Contents ..... iii
List of Tables ..... v
List of Figures ..... vi
Chapter 1 Introduction ..... 1
Chapter 2 Background ..... 4
  2.1 CNN and BNN ..... 4
  2.2 Local Binary Convolutional Neural Networks ..... 6
Chapter 3 Proposed Platform ..... 8
  3.1 Platform Overview ..... 8
  3.2 Weight Preprocessing Module ..... 9
  3.3 Hardware Design of the Layer Accelerator ..... 11
    3.3.1 Sublayer 1 Architecture ..... 12
    3.3.2 Sublayer 2 Architecture ..... 14
Chapter 4 Experimental Setup and Results ..... 17
  4.1 Implementation in ASIC ..... 17
    4.1.1 Experimental Setup ..... 17
    4.1.2 Clock Cycle Comparison ..... 19
    4.1.3 Computations Comparison ..... 21
    4.1.4 Memory Usage Comparison ..... 22
    4.1.5 Clock Period, Cell Area, and Power Comparison ..... 23
  4.2 Implementation in FPGA ..... 24
    4.2.1 Experimental Setup ..... 24
    4.2.2 Demo Platform ..... 26
    4.2.3 Resources Utilization ..... 27
Chapter 5 Conclusion ..... 34
References ..... 35


Full-text availability: on campus, open access from 2025-01-21; off campus, not available.
The electronic thesis has not been authorized for public release; please check the library catalog for the print copy.