
Graduate Student: Chen, Hsi-Ling (陳皙蔆)
Thesis Title: Model Compression and Acceleration of CNN-based Architectures (基於CNN架構之模型壓縮與加速)
Advisor: Yang, Jar-Ferr (楊家輝)
Degree: Master
Department: Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2023
Academic Year of Graduation: 111
Language: English
Number of Pages: 45
Chinese Keywords: 深度學習、卷積類神經網路、模型壓縮與加速、剪枝演算法
English Keywords: deep learning, convolutional neural networks, model compression and acceleration, pruning algorithms

Abstract (Chinese):
Since their introduction, convolutional neural networks have been applied extensively to a wide range of computer vision tasks with remarkable success. However, the heavy computation required by convolution operations and the large number of parameters introduced by the rest of the network configuration severely limit the applicability of these models. Constrained by the small storage space and limited computing power of edge devices, many excellent convolutional neural networks cannot be deployed on mobile devices or on AR and VR glasses. Since the key factor affecting the performance of a convolutional neural network is the design of its convolutional filters, we propose the Adaptable Pruning Method, which uses the Block-wise Pruning (BP) algorithm to explore the optimal structure of the convolutional filters and greatly reduce the redundancy within them, thereby achieving model compression and acceleration. Considering the usage scenarios of different tasks, we further propose three algorithms, namely the Delayed Zero Block-wise Pruning (DZBP) algorithm, the Structured Block-wise Pruning (SBP) algorithm, and the Input-dependent Block-wise Pruning (IBP) algorithm, so that the model can retain the best possible accuracy, or the best compression and acceleration, under different deployment conditions. Experimental results show that the proposed methods can greatly reduce the number of parameters and the computational complexity of mainstream convolutional neural network architectures, and that the most suitable algorithm can be chosen to compress and accelerate a model under different conditions.

Abstract (English):
Convolutional neural networks have been widely used in a variety of computer vision tasks with great success. However, due to the small storage space and insufficient computing power of edge devices, many excellent convolutional neural networks cannot be deployed on devices such as AR and VR glasses. Since the key factor affecting the performance of a convolutional neural network is the design of its convolutional filters, we propose the Adaptable Pruning Method, which explores the optimal structure of the convolutional filters with the Block-wise Pruning (BP) algorithm and significantly reduces the redundancy in the filters to achieve model compression and acceleration. Moreover, we further propose Delayed Zero Block-wise Pruning (DZBP), Structured Block-wise Pruning (SBP), and Input-dependent Block-wise Pruning (IBP) for different downstream task scenarios. With these algorithms, the model can maintain the best performance together with compression and acceleration under different deployment conditions. Experimental results show that the proposed methods significantly reduce the number of parameters and the computational complexity of mainstream convolutional neural network architectures, and that model compression and acceleration can be achieved under different conditions by choosing the most suitable algorithm.
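
To make the block-wise pruning idea concrete, the sketch below (PyTorch) shows one possible reading of it: each convolutional filter's weights are split into fixed-size contiguous blocks, every block is scored by its L1 norm, and the lowest-scoring blocks are zeroed. The block size, the L1 importance criterion, and the keep ratio are illustrative assumptions; the thesis's actual BP algorithm and its selection criteria are not reproduced here.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def block_wise_prune(conv: nn.Conv2d, block_size: int = 4, keep_ratio: float = 0.5) -> None:
    """Zero the lowest-importance weight blocks inside each convolutional filter.

    Illustrative only: blocks are contiguous groups of `block_size` weights and
    importance is the block's L1 norm; both are assumptions rather than the
    thesis's actual BP criteria.
    """
    weight = conv.weight                                    # shape: (out_ch, in_ch, kH, kW)
    out_ch = weight.shape[0]
    flat = weight.reshape(out_ch, -1).clone()               # one row of weights per filter
    n_blocks = flat.shape[1] // block_size                  # any ragged tail is left unpruned
    span = n_blocks * block_size
    blocks = flat[:, :span].reshape(out_ch, n_blocks, block_size)

    scores = blocks.abs().sum(dim=2)                        # L1 importance of every block
    k = max(1, int(round(n_blocks * keep_ratio)))           # blocks kept per filter
    cutoff = scores.topk(k, dim=1).values[:, -1:]           # per-filter importance threshold
    mask = (scores >= cutoff).unsqueeze(-1).to(flat.dtype)  # 1 = keep block, 0 = prune block

    flat[:, :span] = (blocks * mask).reshape(out_ch, span)
    weight.copy_(flat.reshape_as(weight))

# Usage: prune half of the weight blocks in every convolution of a toy model,
# then report the resulting overall weight sparsity.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 32, 3, padding=1))
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        block_wise_prune(m, block_size=4, keep_ratio=0.5)
zeros = sum((p == 0).sum().item() for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"overall weight sparsity: {zeros / total:.2%}")
```

Note that zeroed blocks only translate into real memory and latency savings with sparse kernels or a structured storage layout, which is presumably part of the motivation for the structured (SBP) and input-dependent (IBP) variants.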

Table of Contents:
摘要 (Chinese Abstract)
Abstract
Acknowledgements
Contents
List of Tables
List of Figures
Chapter 1 Introduction
    1.1 Research Background
    1.2 Motivations
    1.3 Thesis Organization
Chapter 2 Related Work
    2.1 Knowledge Distillation
    2.2 Low-rank Approximation
    2.3 Quantization
    2.4 Model Pruning
        2.4.1 Filter-wise Pruning
        2.4.2 Channel-wise Pruning
        2.4.3 Stripe-wise Pruning
Chapter 3 The Proposed Adaptable Pruning Method
    3.1 Problem Formulation
    3.2 Adaptable Pruning Method
        3.2.1 Block-wise Pruning Algorithm
        3.2.2 Delayed Zero Block-wise Pruning Algorithm
        3.2.3 Structured Block-wise Pruning Algorithm
        3.2.4 Input-dependent Block-wise Pruning Algorithm
        3.2.5 Fine-tune for BP Algorithm
Chapter 4 Experiment Results
    4.1 Environment Settings
    4.2 Comparing Results
        4.2.1 Experimental Results of BP Algorithm
        4.2.2 Evaluations of the Proposed Algorithms
        4.2.3 Ablation Study
    4.3 Pruning Precision Depth Estimation Network
        4.3.1 Model Architecture and Experimental Settings
        4.3.2 Experimental Results
Chapter 5 Conclusions
Chapter 6 Future Work
References

Full text available on campus: 2028-08-13
Full text available off campus: 2028-08-13
The electronic thesis has not yet been authorized for public release; please consult the library catalog for the printed copy.