| Graduate Student: | 鄭子皇 Cheng, Zih-Huang |
|---|---|
| Thesis Title: | 應用逐位元量化感知訓練稀疏化神經網路 (Bit-Wise Quantization-Aware Training for Sparsifying Neural Networks) |
| Advisor: | 郭致宏 Kuo, Chih-Hung |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of Publication: | 2023 |
| Academic Year of Graduation: | 111 (ROC calendar) |
| Language: | Chinese |
| Pages: | 60 |
| Keywords: | Convolutional Neural Network, Quantization-Aware Training, Computing-In-Memory |
Deep neural networks trained on large amounts of data have achieved results beyond expectation in fields such as image classification, object recognition, semantic segmentation, and natural language processing. In the pursuit of accuracy, however, model complexity keeps rising, adding a burden in both parameter count and computation. This aggravates the Von Neumann bottleneck, and many studies have therefore proposed in-situ computation inside memory units, known as Computing-In-Memory (CIM), to reduce frequent data movement between memory and processor. However, the large number of multiply-accumulate operations required by deep neural networks forces analog CIM designs to use high-resolution analog-to-digital converters (ADCs) when converting results back to the digital domain, which in turn incurs high area cost and energy consumption. This thesis proposes a quantization-aware training method that targets individual weight bits and enhances bit-level sparsity to reduce the values accumulated inside the CIM macro, so that digitization needs to resolve fewer bits and ADC power consumption drops accordingly. Because the targeted CIM macro computes on two's complement weights, the weights are trained in the corresponding two's complement form to raise their sparsity. Experimental results show that an 8-bit integer VGG-16 trained on CIFAR-10 reaches 98.28% bit-level sparsity while maintaining 93.78% accuracy. Exploiting this sparsity to lower ADC power effectively relieves the power-consumption bottleneck of Computing-In-Memory.
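To make the reported bit-level sparsity metric concrete, the following is a minimal sketch, not the thesis's implementation: it quantizes a weight tensor to 8-bit two's complement integers, decomposes it into bit planes, and measures the fraction of zero bits. The function names, the symmetric per-tensor quantization scheme, and the random example weights are assumptions for illustration only.

```python
# Minimal sketch (assumptions noted above): quantize weights to 8-bit two's
# complement, split them into bit planes, and report the fraction of zero bits,
# i.e. the bits a CIM bit line would not need to accumulate.
import torch

def quantize_int8(w: torch.Tensor) -> torch.Tensor:
    """Hypothetical symmetric per-tensor quantization to signed 8-bit integers."""
    scale = w.abs().max().clamp(min=1e-8) / 127.0
    return torch.clamp(torch.round(w / scale), -128, 127).to(torch.int16)

def bit_level_sparsity(w_int8: torch.Tensor, n_bits: int = 8) -> float:
    """Fraction of zero bits in the two's complement representation."""
    # Map negative values into the unsigned 8-bit range (two's complement view).
    u = w_int8.to(torch.int64) & 0xFF
    zero_bits = 0
    for b in range(n_bits):
        bit_plane = (u >> b) & 1          # extract bit b of every weight
        zero_bits += (bit_plane == 0).sum().item()
    return zero_bits / (w_int8.numel() * n_bits)

if __name__ == "__main__":
    torch.manual_seed(0)
    w = torch.randn(64, 64) * 0.05        # stand-in for one layer's weights
    print(f"bit-level sparsity: {bit_level_sparsity(quantize_int8(w)):.4f}")
```

When most bits in a bit plane are zero, the partial sums accumulated on a CIM bit line stay small, which is why fewer ADC output bits need to be resolved, as argued in the abstract.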