| Graduate student: | 徐仁瓏 Hsu, Jen-Lung |
|---|---|
| Thesis title: | 基於統計架構之冗餘感知自適應層剪枝方法:應用於影像分類與深偽檢測 Redundancy-Aware Adaptive Layer Pruning Based on a Statistical Framework for Image Classification and Deepfake Detection |
| Advisors: | 許志仲 Hsu, Chih-Chung; 鄭順林 Jeng, Shuen-Lin |
| Degree: | 碩士 Master |
| Department: | 管理學院 - 數據科學研究所 College of Management - Institute of Data Science |
| Year of publication: | 2025 |
| Academic year of graduation: | 113 |
| Language: | English |
| Number of pages: | 117 |
| Chinese keywords: | 層剪枝、統計檢定、影像分類、深偽影像檢測 |
| English keywords: | Layer Pruning, Statistical Tests, Image Classification, Deepfake Detection |
This study proposes a novel statistical adaptive layer pruning method that removes redundant structure from neural networks to improve compression efficiency. Unlike conventional pruning techniques that rely on repeated trials, hyperparameter tuning, or predefined pruning ratios, the method applies two statistical tests, the Mann–Whitney U test and the Kolmogorov–Smirnov test, directly to the feature distributions of each residual block in a pre-trained model. Redundant layers are identified and removed automatically, without manual thresholds or additional data, which lowers pruning cost and keeps the pruning decisions interpretable.
The method is validated on image classification and deepfake detection tasks across the CIFAR-10, CIFAR-100, ImageNet, FaceForensics++, and Celeb-DF datasets, using variants of the ResNet and EfficientNetV2 architectures. Experiments show that the pruning ratio adapts to task difficulty and model depth, improving accuracy alongside compression in most settings while remaining robust across pruning levels, which indicates stability and generalization ability.
Further evaluations compare inference latency and pruning search time and analyze the trade-off between performance and compression, while Grad-CAM visualizations illustrate improved attention behavior after pruning. Together, these results support the method's advantages in compression efficiency, computational cost, and feature learning.
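The redundancy check described in the abstract can be sketched roughly as follows. This is a minimal standard-library Python illustration, not the thesis implementation. The function names (`mann_whitney_p`, `ks_two_sample_p`, `block_is_redundant`), the significance level `alpha=0.05`, the choice to compare a block's input features with its output features, and the rule "redundant when neither test rejects" are all assumptions made for this example, as is the toy deterministic data. The p-values use textbook large-sample approximations (normal approximation without tie correction for Mann–Whitney, asymptotic Kolmogorov series for Kolmogorov–Smirnov).

```python
import math
from bisect import bisect_left, bisect_right


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ys = sorted(y)
    # U counts pairs (xi, yj) with xi > yj; tied pairs count one half.
    u = 0.0
    for v in x:
        lo, hi = bisect_left(ys, v), bisect_right(ys, v)
        u += lo + 0.5 * (hi - lo)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))


def ks_two_sample_p(x, y):
    """Two-sample Kolmogorov-Smirnov p-value (asymptotic Kolmogorov series)."""
    n1, n2 = len(x), len(y)
    xs, ys = sorted(x), sorted(y)
    # D is the largest gap between the two empirical CDFs.
    d = max(abs(bisect_right(xs, t) / n1 - bisect_right(ys, t) / n2)
            for t in xs + ys)
    lam = d * math.sqrt(n1 * n2 / (n1 + n2))
    if lam < 0.2:
        # The series below is numerically 1 in this range and converges poorly.
        return 1.0
    q = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return min(1.0, max(0.0, q))


def block_is_redundant(feat_in, feat_out, alpha=0.05):
    """Assumed rule: a block is redundant if neither test detects a shift."""
    return (mann_whitney_p(feat_in, feat_out) >= alpha
            and ks_two_sample_p(feat_in, feat_out) >= alpha)


# Toy stand-ins for flattened activations before and after a residual block.
n = 200
feat_in = [(i + 0.5) / n for i in range(n)]
near_identity_out = [v + 1e-4 for v in feat_in]   # block barely changes features
transforming_out = [v + 0.3025 for v in feat_in]  # block shifts the distribution

print("near-identity block redundant:", block_is_redundant(feat_in, near_identity_out))
print("transforming block redundant:", block_is_redundant(feat_in, transforming_out))
```

On this toy data the near-identity block is flagged as redundant and the shifting block is not. In practice the features would be collected from a pre-trained network, and a library routine such as SciPy's `mannwhitneyu` and `ks_2samp` would likely replace the hand-written approximations.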
[1] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
[2] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (NeurIPS), volume 25, pages 1097–1105, 2012.
[3] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–9, 2015.
[4] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR), 2015.
[5] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
[6] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2261–2269, 2017.
[7] Forrest N Iandola, Song Han, Matthew W Moskewicz, Khalid Ashraf, William J Dally, and Kurt Keutzer. Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5mb model size. arXiv preprint arXiv:1602.07360, 2016.
[8] Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.
[9] François Chollet. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1800–1807, 2017.
[10] Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun. Shufflenet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[11] Mingxing Tan and Quoc V Le. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the 36th International Conference on Machine Learning (ICML), 2019.
[12] Song Han, Jeff Pool, John Tran, and William J. Dally. Learning both weights and connections for efficient neural networks. In Advances in Neural Information Processing Systems (NeurIPS), volume 1, pages 1135–1143, 2015.
[13] Yunchao Gong, Liu Liu, Ming Yang, and Lubomir Bourdev. Compressing deep convolutional networks using vector quantization. arXiv preprint arXiv:1412.6115, 2014.
[14] Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
[15] Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. Pruning filters for efficient convnets. In International Conference on Learning Representations (ICLR), 2017.
[16] Zehao Huang and Naiyan Wang. Data-driven sparse structure selection for deep neural networks. In European Conference on Computer Vision (ECCV), pages 317–334. Springer, 2018.
[17] Shaohui Lin, Rongrong Ji, Chenqian Yan, Baochang Zhang, Liujuan Cao, Qixiang Ye, Feiyue Huang, and David Doermann. Towards optimal structured cnn pruning via generative adversarial learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2785–2794, 2019.
[18] Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, and Ling Shao. Hrank: Filter pruning using high-rank feature map. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1529–1538, 2020.
[19] Yang He, Guoliang Kang, Xuanyi Dong, Yanwei Fu, and Yi Yang. Soft filter pruning for accelerating deep convolutional neural networks. In International Joint Conference on Artificial Intelligence (IJCAI), pages 2234–2240, 2018.
[20] Shaochen Zhong, Guanqun Zhang, Ningjia Huang, and Shuai Xu. Revisit kernel pruning with lottery regulated grouped convolutions. In International Conference on Learning Representations (ICLR), 2022.
[21] Zhiqiang He, Yaguan Qian, Yuqi Wang, Bin Wang, Xiaohui Guan, Zhaoquan Gu, Xiang Ling, Shaoning Zeng, Haijiang Wang, and Wujie Zhou. Filter pruning via feature discrimination in deep neural networks. In Computer Vision – ECCV 2022, pages 245–261. Springer, 2022.
[22] Yang He, Ping Liu, Ziwei Wang, Zhilan Hu, and Yi Yang. Filter pruning via geometric median for deep convolutional neural networks acceleration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4335–4344, 2019.
[23] Jian-Hao Luo, Jianxin Wu, and Weiyao Lin. Thinet: A filter level pruning method for deep neural network compression. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 5068–5076, 2017.
[24] Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han, and Chenggang Yan. Approximated oracle filter pruning for destructive cnn width optimization. In International Conference on Machine Learning (ICML), pages 1607–1616, 2019.
[25] Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, Feiyue Huang, and Qixiang Ye. Network pruning using adaptive exemplar filters. IEEE Transactions on Neural Networks and Learning Systems, 33(12):7357–7366, 2022.
[26] Manuel Nonnenmacher, Thomas Pfeil, Ingo Steinwart, and David Reeb. Sosp: Efficiently capturing global correlations by second-order structured pruning. In International Conference on Learning Representations (ICLR), 2022.
[27] Manoj Alwani, Yang Wang, and Vashisht Madhavan. Decore: Deep compression with reinforcement learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12339–12349, 2022.
[28] Di Jiang, Yuan Cao, and Qiang Yang. On the channel pruning using graph convolution network for convolutional neural network acceleration. In International Joint Conference on Artificial Intelligence (IJCAI), 2022.
[29] Yushuo Guan, Ning Liu, Pengyu Zhao, Zhengping Che, Kaigui Bian, Yanzhi Wang, and Jian Tang. Dais: Automatic channel pruning via differentiable annealing indicator search. IEEE Transactions on Neural Networks and Learning Systems, 34(12):9847–9858, 2023.
[30] Jianbo Ye, Xin Lu, Zhe Lin, and James Z. Wang. Rethinking the smaller-norm-less-informative assumption in channel pruning of convolution layers. In International Conference on Learning Representations (ICLR), 2018.
[31] Shi Chen and Qi Zhao. Shallowing deep networks: Layer-wise pruning based on feature representations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(12):3048–3056, 2019.
[32] Wenxiao Wang, Shuai Zhao, Minghao Chen, Jinming Hu, Deng Cai, and Haifeng Liu. Dbp: Discrimination based block-level pruning for deep model acceleration. arXiv preprint arXiv:1912.10178, 2019.
[33] Pengtao Xu, Jian Cao, Fanhua Shang, Wenyu Sun, and Pu Li. Layer pruning via fusible residual convolutional block for deep neural networks. arXiv preprint arXiv:2011.14356, 2020.
[34] Yao Lu, Wen Yang, Yunzhe Zhang, Zuohui Chen, Jinyin Chen, Qi Xuan, Zhen Wang, and Xiaoniu Yang. Understanding the dynamics of dnns using graph modularity. In European Conference on Computer Vision (ECCV), pages 225–242. Springer, 2022.
[35] Hui Tang, Yao Lu, and Qi Xuan. Sr-init: An interpretable layer pruning method. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5, 2023.
[36] Sara Elkerdawy, Mostafa Elhoushi, Abhineet Singh, Hong Zhang, and Nilanjan Ray. One-shot layer-wise accuracy approximation for layer pruning. In Proceedings of the IEEE International Conference on Image Processing (ICIP), pages 2940–2944. IEEE, 2020.
[37] Ke Zhang and Guangzhe Liu. Layer pruning for obtaining shallower resnets. IEEE Signal Processing Letters, 29:1172–1176, 2022.
[38] Yao Zhou, Gary G. Yen, and Zhang Yi. Evolutionary shallowing deep neural networks at block levels. IEEE Transactions on Neural Networks and Learning Systems, 33(9):4635–4647, 2022.
[39] Artur Jordao, Maiko Lie, and William Robson Schwartz. Discriminative layer pruning for convolutional neural networks. IEEE Journal of Selected Topics in Signal Processing, 14(4):828–837, 2020.
[40] Xuanyi Dong and Yi Yang. Network pruning via transformable architecture search. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
[41] Sara Elkerdawy, Mostafa Elhoushi, Abhineet Singh, Hong Zhang, and Nilanjan Ray. To filter prune, or to layer prune, that is the question. pages 737–753, 2020.
[42] Henry B Mann and Donald R Whitney. On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1):50–60, 1947.
[43] Frank J Massey Jr. The kolmogorov-smirnov test for goodness of fit. Journal of the American Statistical Association, 46(253):68–78, 1951.
[44] Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
[45] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Fei-Fei Li. Imagenet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2009.
[46] Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. Faceforensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE International Conference on Computer Vision (ICCV). IEEE, 2019.
[47] Yuezun Li, Xin Yang, Pu Sun, Honggang Qi, and Siwei Lyu. Celeb-df: A large-scale challenging dataset for deepfake forensics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[48] Michael Zhu and Suyog Gupta. To prune, or not to prune: Exploring the efficacy of pruning for model compression. arXiv preprint arXiv:1710.01878, 2017.
[49] NVIDIA Corporation. Nvidia a100 tensor core gpu architecture. https://www.nvidia.com/en-us/data-center/a100/, 2020. Whitepaper.
[50] Guillaume Alain and Yoshua Bengio. Understanding intermediate layers using linear classifier probes. arXiv preprint arXiv:1610.01644, 2018.
[51] W.J. Conover. Practical Nonparametric Statistics. John Wiley & Sons, 1999.
[52] Xuanyi Dong, Junshi Huang, Yi Yang, and Shuicheng Yan. More is less: A more complicated network with less inference complexity. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017.
[53] Yossi Rubner, Carlo Tomasi, and Leonidas J Guibas. The earth mover’s distance as a metric for image retrieval. International Journal of Computer Vision, 40(2):99–121, 2000.
[54] Cédric Villani. Optimal Transport: Old and New, volume 338 of Grundlehren der mathematischen Wissenschaften. Springer, 2009.
[55] Carlo Bonferroni. Teoria statistica delle classi e calcolo delle probabilità. Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 8:3–62, 1936.
[56] Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision (ICCV), 2017.
[57] Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, and Matthias Nießner. Face2face: Real-time face capture and reenactment of rgb videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016.
[58] Justus Thies, Michael Zollhöfer, and Matthias Nießner. Deferred neural rendering: Image synthesis using neural textures. ACM Transactions on Graphics (TOG), 38(4):66:1–66:12, 2019.
[59] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2014.
[60] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), volume 9351 of Lecture Notes in Computer Science, pages 234–241. Springer, 2015.
[61] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pretraining of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), pages 4171–4186. Association for Computational Linguistics, 2019.
[62] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems (NeurIPS), pages 6000–6010. Curran Associates, Inc., 2017.