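Graduate student: | 董穎禪 Tung, Ying-Chan |
---|---|
Thesis title: | 運用卷積神經網路於HEVC快速編碼單元大小決策之研究 A Study of Effective CU Size Decision for HEVC Based on Convolutional Neural Networks |
Advisor: | 郭致宏 Kuo, Chih-Hung |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2019 |
Academic year of graduation: | 107 |
Language: | Chinese |
Number of pages: | 115 |
Keywords (Chinese): | HEVC, 編碼單元大小 (coding unit size), 卷積神經網路 (convolutional neural network), 機器學習 (machine learning), 樸素貝葉斯 (Naive Bayes) |
Keywords (English): | HEVC, CU size, CNN, Machine learning, Naive Bayes |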
High Efficiency Video Coding (HEVC) achieves a higher compression ratio than the previous video coding standard, H.264/AVC, but its heavy computational burden makes real-time encoding difficult. To reduce the computational load, we propose a fast coding unit (CU) size decision method based on a convolutional neural network (CNN). The CNN's strong ability to extract texture features makes it well suited to intra prediction. In addition, the features of the fully connected layers are combined with coding information generated during the mode-checking process; because this information carries temporal correlation, it helps decide the CU size for inter prediction. This thesis examines three CNN architectures together with their CU size decision methods. For smaller CUs, a simple Naive Bayes approach is used for size decision. Experimental results show that, compared with HM-16.15 in the Low-delay main configuration, our method saves 38.19% of the encoding time on average, at the cost of a 1.91% increase in BDBR and a 0.066 dB decrease in BDPSNR.