Author: 董穎禪 (Tung, Ying-Chan)
Thesis title: A Study of Effective CU Size Decision for HEVC Based on Convolutional Neural Networks (運用卷積神經網路於HEVC快速編碼單元大小決策之研究)
Advisor: 郭致宏 (Kuo, Chih-Hung)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Pages: 115
Keywords: HEVC, CU size, CNN, machine learning, Naive Bayes
  • High Efficiency Video Coding (HEVC) achieves a higher compression ratio than the previous-generation standard H.264/AVC, but its heavy computational load makes real-time encoding difficult. To reduce this load, we propose a fast coding unit (CU) size decision method based on convolutional neural networks (CNNs). A CNN extracts texture features well, which suits intra prediction. For inter prediction, the features of the fully connected layers are combined with coding information generated during the mode checking process; this information carries temporal correlation and helps decide the CU size. This thesis examines three CNN architectures and their corresponding block size decision methods. For smaller CUs, a simple Naive Bayes approach predicts the CU size. Experimental results show that, compared with HM-16.15 in the Low-delay main configuration, our method saves 38.19% of the encoding time on average, with a 1.91% increase in BDBR and a 0.066 dB decrease in BDPSNR.
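    The abstract says only that Naive Bayes is used for the size decision of smaller CUs; it does not give the features, the probability model, or the update schedule. As a rough illustration of the general idea, here is a minimal sketch of a categorical Naive Bayes split / no-split classifier with Laplace smoothing. The class name `NaiveBayesSplit`, the two features (a Skip-mode flag and a neighboring-depth value), and the toy training samples are all assumptions made for this example, not the thesis's actual design.

```python
import math
from collections import defaultdict


class NaiveBayesSplit:
    """Categorical Naive Bayes for a binary split (1) / no-split (0) decision.

    Hypothetical illustration only; the features and smoothing are assumptions.
    """

    def __init__(self, n_values):
        # n_values[i] = number of distinct values feature i can take (assumed known)
        self.n_values = n_values
        self.class_count = [0, 0]
        # feat_count[c][i][v] = how often feature i had value v in class c
        self.feat_count = [[defaultdict(int) for _ in n_values] for _ in range(2)]

    def update(self, features, split):
        """Add one observed CU: its feature tuple and whether it was split."""
        self.class_count[split] += 1
        for i, v in enumerate(features):
            self.feat_count[split][i][v] += 1

    def log_posterior(self, features, c):
        """Unnormalized log posterior of class c, with add-one smoothing."""
        total = sum(self.class_count)
        lp = math.log((self.class_count[c] + 1) / (total + 2))
        for i, v in enumerate(features):
            num = self.feat_count[c][i].get(v, 0) + 1
            den = self.class_count[c] + self.n_values[i]
            lp += math.log(num / den)
        return lp

    def predict_split(self, features):
        """Return True when 'split' is the more probable class."""
        return self.log_posterior(features, 1) > self.log_posterior(features, 0)


# Toy data: (skip_flag, neighbor_depth) -> split label. Invented for illustration.
nb = NaiveBayesSplit(n_values=[2, 4])
samples = [
    ((1, 0), 0), ((1, 0), 0), ((1, 1), 0),
    ((0, 3), 1), ((0, 2), 1), ((0, 3), 1),
]
for feats, label in samples:
    nb.update(feats, label)

print(nb.predict_split((0, 3)))  # non-skipped CU with deep neighbors
print(nb.predict_split((1, 0)))  # skipped CU with shallow neighbors
```

    In an encoder, a decision like this would let the mode search skip either the current depth or the deeper depths when one class is clearly more probable; how that is thresholded against rate-distortion loss is a design choice the abstract does not describe.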

    Chinese Abstract
    English Abstract
    Table of Contents
    Chapter 1 Introduction
    1.1 Preface
    1.2 Research Background
    1.2.1 Coding Structure
    1.2.1.1 Coding Unit (CU)
    1.2.1.2 Prediction Unit (PU)
    1.2.1.3 Transform Unit (TU)
    1.2.2 HEVC Encoding Flow
    1.2.3 Prediction Modes
    1.2.3.1 Intra Prediction Mode
    1.2.3.2 Inter Prediction Mode
    1.2.4 Motion Estimation
    1.2.5 Motion Compensation
    1.2.6 Rate-Distortion Cost Function
    1.2.7 Fast Encoding Methods Adopted in HEVC
    1.2.7.1 Early CU Setting (ECU)
    1.2.7.2 Coded Block Flag Fast Mode (CFM)
    1.2.7.3 Early Skip Detection (ESD)
    1.2.8 Neural Networks
    1.2.8.1 Artificial Neural Networks
    1.2.8.2 Back Propagation
    1.2.8.3 Convolutional Neural Networks (CNN)
    1.2.8.4 Activation Function
    1.2.8.5 Batch Normalization
    1.2.9 Naïve Bayes Classifier
    1.3 Research Motivation and Contributions
    1.4 Thesis Organization
    Chapter 2 Literature Review of Fast HEVC Encoding
    2.1 Fast CU Size Selection Algorithms for HEVC
    2.2 Fast CU Splitting and Termination Methods for HEVC Intra Coding
    2.3 Fast CU Selection Algorithm Based on Motion Variation
    2.4 Fast CU Splitting Method Based on Bayesian Decision Theory
    2.5 CU Early Termination Algorithm Based on Support Vector Machines
    2.6 Fast Intra-Frame Block Size Decision Based on Convolutional Neural Networks
    2.7 Summary
    Chapter 3 Fast CU Size Decision Algorithm Based on Machine Learning
    3.1 Statistics and Analysis
    3.2 Convolutional Neural Networks for Fast CU Size Decision
    3.2.1 Convolutional Neural Networks
    3.2.1.1 Block-Level Split Probability Prediction Network
    3.2.1.2 Root-Level Split Probability Prediction Network
    3.2.1.3 Rate-Distortion Cost Prediction Network
    3.2.1.4 Comparison of Convolutional Neural Networks
    3.2.2 Coding Parameters
    3.2.2.1 Neighboring Block Depth
    3.2.2.2 Rate-Distortion Cost
    3.2.2.3 Skip Mode Flag
    3.2.2.4 Motion Vector (MV)
    3.2.2.5 Quantization Parameter (QP)
    3.2.2.6 Picture Boundary
    3.2.2.7 Comparison of Coding Parameters
    3.3 Naive Bayes Fast Encoding Method Using Coding Information
    3.3.1 Selection of Coding Information
    3.3.2 Data Collection Methods
    3.3.2.1 Fixed Number of Frames
    3.3.2.2 Periodic Update
    3.3.2.3 Periodic Adjustment with Scene Change Detection
    3.3.2.4 Comparison of Data Collection Methods
    3.4 Algorithm Summary
    3.4.1 Overall Algorithm
    3.4.2 Algorithm Steps
    Chapter 4 Experimental Results and Analysis
    4.1 Experimental Environment and Test Sequences
    4.1.1 Experimental Environment
    4.1.2 Low-delay Configuration File
    4.1.3 Training Data
    4.1.4 Test Sequences
    4.2 Experimental Results and Comparison of the Algorithms
    4.2.1 Experimental Results of the CNN Acceleration Method
    4.2.1.1 Encoding Results of the CNN Acceleration Method
    4.2.1.2 Timing Analysis of the CNN Acceleration Method
    4.2.2 Experimental Results of the Naive Bayes Acceleration Method
    4.2.2.1 Encoding Results for General Sequences
    4.2.2.2 Encoding Results for Scene-Change Sequences
    4.2.3 Comparison of the CNN and Naive Bayes Acceleration Methods
    4.2.4 Experimental Results of the Overall Algorithm
    Chapter 5 Conclusion and Future Work
    5.1 Conclusion
    5.2 Future Work
    References

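    Full text availability: on campus, public from 2021-09-01; off campus, not public.
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.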