
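Graduate student: Shie, Yi-Shian (謝毅賢)
Thesis title: Parallel Implementation for In-Loop Filter of HEVC on GPU (高效能視訊編碼標準之環內濾波器在圖形處理器上平行實現)
Advisor: Kuo, Chih-Hung (郭致宏)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2014
Academic year of graduation: 102 (ROC academic year)
Language: Chinese
Number of pages: 109
Keywords (translated from Chinese): High Efficiency Video Coding, In-Loop Filter, Parallel Processing, Graphics Processing Unit
Keywords (English): HEVC, In-Loop Filter, Parallel Processing, GPU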
  • This thesis proposes a parallel program architecture that runs on both the GPU and the CPU to reduce the execution time of the HEVC in-loop filter, which consists of the de-blocking filter and sample adaptive offset (SAO). For the de-blocking filter, we exploit edge-level data parallelism to filter block edges in parallel, bypassing the quadtree block-partitioning algorithm and the z-scan processing order. For SAO, we split the algorithm by the function of each processing step into three units: statistics calculation, parameter decision, and sample compensation. In statistics calculation, we use atomic addition and parallel reduction to overcome the problem of accumulating into memory in parallel. In parameter decision, we replace context-adaptive binary arithmetic coding (CABAC) with an information estimation function so that the bitrate can be estimated in parallel. Finally, sample compensation applies the offset to every sample in parallel, based on sample-level data parallelism. Experimental results show that the proposed architecture achieves a 5.0x speedup for the de-blocking filter and a 10.5x speedup for SAO.
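    The statistics calculation step described in the abstract can be illustrated with a small sketch. This is not the thesis's OpenCL code: it is a plain Python simulation, under the assumption of a single row of samples and the horizontal edge-offset class only. The names `eo_category`, `chunk_stats`, `tree_reduce`, and `parallel_style_stats` are hypothetical. Each chunk stands in for one work-group that accumulates private per-category statistics, and the partial results are then merged pairwise, which is the idea behind parallel reduction. An atomic-addition variant would instead have every work-item add directly into one shared table.

```python
from typing import List, Sequence, Tuple

NUM_CATEGORIES = 5  # 0 = no offset applied, 1..4 = edge-offset categories

# Per category: (sum of original - reconstructed, number of samples)
Stats = List[Tuple[int, int]]


def sign(x: int) -> int:
    return (x > 0) - (x < 0)


def eo_category(left: int, cur: int, right: int) -> int:
    """Classify `cur` against its two neighbours along one direction."""
    edge_idx = 2 + sign(cur - left) + sign(cur - right)
    # edge_idx 0: local minimum      -> category 1
    # edge_idx 1: below one, equal to the other neighbour -> category 2
    # edge_idx 2: flat or monotonic  -> category 0
    # edge_idx 3: above one, equal to the other neighbour -> category 3
    # edge_idx 4: local maximum      -> category 4
    return (1, 2, 0, 3, 4)[edge_idx]


def chunk_stats(orig: Sequence[int], recon: Sequence[int],
                start: int, stop: int) -> Stats:
    """Statistics one hypothetical work-group accumulates privately."""
    sums = [0] * NUM_CATEGORIES
    counts = [0] * NUM_CATEGORIES
    # Border samples are skipped because they lack one neighbour.
    for i in range(max(start, 1), min(stop, len(recon) - 1)):
        cat = eo_category(recon[i - 1], recon[i], recon[i + 1])
        sums[cat] += orig[i] - recon[i]
        counts[cat] += 1
    return list(zip(sums, counts))


def merge(a: Stats, b: Stats) -> Stats:
    return [(sa + sb, ca + cb) for (sa, ca), (sb, cb) in zip(a, b)]


def tree_reduce(partials: Sequence[Stats]) -> Stats:
    """Pairwise merge; each pass roughly halves the number of partials."""
    items = list(partials)
    if not items:
        return [(0, 0)] * NUM_CATEGORIES
    while len(items) > 1:
        merged = [merge(items[i], items[i + 1])
                  for i in range(0, len(items) - 1, 2)]
        if len(items) % 2:  # odd one out is carried to the next pass
            merged.append(items[-1])
        items = merged
    return items[0]


def parallel_style_stats(orig: Sequence[int], recon: Sequence[int],
                         chunk: int = 4) -> Stats:
    partials = [chunk_stats(orig, recon, s, s + chunk)
                for s in range(0, len(recon), chunk)]
    return tree_reduce(partials)
```

    Because every merge in this sketch is an independent integer addition, the passes of `tree_reduce` could run concurrently without contention on a shared counter, whereas an atomic-addition scheme serializes updates that hit the same category. Which of the two is faster on a given GPU is an empirical question that the sketch does not answer.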

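    Chinese Abstract
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1-1 Research Motivation
      1-2 Research Contributions
      1-3 Thesis Organization
    Chapter 2 Background
      2-1 Overview of the De-blocking Filter (DBF)
        2-1-1 Boundary Strength Calculation
        2-1-2 Filtering Decisions
        2-1-3 Filtering Operations
        2-1-4 Filtering Parameters
      2-2 Overview of Sample Adaptive Offset (SAO)
        2-2-1 Edge Offset (EO)
        2-2-2 Band Offset (BO)
        2-2-3 Syntax Elements
        2-2-4 Fast Distortion Estimation Method
        2-2-5 Slice-level On/Off Control
      2-3 Related Work
        2-3-1 Comparison of Three Parallel Processing Methods for the De-blocking Filter
        2-3-2 Parallel Processing Based on Directed Acyclic Graph Ordering
        2-3-3 Parallel Processing Based on CTU-level and CTU-row-level Data Parallelism
      2-4 Open Computing Language (OpenCL)
        2-4-1 Work-item Configuration
        2-4-2 Memory Hierarchy
        2-4-3 GPU and CPU Cooperation
    Chapter 3 Parallel Program Architecture for the De-blocking Filter
      3-1 De-blocking Filter Processing Flow and Data Dependency Analysis
      3-2 Design of the Parallel Program Architecture for the De-blocking Filter
      3-3 Parallel Luma Edge Filtering Algorithm
      3-4 Parallel Chroma Edge Filtering Algorithm
    Chapter 4 Parallel Program Architecture for Sample Adaptive Offset
      4-1 Sample Adaptive Offset Processing Flow
      4-2 Design of the Parallel Program Architecture for Sample Adaptive Offset
      4-3 Parallel Statistics Calculation Algorithm
        4-3-1 Parallel Statistics Calculation Based on Atomic Addition
        4-3-2 Improving Parallel Statistics Calculation with Atomic Addition Combined with Concurrent Classification
        4-3-3 Adding Local Memory to Reduce the Memory Access Time of Atomic Addition
        4-3-4 Parallel Statistics Calculation Based on Parallel Reduction
        4-3-5 Improving Parallel Statistics Calculation with Parallel Reduction Combined with Concurrent Classification
      4-4 Parallel Program Architecture for Parameter Decision
        4-4-1 Information Bit Estimation Based on Prior Off-line Statistics
        4-4-2 Program Architecture for Parallel Computation of Offset, Distortion, and Bitrate
        4-4-3 Program Architecture for Parallel New-Parameter Decision
        4-4-4 Program Architecture for Parallel Parameter Decision
      4-5 Parallel Sample Compensation Algorithm
    Chapter 5 Simulation Environment and Experimental Results
      5-1 Test Conditions and Experimental Platform
      5-2 Simulation Results of the Parallel Program Architecture for the De-blocking Filter
      5-3 Simulation Results of the Parallel Program Architecture for Sample Adaptive Offset
        5-3-1 Comparison of Parallel Statistics Calculation Algorithms
        5-3-2 Comparison of Parallel Program Architectures for Parameter Decision
        5-3-3 Comparison of Sample Adaptive Offset
      5-4 Decoder-side In-Loop Filter Parallel Program Architecture and Simulation Results
    Chapter 6 Conclusions and Future Work
      6-1 Conclusions
      6-2 Future Work
    References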


    Full-text availability:
    On campus: public from 2016-09-10
    Off campus: public from 2016-09-10