Graduate student: | 曾昭雄 Tseng, Chao-Hsuing |
---|---|
Thesis title: | 用於先進視訊編碼之離散餘弦轉換的設計 Designs of Discrete Cosine Transform for Advanced Video Coding |
Advisor: | 楊家輝 Yang, Jar-Ferr |
Degree: | Doctor |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2006 |
Academic year of graduation: | 94 (ROC calendar) |
Language: | English |
Number of pages: | 147 |
Keywords (Chinese): | 先進視訊編碼, 離散餘弦轉換 |
Keywords (English): | Advanced Video Coding, Discrete Cosine Transform |
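This dissertation proposes several fast discrete cosine transform (DCT) algorithms and integer transforms that reduce the computational complexity of video codecs and achieve better energy compaction. The work first develops fast two-dimensional DCT and inverse DCT (IDCT) algorithms whose regular, modular structure reduces computational complexity. It then presents a complete theory for designing integer discrete cosine transforms (integer DCTs) and integer orthogonal discrete cosine transforms (IODCTs) with better energy compaction, which improves video codec performance. Finally, it proposes an "enhanced rate-distortion cost function" for intra prediction in the H.264/AVC encoder, which reduces the computation of mode decision and improves coding performance. The methods are described in detail below.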
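The proposed fast DCT algorithms are derived with a direct-computation approach based on a regular quad-matrix process. By repeatedly decomposing and recombining this process, the algorithms extend easily to higher-order DCT and IDCT. Because the process is regular, all the heavy computation can be mapped onto combinations of one identical computational kernel, and each kernel requires only three multipliers and eight adders. Owing to their high regularity and low computational cost, the algorithms perform well in both software and hardware implementations.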
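Integer transforms have been studied extensively because they avoid the numerical drift of the DCT. Among those reported in the literature, the integer transforms adopted in the various versions of H.264/AVC have attracted the most attention. We present a complete theory for developing integer DCTs and IODCTs, with which we can find integer transforms that offer better compaction and lower computational cost. Using a recursive design method, we obtain many sets of IODCTs and their fast algorithms, which are determined by the choice of normalization factors and kernel cosine integers. We use the compaction coding gain as the measure of energy compaction for selecting the most suitable transform. We find that the integer transforms used in H.264/AVC, whose properties are close to those of the original DCT, all belong to the IODCT family. Simulation results show that the proposed IODCTs achieve better energy compaction than the original DCT and the H.264/AVC integer transforms, and therefore better coding performance. Given their efficient computation and better energy compaction, we believe IODCTs can be used effectively in advanced video coding systems.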
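In the H.264 advanced video coding (AVC) standard, intra prediction, which predicts a block from its neighboring blocks, is a very important compression tool. To find the best prediction mode, the H.264 reference software offers, besides the rate-distortion optimized (RDO) criterion, the sum of absolute differences (SAD) and the sum of absolute transformed differences (SATD), the latter computed with the Hadamard transform. The best modes found with these two criteria, however, give worse coding performance than those found with the RDO criterion. We therefore propose an "enhanced rate-distortion cost function" for the intra 4x4 prediction modes of the H.264/AVC encoder to improve coding performance. We also exploit the linearity of the transform and the directional spatial relations among the predicted pixels in each mode to develop fast SATD and SAITD algorithms that reduce computation. Simulation results show that the best modes found with the enhanced cost function give better coding performance than those found with SAD or SATD and, at low bit rates, even approach the performance of the RDO criterion. In addition, the fast SATD algorithm saves about 54% of the computation for intra 4x4 mode decision, and the fast SAITD algorithm saves about 30%.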
In this dissertation, we propose several fast discrete cosine transform algorithms and integer transforms that reduce the computational complexity of video coders and achieve better energy compaction. First, fast two-dimensional discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) algorithms are proposed to reduce computational complexity with a regular and modular architecture. Then, a systematic design procedure for integer discrete cosine transforms (integer DCTs) and integer orthogonal discrete cosine transforms (IODCTs) is proposed to achieve better energy compaction and improve video coder performance. Finally, an enhanced rate-distortion cost function is proposed to improve coding performance in H.264/AVC intra mode decision. The details are as follows.
The proposed fast DCT and IDCT algorithms use the direct computation approach and are based on a regular quad-matrix process. Because the decomposition and reconstruction procedures can be applied repeatedly, the algorithms extend easily to higher-order DCT and IDCT computations. With these regularized procedures, all the heavy computation is realized by a single type of computational kernel, and each kernel requires three multiplications and eight additions. With a highly regular architecture and low computational complexity, the proposed algorithms show their advantages in both software and hardware implementations.
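For orientation, the following is a minimal reference sketch of the separable two-dimensional DCT-II and its inverse, the computation that fast algorithms of this kind accelerate. It is a plain row-column matrix form, not the quad-matrix algorithm of the dissertation, and the function names (`dct_matrix`, `dct2`, `idct2`) are illustrative assumptions.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        # The DC row (k = 0) uses a smaller scale so that all rows have unit norm.
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """Forward 2-D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT of a square coefficient block: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)
```

This direct form costs on the order of N^3 multiplications for an N x N block, which is the cost that regular, kernel-based fast algorithms aim to reduce.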
Integer transforms, which are free of drifting problems, have been widely investigated. Among these studies, the integer transforms in the various versions of H.264/AVC have attracted the most attention. We propose a systematic design procedure for integer discrete cosine transforms (integer DCTs) and integer orthogonal discrete cosine transforms (IODCTs). Based on the proposed methods, we can design integer transforms with better compaction ability and lower computational complexity. With a recursive design method, we obtain many IODCTs and their reduced-computation forms. The IODCTs depend on the selection of normalization factors and cosine kernel integers. We use the compaction coding gain as the criterion for evaluating energy compaction and selecting a proper discrete transform. We find that the well-known integer transforms suggested in the H.264/AVC coder, which closely approximate the original DCT, all belong to the IODCTs. Simulations show that the proposed IODCTs achieve better energy compaction and coding performance than the original DCT and the integer transforms in the H.264 coder. With the advantages of computational efficiency and energy compaction, we believe that the proposed IODCTs can be used efficiently and effectively in advanced video coding systems.
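As a rough illustration of the coding-gain criterion, the sketch below compares the 4-point DCT with the widely published 4x4 core transform matrix of H.264/AVC. It assumes a first-order autoregressive source with correlation coefficient `rho` and the common definition of coding gain as the ratio of the arithmetic to the geometric mean of the transform-coefficient variances. Both are assumptions made here for illustration; the dissertation's exact source model and its IODCT matrices are not given in this abstract.

```python
import math

# Widely published 4x4 forward core transform of H.264/AVC (integer entries).
H264_CORE = [[1, 1, 1, 1],
             [2, 1, -1, -2],
             [1, -1, -1, 1],
             [1, -2, 2, -1]]


def is_row_orthogonal(m):
    """True if every pair of distinct rows has a zero dot product."""
    return all(sum(a * b for a, b in zip(m[i], m[j])) == 0
               for i in range(len(m)) for j in range(i + 1, len(m)))


def normalize_rows(m):
    """Scale each row to unit norm; this plays the role of the normalization factors."""
    out = []
    for row in m:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm for v in row])
    return out


def dct_matrix(n):
    """Orthonormal n-point DCT-II matrix."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos((2 * i + 1) * k * math.pi / (2 * n))
             for i in range(n)] for k in range(n)]


def coding_gain_db(t, rho):
    """Coding gain in dB of an orthonormal transform t for an AR(1) source.

    The source covariance is assumed to be R[i][j] = rho ** |i - j|.
    """
    n = len(t)
    variances = [sum(row[i] * row[j] * rho ** abs(i - j)
                     for i in range(n) for j in range(n))
                 for row in t]
    arithmetic = sum(variances) / n
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return 10.0 * math.log10(arithmetic / geometric)


gain_dct = coding_gain_db(dct_matrix(4), 0.9)
gain_int = coding_gain_db(normalize_rows(H264_CORE), 0.9)
```

Under these assumptions the two gains differ by only a small fraction of a decibel, which is consistent with the statement that the H.264/AVC integer transform closely approximates the DCT.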
In the H.264 advanced video coding (AVC) standard, intra prediction plays an important role in compressing intra frames by referring to the surrounding coded blocks. Mode decision with either the SAD or the SATD criterion suggested in the reference software yields worse coding performance than the RD-optimized criterion. We first propose an enhanced cost function for intra 4x4 mode decision in H.264/AVC. We then develop fast algorithms for computing the SATD and the SAITD, which reduce computation by using the linearity of the transform and the fixed spatial relation of the predicted pixels in each intra mode. Simulation results show that when the enhanced cost function is used to select the best mode, the coding performance is better than with the SAD or SATD criterion and is very close to that of the RD-optimized criterion at low bit rates. Moreover, the fast SATD algorithm reduces the computation of the original SATD algorithm by about 54% for intra 4x4 mode decision, and the fast SAITD algorithm reduces the computation of the original SAITD algorithm by about 30% when computing the enhanced cost function.
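To make the SAD and SATD criteria concrete, here is a minimal sketch for 4x4 blocks using the 4-point Hadamard matrix. It also shows one way the linearity of the transform can be used: the original block is transformed once and reused for every candidate prediction. This is an illustrative assumption, not the fast algorithm of the dissertation, and any scaling factor that a real encoder applies to the SATD is omitted. All function names are hypothetical.

```python
# 4-point Hadamard matrix in sequency order (symmetric, entries +1 or -1).
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def hadamard2d(block):
    """Unnormalized 2-D Hadamard transform of a 4x4 block: H * B * H^T."""
    tmp = [[sum(H4[u][k] * block[k][j] for k in range(4)) for j in range(4)]
           for u in range(4)]
    return [[sum(tmp[u][k] * H4[v][k] for k in range(4)) for v in range(4)]
            for u in range(4)]


def residual(orig, pred):
    return [[orig[r][c] - pred[r][c] for c in range(4)] for r in range(4)]


def sad(orig, pred):
    """Sum of absolute differences in the pixel domain."""
    return sum(abs(v) for row in residual(orig, pred) for v in row)


def satd(orig, pred):
    """Sum of absolute Hadamard-transformed differences (unscaled)."""
    return sum(abs(v) for row in hadamard2d(residual(orig, pred)) for v in row)


def satd_from_transformed(orig_t, pred):
    """Same value as satd(), given the already transformed original block.

    By linearity, H(O - P)H^T = H O H^T - H P H^T, so H O H^T can be
    computed once and shared across all candidate prediction modes.
    """
    pred_t = hadamard2d(pred)
    return sum(abs(orig_t[u][v] - pred_t[u][v])
               for u in range(4) for v in range(4))
```

A mode-decision loop built on this sketch would call `hadamard2d` on the original block once and then evaluate `satd_from_transformed` for each candidate mode. Further savings of the kind the abstract describes would come from exploiting the structure of each mode's predicted block, which is not modeled here.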