| Graduate student: | 張世承 Chang, Shih-Cheng |
|---|---|
| Thesis title: | 視訊編碼之有效率位元控制與快速動態預估 Efficient Rate Control and Fast Motion Estimation for Video Encoders |
| Advisors: | 楊家輝 Yang, Jef-Ferr; 黃正能 Hwang, Jenq-Neng |
| Degree: | Doctor (博士) |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2003 |
| Academic year of graduation: | 91 |
| Language: | Chinese |
| Number of pages: | 99 |
| Keywords (Chinese): | 位元控制、動態預估、視訊編碼 |
| Keywords (English): | rate control, motion estimation, video coding |
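In this thesis, we propose an efficient rate control mechanism that keeps the bit rate of compressed video within a very small range of variation. We also propose a fast motion estimation method.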
In this thesis, we propose several techniques to improve video compression quality. First, the human visual system (HVS) is incorporated into rate control to improve the visual quality of reconstructed video sequences. Next, a fast motion estimation method is proposed to improve compression speed. Finally, a predictive rate control scheme based on DCT indices is applied to improve buffer fullness and reduce frame skipping.
First, a new rate control scheme based on human visual perception is proposed to enhance visual quality in very low bit-rate video coding. The human visual perception model, developed in the DCT domain, jointly considers the human visual system and image activity measurements. Whereas the test model TMN8 is developed for optimal allocation of encoded bits according to variance, the proposed coding control scheme is designed for optimal allocation according to visual power. Simulations demonstrate that, under the same bit-rate constraints, the proposed scheme achieves better visual quality than TMN8 at the same PSNR while encoding more frames.
Second, another rate control scheme based on human visual perception is proposed to enhance the visual quality of very low bit-rate video transmission. Using weighted DCT coefficients, the scheme can readily account for both the human visual system and skin enhancement, achieving higher perceived image quality. In comparison with the test model TMN8, which allocates bits according to variance, the proposed coding control scheme is designed to allocate bits according to visual power. Simulation results demonstrate that, under the same bit-rate constraints, the proposed scheme encodes the same number of frames as TMN8 with nearly the same PSNR but achieves better visual quality.
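The idea of steering bits by weighted DCT coefficients rather than by pixel variance can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `hvs_weight` is a placeholder sensitivity curve (the abstract does not give the thesis weights), `visual_power` is one plausible reading of "visual power", and `allocate_bits` is a simple proportional split, not the optimization developed in the thesis.

```python
import math

N = 8  # block size used for the DCT


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def hvs_weight(u, v):
    """Hypothetical placeholder weight that decays with spatial frequency."""
    return 1.0 / (1.0 + 0.25 * math.hypot(u, v))


def visual_power(block):
    """Sum of squared, weighted AC coefficients (the DC term is skipped)."""
    coeffs = dct2(block)
    return sum((hvs_weight(u, v) * coeffs[u][v]) ** 2
               for u in range(N) for v in range(N)
               if (u, v) != (0, 0))


def allocate_bits(blocks, frame_budget):
    """Split a frame bit budget across blocks in proportion to visual power."""
    powers = [visual_power(b) for b in blocks]
    total = sum(powers)
    if total == 0:
        return [frame_budget / len(blocks)] * len(blocks)
    return [frame_budget * p / total for p in powers]
```

With a flat block and a checkerboard block, nearly the whole budget goes to the checkerboard, since the flat block has almost no weighted AC energy. A skin-region emphasis could be added by scaling the power of blocks flagged as skin, but how the thesis does this is not stated in the abstract.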
Third, computation reduction algorithms for motion estimation are developed for low bit-rate video coders. By jointly considering motion estimation, the discrete cosine transform (DCT), and quantization, all-zero and zero-motion detection algorithms are proposed to reduce the computation of motion estimation. Simulation results show that many unnecessary computations in motion estimation are eliminated. For the Akiyo sequence in QCIF format (176×144), the average number of full-search points is 787.88. With the proposed algorithms, for example, the three-step search and diamond search algorithms reduce their average numbers of search points per macroblock from 28.90 and 4.99 to 2.89 and 1.40, respectively.
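How an early-exit test cuts the number of search points can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis algorithm: `sad`, `in_bounds`, and `three_step_search` are hypothetical names, and `zero_thresh` is a placeholder threshold. In the thesis the detection is derived jointly with the DCT and quantizer, and that derivation is not reproduced here.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def in_bounds(ref, bx, by, dx, dy, n):
    """True if the displaced n x n block lies fully inside `ref`."""
    h, w = len(ref), len(ref[0])
    return 0 <= bx + dx and bx + dx + n <= w and 0 <= by + dy and by + dy + n <= h


def three_step_search(cur, ref, bx, by, n=16, step=4, zero_thresh=None):
    """Return (motion_vector, points_checked) for one block.

    If `zero_thresh` is given and the SAD at zero displacement is below it,
    the search stops after that single point (zero-motion shortcut).
    """
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    points = 1
    if zero_thresh is not None and best_cost < zero_thresh:
        return best, points
    while step >= 1:
        cx, cy = best  # centre of this refinement stage
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # centre was already evaluated
                mx, my = cx + dx, cy + dy
                if not in_bounds(ref, bx, by, mx, my, n):
                    continue
                cost = sad(cur, ref, bx, by, mx, my, n)
                points += 1
                if cost < best_cost:
                    best_cost, best = cost, (mx, my)
        step //= 2
    return best, points
```

On a block that does not move between frames, the shortcut evaluates one point instead of the 25 points a three-stage search evaluates when all candidates are in bounds, which is the kind of saving the reported averages reflect for a mostly static sequence such as Akiyo.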
Finally, we propose a model-based rate predictor built on the indices of quantized DCT coefficients. Because it can characterize detailed changes during encoding, the proposed rate predictor provides more precise rate estimation than traditional rate control methods based on pixel variance. From the models of the proposed rate predictor, the optimal quantization step size can be easily obtained for a given bit-rate constraint. A rate control mechanism based on this precise rate predictor is then developed. Built upon the MPEG-4 video verification model (VM), our simulations show that the proposed algorithm achieves superior rate control and, at the same time, improves the PSNR of the reconstructed video signal.
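Obtaining a quantization step from a rate model and a bit budget can be sketched as below. The abstract does not give the thesis model, so this sketch swaps in the quadratic rate-quantization model commonly associated with MPEG-4 VM rate control, R(Q) = X1·MAD/Q + X2·MAD/Q². The names `predicted_bits` and `solve_quantizer`, the assumption that X1 and X2 are non-negative, and the 1 to 31 quantizer range are illustrative assumptions.

```python
import math


def predicted_bits(q, mad, x1, x2):
    """Quadratic rate model: bits predicted for quantizer q (assumed model)."""
    return x1 * mad / q + x2 * mad / (q * q)


def solve_quantizer(target_bits, mad, x1, x2, q_min=1, q_max=31):
    """Smallest integer quantizer in [q_min, q_max] whose predicted rate does
    not exceed target_bits, assuming x1 >= 0 and x2 >= 0.

    Solves target_bits * q^2 - (x1 * mad) * q - (x2 * mad) = 0 for the
    positive root, then rounds up and clamps to the allowed range.
    """
    if target_bits <= 0:
        return q_max  # no budget: use the coarsest quantizer
    a = x1 * mad
    b = x2 * mad
    disc = a * a + 4.0 * target_bits * b
    q = (a + math.sqrt(disc)) / (2.0 * target_bits)
    return min(q_max, max(q_min, math.ceil(q)))
```

Rounding up keeps the predicted rate at or below the target whenever the result is not clamped at `q_max`, because the modelled rate decreases as the quantizer grows. A predictor driven by quantized DCT indices, as in the thesis, would replace `predicted_bits` while the solve-then-clamp structure stays the same.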