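Graduate student: | 周韋志 Chou, Wei-Chih |
---|---|
Thesis title: | 在視訊編碼中利用顯著特點增強視覺呈現之研究 Enhancement of Visual Presentation by Prominent Features in Video Coding |
Advisor: | 郭致宏 Kuo, Chih-Hung |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2011 |
Graduation academic year: | 99 |
Language: | Chinese |
Number of pages: | 103 |
Keywords (Chinese): | scalable coding, region of interest, arbitrary video resizing, intra prediction, multiview and depth image compression |
Keywords (English): | Region-of-Interest, video resizing, intra-prediction, depth image |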
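This thesis proposes two methods that use prominent features to enhance visual presentation. The first preserves the original aspect ratio of the region of interest (ROI) when a video is resized. Applied to the Scalable Video Coding standard, it quickly and efficiently identifies moving objects in the video as the ROI. Experimental results show that the proposed method not only allows video to be resized arbitrarily but also keeps the original proportions of important prominent features, giving viewers better quality in the ROI. The small increase in decoding time shows that the proposed architecture has low computational complexity. The second is a compression method for multiview and depth images, comprising a fast decision on which intra-prediction mode to use and the prediction and construction of multiview images. The former predicts the mode from the edge directions of objects in the image content; the latter uses a single view and its corresponding depth image to predict the image of another view, replacing the heavy computation that prediction requires in current compression standards and thereby reducing encoding time.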
In this thesis, we propose two algorithms that use prominent features to enhance visual presentation in video coding. First, we propose a video resizing method for Scalable Video Coding (SVC) that keeps the original aspect ratio of the region of interest (ROI). The ROI is determined efficiently by extracting the moving objects in the video sequence. Experimental results show that our method not only allows arbitrary video resizing but also maintains the original aspect ratio of important prominent features. The small overhead in decoding time shows that the proposed framework has low computational complexity. Second, we present a new method for multiview and depth image compression that comprises a fast intra-prediction mode decision and multiview image reconstruction. The former is based on detecting the direction of object boundaries in the image. The latter uses a single view and its corresponding depth image to predict the image of the other view. The proposed scheme avoids much of the computation that prediction otherwise requires and thereby reduces encoding time.
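The abstract does not spell out how the ROI is extracted or how its aspect ratio is kept during resizing, so the following is only a minimal sketch under stated assumptions, not the thesis's method. It assumes that the ROI is the bounding box of pixels that change between two luma frames (simple frame differencing), and that the ROI is scaled uniformly while the background absorbs the remaining size change along each axis. The function names `moving_object_roi` and `resize_plan`, the `threshold` parameter, and the box layout are all hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive


def moving_object_roi(prev: List[List[int]], curr: List[List[int]],
                      threshold: int = 20) -> Box:
    """Bounding box of pixels whose luma changed by more than `threshold`.

    Assumption: plain frame differencing stands in for the thesis's
    moving-object extraction, which the abstract does not describe.
    """
    xs, ys = [], []
    for y, (row_p, row_c) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(row_p, row_c)):
            if abs(c - p) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:  # no motion detected: treat the whole frame as the ROI
        return (0, 0, len(curr[0]), len(curr))
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def resize_plan(src_w: int, src_h: int, dst_w: int, dst_h: int,
                roi: Box) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (roi_scale, bg_scale_x, bg_scale_y) for a target frame size.

    The ROI gets one uniform scale, so its aspect ratio is unchanged.
    The background is stretched or squeezed independently per axis so
    that the ROI plus the background fills the target size exactly.
    A background scale is None when the ROI spans that whole axis.
    """
    x0, y0, x1, y1 = roi
    roi_w, roi_h = x1 - x0, y1 - y0
    # The smaller of the two frame ratios guarantees the scaled ROI fits.
    roi_scale = min(dst_w / src_w, dst_h / src_h)
    bg_w, bg_h = src_w - roi_w, src_h - roi_h
    bg_scale_x = (dst_w - roi_w * roi_scale) / bg_w if bg_w else None
    bg_scale_y = (dst_h - roi_h * roi_scale) / bg_h if bg_h else None
    return roi_scale, bg_scale_x, bg_scale_y


if __name__ == "__main__":
    prev = [[0] * 4 for _ in range(4)]
    curr = [row[:] for row in prev]
    curr[1][1] = curr[2][2] = 255  # two "moving" pixels
    print(moving_object_roi(prev, curr))
    print(resize_plan(100, 100, 50, 100, (40, 40, 60, 60)))
```

Scaling the ROI by the smaller of the two frame ratios is one simple design choice: it keeps the background scale positive on both axes, at the cost of distorting the background whenever the target aspect ratio differs from the source.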