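Graduate student: Chou, Wei-Chih (周韋志)
Thesis title: Enhancement of Visual Presentation by Prominent Features in Video Coding (在視訊編碼中利用顯著特點增強視覺呈現之研究)
Advisor: Kuo, Chih-Hung (郭致宏)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2011
Academic year of graduation: 99
Language: Chinese
Pages: 103
Keywords (translated from Chinese): scalable coding, region of interest, arbitrary video resizing, intra-frame prediction, multiview and depth image compression
Keywords (English): Region-of-Interest, video resizing, intra-prediction, depth image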
  • This thesis proposes two methods that use prominent features to enhance visual presentation in video coding. The first is a video resizing method for Scalable Video Coding (SVC) that preserves the original aspect ratio of the region of interest (ROI). The ROI is determined quickly and efficiently by extracting the moving objects in the video sequence. Experimental results show that the method not only allows arbitrary video resizing but also maintains the original proportions of the important prominent features, giving viewers better quality in the ROI. The small increase in decoding time indicates that the proposed framework has low computational complexity. The second is a method for multiview and depth image compression, comprising a fast intra-prediction mode decision and multiview image prediction and reconstruction. The former predicts the mode from the direction of object edges in the image content. The latter uses a single view and its corresponding depth image to predict the image of another view, replacing the computationally heavy prediction used in current compression standards and thereby reducing encoding time.
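    The resizing idea described in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm. It assumes a rectangular ROI and a simple piecewise-linear layout in which the ROI is scaled by one common factor on both axes while the surrounding background absorbs the rest of the size change. The names `AxisPlan`, `plan_axis`, and `resize_plan`, and the `(x, y, width, height)` ROI layout, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AxisPlan:
    """Output lengths (in pixels) of the three spans along one axis."""
    before: float  # background before the ROI (left or top)
    roi: float     # the ROI itself
    after: float   # background after the ROI (right or bottom)


def plan_axis(src_len, dst_len, roi_start, roi_len, roi_scale):
    """Scale the ROI span by roi_scale and let the background absorb the rest."""
    background = src_len - roi_len
    if background <= 0:
        raise ValueError("ROI must be smaller than the frame on this axis")
    roi_out = roi_scale * roi_len
    # Whatever output length the ROI does not use is shared by the background.
    bg_scale = (dst_len - roi_out) / background
    after_src = src_len - roi_start - roi_len
    return AxisPlan(roi_start * bg_scale, roi_out, after_src * bg_scale)


def resize_plan(src_w, src_h, dst_w, dst_h, roi):
    """Return (horizontal, vertical) plans; roi = (x, y, width, height)."""
    x, y, w, h = roi
    # One common factor for both axes keeps the ROI aspect ratio unchanged.
    # Taking the smaller of the two frame ratios guarantees the ROI still fits.
    s = min(dst_w / src_w, dst_h / src_h)
    return plan_axis(src_w, dst_w, x, w, s), plan_axis(src_h, dst_h, y, h, s)
```

    For example, shrinking a 640×480 frame to 320×480 with a 240×240 ROI at (200, 120) yields a 120×120 ROI. The horizontal background is halved and the vertical background is stretched by 1.5, so the distortion falls only on the non-ROI area.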

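    Chinese Abstract
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1-1 Definition of Region of Interest and Image Resizing
      1-2 Introduction to 3D Video
      1-3 Research Motivation
        1-3-1 Image Resizing Based on the Region of Interest
        1-3-2 Fast 3D Video Compression
      1-4 Research Contributions
      1-5 Thesis Organization
    Chapter 2 Background
      2-1 Overview of Scalable Video Coding (SVC)
        2-1-1 SVC Codec
        2-1-2 Hierarchical B Frame
        2-1-3 Temporal Scalability
        2-1-4 Spatial Scalability
        2-1-5 SNR Scalability
      2-2 Flexible Macro-block Ordering
      2-3 H.264/AVC Video Encoder
      2-4 Introduction to Multi-view Video Coding
      2-5 Related Work
        2-5-1 ROI Definition Using Visual Rhythm Analysis
        2-5-2 Image Importance Maps and Resizing
        2-5-3 High-Efficiency Parallel Encoding and Decoding Architecture for Multiview Video
        2-5-4 Depth Image Compression Methods
        2-5-5 Fast Inter-Frame Prediction Mode
    Chapter 3 ROI Detection and Resizing
      3-1 Moving Regions in Video Sequences
      3-2 Implementation Flow of ROI Detection
        3-2-1 Contrast Stretching
        3-2-2 Fast ROI Classification Algorithm
        3-2-3 Correcting Misclassified Macroblocks
      3-3 Video Resizing Method
        3-3-1 ROI within a Group of Pictures
        3-3-2 ROI Shape Correction
        3-3-3 Video Resizing
    Chapter 4 Depth Image Compression and Multiview Image Prediction
      4-1 Fast Intra-Prediction Mode Decision
        4-1-1 Intra-Prediction Mode Selection
        4-1-2 Intra 4×4 Mode Prediction
        4-1-3 Intra 16×16 Mode Prediction
        4-1-4 Intra 8×8 Mode Prediction
      4-2 Similarity of Intra-Prediction Modes
      4-3 Inter-view Prediction
        4-3-1 Derivation of the Mathematical Model
        4-3-2 Estimation of Unknown Coefficients
        4-3-3 Inter-view Prediction Results
        4-3-4 Coding of the Predicted Image
    Chapter 5 System Simulation and Analysis of Experimental Results
      5-1 ROI Detection Results and Comparison
      5-2 Analysis and Comparison of Resized Frames
        5-2-1 Arbitrary Resizing Results
        5-2-2 Presentation of Consecutive Frames
        5-2-3 Comparison of Different Algorithms
        5-2-4 Algorithm Speed Analysis
      5-3 Fast Intra Mode Prediction Results and Analysis
        5-3-1 H.264/AVC Codec Settings
        5-3-2 Experimental Results of Intra-Prediction Mode Selection
        5-3-3 Experimental Results of Fast Intra-Prediction Mode Decision
        5-3-4 Experimental Results Using Coded Depth Image Information
      5-4 Inter-view Prediction Results and Analysis
    Chapter 6 Conclusions and Future Work
      6-1 Conclusions
      6-2 Future Work
    References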


    Full text publicly available on campus from: 2013-09-02
    Full text publicly available off campus from: 2013-09-02