
Graduate Student: Shih, En-Shi (史恩希)
Thesis Title: Texture-based Depth Frame Interpolation for Precise 2D to 3D Conversion
Advisors: Yang, Jar-Ferr (楊家輝); Luo, Ching-Hsing (羅錦興)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2018
Graduation Academic Year: 106 (ROC calendar)
Language: English
Pages: 58
Chinese Keywords (translated): 2D-to-3D video conversion, depth estimation, depth interpolation, superpixel
English Keywords: 2D-to-3D video, Depth Estimation, Depth Interpolation, SLIC
  • Abstract (translated from Chinese): With the development of 3D technology, the demand for 3D films has grown rapidly, and with the advent of glasses-free (autostereoscopic) 3D displays, people can comfortably watch 3D multiview videos without wearing glasses. 3D video is represented by color images together with depth maps; in general, 3D multiview content can be effectively generated by depth-image-based rendering (DIBR). To present 3D stereoscopic information, the most important task is to generate the depth map. In recent years, many depth estimation methods using stereo image pairs have been proposed. However, traditional depth map generation methods for single-view images are subject to many constraints and produce depth maps of poor quality. This thesis presents a precise video depth map interpolation algorithm that generates the depth maps of unknown non-keyframes from the original color frames and the known depth maps of keyframes. The proposed depth map generation system comprises texture-based depth estimation, error compensation, noise elimination, and forward/backward depth map fusion. First, the depth value of each pixel of the target frame is predicted: points similar to those in the target frame are found by analyzing the color information of the reference frame, a cost function is computed in two color spaces, and the depth value of the minimum-cost reference point is assigned to the target point. Once the whole image has been processed, an initial depth map is produced, and depth compensation is applied to the erroneous points it contains. Next, to make the depth map cleaner, simple linear iterative clustering (SLIC) is adopted to remove the noise inside each superpixel. Finally, the bidirectionally interpolated depth maps are fused with linear weights. The experimental results show the effect of each processing step, the effect of the fusion, the edge alignment, and a comparison with methods proposed by others; analysis of the experimental data shows that the system can successfully estimate the depth maps of video sequences.
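A minimal sketch of the per-pixel matching step described above: for each target pixel, reference pixels inside a small search window are scored by a color cost, and the depth of the cheapest match is copied over. The choice of RGB and YCbCr as the two color spaces, the window radius, and the equal cost weighting are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def rgb_to_ycbcr(px):
    """Convert one RGB pixel (floats in [0, 255]) to YCbCr (BT.601 full range)."""
    r, g, b = px
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128.0
    cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128.0
    return np.array([y, cb, cr])

def estimate_depth(target_rgb, ref_rgb, ref_depth, radius=2, alpha=0.5):
    """Assign each target pixel the depth of the minimum-cost reference pixel
    inside a (2*radius+1)^2 search window.  The cost blends the RGB distance
    and the YCbCr distance with weight alpha (an illustrative choice)."""
    h, w, _ = target_rgb.shape
    depth = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            best_cost, best_d = np.inf, 0.0
            t_rgb = target_rgb[y, x].astype(float)
            t_ycc = rgb_to_ycbcr(t_rgb)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ry, rx = y + dy, x + dx
                    if not (0 <= ry < h and 0 <= rx < w):
                        continue
                    r_rgb = ref_rgb[ry, rx].astype(float)
                    cost = (alpha * np.linalg.norm(t_rgb - r_rgb)
                            + (1 - alpha) * np.linalg.norm(t_ycc - rgb_to_ycbcr(r_rgb)))
                    if cost < best_cost:
                        best_cost, best_d = cost, ref_depth[ry, rx]
            depth[y, x] = best_d
    return depth
```

Pixels whose minimum cost is still large would be flagged as unmatched "error pixels" and handed to the compensation step; the threshold for that flag is another parameter the record does not state.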

    With the development of 3D technology, the demand for 3D films is growing rapidly. With the availability of naked-eye (autostereoscopic) 3D displays, people can comfortably watch 3D multiview movies without glasses. In general, 3D videos are represented by texture frames and their depth frames, and 3D multiview content can be effectively produced by depth-image-based rendering (DIBR). In recent years, many methods have been proposed to estimate the depth map from stereo image pairs. When only mono-view images are available, however, traditional depth map generation methods are limited to specific scene types and yield depth maps of poor quality. In this thesis, we propose a precise depth map interpolation algorithm that estimates the depth maps of unknown non-keyframes from the color frames and pairs of depth keyframes. The proposed depth frame interpolation system consists of texture-based depth estimation, error compensation, noise elimination, and forward/backward depth map merging steps. First, we predict the depth value of each pixel of the target frame by searching for similar pixels in the reference frame. The cost function is computed in two color spaces, and the target pixel is assigned the depth value of the reference pixel with the minimum cost. After the whole image has been processed, an initial depth map is generated, which contains some error pixels that could not be matched. These error pixels are compensated by referring to the depth values of their eight neighboring pixels. To make the depth map smoother, we adopt simple linear iterative clustering (SLIC), which segments the image into superpixels; the internal noise within each superpixel can then be detected and removed. Finally, the bidirectional depth maps are merged with linear weights. The experimental results show the effect of each step, the effect of the fusion, the edge alignment, and a comparison with previously proposed methods. Analysis of the experimental results shows that the proposed system can successfully estimate the depth maps of video sequences.
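The last two stages of the pipeline (filling unmatched pixels from their eight neighbors, then merging the forward and backward interpolated maps) can be sketched as follows. The mean-of-valid-neighbors fill and the linear weight based on temporal distance to the keyframes are illustrative assumptions, since the abstract does not give the exact formulas.

```python
import numpy as np

def compensate_errors(depth, valid):
    """Fill each unmatched (invalid) pixel with the mean depth of its valid
    8-neighbors, as a simple stand-in for the error compensation step."""
    h, w = depth.shape
    out = depth.copy()
    for y in range(h):
        for x in range(w):
            if valid[y, x]:
                continue
            vals = [depth[ny, nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))
                    if (ny, nx) != (y, x) and valid[ny, nx]]
            if vals:
                out[y, x] = sum(vals) / len(vals)
    return out

def merge_bidirectional(depth_fwd, depth_bwd, t, t_prev, t_next):
    """Blend the forward- and backward-interpolated depth maps for frame t,
    which lies between keyframes t_prev and t_next, using a linear weight
    given by the normalized temporal distance to each keyframe."""
    w = (t_next - t) / float(t_next - t_prev)  # near t_prev -> trust the forward map more
    return w * depth_fwd + (1.0 - w) * depth_bwd
```

For the midpoint frame both maps contribute equally; at a keyframe the merge degenerates to the map interpolated from that keyframe alone.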

    CONTENTS
    摘要 (Abstract in Chinese)
    ABSTRACT
    CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    CHAPTER 1 INTRODUCTION
      1.1 Research Background
        1.1.1 Concept of Stereoscopic Visualization
        1.1.2 Stereoscopic Displays
      1.2 Motivations and Purposes
      1.3 Thesis Organization
    CHAPTER 2 RELATED WORKS
      2.1 Depth Estimation with a Single Monocular Image
      2.2 Other Automatic Methods for Depth Generation
      2.3 Semi-automatic Methods for Depth Estimation
        2.3.1 Lin's Method
        2.3.2 Wang's Method
      2.4 SLIC
    CHAPTER 3 THE PROPOSED TEXTURE-BASED DEPTH INTERPOLATION SYSTEM
      3.1 Overview
      3.2 Depth Estimation Unit
        3.2.1 Depth Region Check
        3.2.2 Texture Comparison
        3.2.3 Similar Pixel Searching
        3.2.4 Candidate Pixel Selection
      3.3 Error Compensation
      3.4 Noise Elimination
      3.5 Depth Merging
    CHAPTER 4 EXPERIMENTAL RESULTS
      4.1 The Subjective Results
      4.2 The Objective Performance Results
    CHAPTER 5 CONCLUSIONS
    CHAPTER 6 FUTURE WORKS
    REFERENCES

    [1] Ogle, Kenneth N. "Researches in binocular vision." (1950).
    [2] Fehn, Christoph, René De La Barré, and Siegmund Pastoor. "Interactive
    3-DTV-concepts and key technologies." Proceedings of the IEEE 94.3 (2006):
    524-538.
    [3] Benzie, Philip, et al. "A survey of 3DTV displays: techniques and technologies."
    IEEE Transactions on Circuits and Systems for Video Technology 17.11 (2007):
    1647-1658.
    [4] Izmantoko, Y. S., Andriyan Bayu Suksmono, and T. L. Mengko. "Implementation
    of anaglyph method for stereo microscope image display." Electrical Engineering
    and Informatics (ICEEI), 2011 International Conference on. IEEE, 2011.
    [5] Kejian, Shi, and Wang Fei. "The development of stereoscopic display
    technology." Advanced Computer Theory and Engineering (ICACTE), 2010 3rd
    International Conference on. Vol. 4. IEEE, 2010.
    [6] Lee, Hyo Jin, et al. "8.2: A High Resolution Autostereoscopic Display Employing
    a Time Division Parallax Barrier." SID Symposium Digest of Technical Papers.
    Vol. 37. No. 1. Blackwell Publishing Ltd, 2006.
    [7] Dekker, T., et al. "2D/3D switchable displays." Liquid Crystal Materials, Devices,
    and Applications XI. Vol. 6135. International Society for Optics and Photonics,
    2006.
    [8] Fehn, Christoph. "Depth-image-based rendering (DIBR), compression, and
    transmission for a new approach on 3D-TV." Stereoscopic Displays and Virtual
    Reality Systems XI. Vol. 5291. International Society for Optics and Photonics,
    2004.
    [9] Dong, Hao, et al. "An automatic depth map generation method by image
    classification." Consumer Electronics (ICCE), 2015 IEEE International
    Conference on. IEEE, 2015.
    [10] Huang, Yea-Shuan, Fang-Hsuan Cheng, and Yun-Hui Liang. "Creating depth map
    from 2D scene classification." Innovative Computing Information and Control,
    2008. ICICIC'08. 3rd International Conference on. IEEE, 2008.
    [11] Jung, Yong Ju, et al. "A novel 2D-to-3D conversion technique based on relative
    height-depth cue." Stereoscopic Displays and Applications XX. Vol. 7237.
    International Society for Optics and Photonics, 2009.
    [12] Chou, Chien-Hsing, Yu-Xiang Zhao, and Hsien-Pang Tai. "Vanishing-Point
    Detection Based on a Fuzzy Clustering Algorithm and New Clustering Validity
    Measure." 淡江理工學刊 18.2 (2015): 105-116.
    [13] Basu, Sugato, Arindam Banerjee, and Raymond Mooney. "Semi-supervised
    clustering by seeding." Proceedings of the 19th International Conference on
    Machine Learning (ICML-2002), 2002.
    [14] Shi, Jianbo, and Jitendra Malik. "Normalized cuts and image segmentation." IEEE
    Transactions on Pattern Analysis and Machine Intelligence 22.8 (2000): 888-905.
    [15] Kanade, Takeo, and Masatoshi Okutomi. "A stereo matching algorithm with an
    adaptive window: Theory and experiment." IEEE Transactions on Pattern Analysis
    and Machine Intelligence 16.9 (1994): 920-932.
    [16] Geng, Jason. "Structured-light 3D surface imaging: a tutorial." Advances in Optics
    and Photonics 3.2 (2011): 128-160.
    [17] Gokturk, S. Burak, Hakan Yalcin, and Cyrus Bamji. "A time-of-flight depth
    sensor-system description, issues and solutions." Computer Vision and Pattern
    Recognition Workshop, 2004. CVPRW'04. Conference on. IEEE, 2004.
    [18] Barron, John L., et al. "Performance of optical flow techniques." Computer Vision
    and Pattern Recognition, 1992. Proceedings CVPR'92., 1992 IEEE Computer
    Society Conference on. IEEE, 1992.
    [19] Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional
    networks for semantic segmentation." Proceedings of the IEEE conference on
    computer vision and pattern recognition. 2015.
    [20] Lin, Guo-Shiang, Jian-Fa Huang, and Wen-Nung Lie. "Semi-automatic 2D-to-3D
    video conversion based on depth propagation from key-frames." Image
    Processing (ICIP), 2013 20th IEEE International Conference on. IEEE, 2013.
    [21] Wang, Hung-Ming, Chun-Hao Huang, and Jar-Ferr Yang. "Block-based depth
    maps interpolation for efficient multiview content generation." IEEE Transactions
    on Circuits and Systems for Video Technology 21.12 (2011): 1847-1858.
    [22] Achanta, Radhakrishna, et al. "SLIC superpixels compared to state-of-the-art
    superpixel methods." IEEE Transactions on Pattern Analysis and Machine
    Intelligence 34.11 (2012): 2274-2282.

    Full-text availability: on campus, open from 2023-07-16; off campus, not available.
    The electronic thesis has not been authorized for public release; for the print copy, consult the library catalog.