
Author: 李柏勳 (Li, Bo-Syun)
Title: 基於移動分佈之內容適應性增強深度圖演算法 (Content-adaptive Depth Map Enhancement Algorithm Based on Motion Distribution)
Advisor: 李國君 (Lee, Gwo Giun)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2014
Graduation Academic Year: 102 (ROC calendar)
Language: English
Pages: 149
Keywords: Depth map enhancement, motion estimation, video segmentation, camera motion scenario, co-occurrence matrix
Abstract:

    This thesis proposes a motion-based, content-adaptive depth map enhancement algorithm that improves the quality of depth maps and reduces artifacts in synthesized virtual views. In a given camera-motion scenario, a corresponding depth cue can be extracted from the motion distribution. When the camera pans horizontally, objects nearer to the camera exhibit larger motion and more distant objects exhibit smaller motion, so the relative distances between the camera and the objects can be inferred from the motion distribution. Moreover, in both camera-fixed and camera-panning scenarios, the distances between the camera and the various parts of a single object are very similar and consistent, so the depth values of an object should not vary drastically. In addition, this thesis proposes a bi-directional motion-compensated infinite impulse response (IIR) depth low-pass filter to enhance the temporal consistency of depth maps. The contribution of this thesis lies in using these depth cues and the motion distribution to improve the stability and consistency of depth maps in both the spatial and temporal domains. Experimental results show that, compared with views synthesized from the original depth maps and from related depth map enhancement algorithms, the proposed method yields better synthesized virtual views in both objective and subjective evaluations.
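The panning-scenario depth cue described above (larger motion implies a nearer object) can be illustrated with a minimal sketch. This is not the thesis's actual formulation: the function name, the normalization, and the assumption that per-block motion vectors are already available are all illustrative.

```python
import numpy as np

def depth_from_motion(mv_x, mv_y, max_depth=255):
    """Map per-block motion magnitude to a coarse depth value.

    Under horizontal camera panning, blocks that move more are assumed
    to be closer to the camera. Depth uses the 0-255 convention in
    which 255 denotes the nearest point.
    """
    mag = np.hypot(mv_x, mv_y)            # per-block motion magnitude
    rng = mag.max() - mag.min()
    if rng == 0:                          # no motion spread: flat depth
        return np.zeros_like(mag)
    norm = (mag - mag.min()) / rng        # normalize magnitudes to [0, 1]
    return np.round(norm * max_depth)     # larger motion -> larger (nearer) depth
```

In practice the thesis combines such a cue with segmentation and scenario detection; this sketch only shows the monotone motion-to-depth mapping itself.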

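The bi-directional IIR low-pass filtering idea in the abstract can be sketched as a forward recursion plus a backward recursion whose outputs are averaged. This is an assumption-laden simplification, not the thesis's exact filter: the weight `alpha` is arbitrary, and the depth maps are assumed to be already motion-compensated (warped) to a common grid, so only the temporal recursion remains.

```python
import numpy as np

def bidir_iir_depth_filter(depth_seq, alpha=0.6):
    """Bi-directional first-order IIR low-pass over a depth sequence.

    depth_seq: array-like of depth maps with shape (T, H, W), assumed
    already motion-compensated to a common grid.

    Runs a causal forward pass and an anti-causal backward pass, then
    averages them; the backward pass cancels the phase lag that a
    single-direction IIR pass would introduce.
    """
    seq = np.asarray(depth_seq, dtype=float)
    fwd = np.empty_like(seq)
    bwd = np.empty_like(seq)
    fwd[0] = seq[0]
    for t in range(1, len(seq)):               # forward IIR recursion
        fwd[t] = alpha * seq[t] + (1 - alpha) * fwd[t - 1]
    bwd[-1] = seq[-1]
    for t in range(len(seq) - 2, -1, -1):      # backward IIR recursion
        bwd[t] = alpha * seq[t] + (1 - alpha) * bwd[t + 1]
    return 0.5 * (fwd + bwd)
```

A constant depth sequence passes through unchanged, while an isolated single-frame depth spike is attenuated, which is the temporal-consistency behavior the abstract describes.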
Table of Contents:

    Abstract (Chinese) / Abstract / Acknowledgments / Table of Contents / List of Tables / List of Figures
    Chapter 1 Introduction
        1.1 Introduction
        1.2 Motivation
        1.3 Organization of this Thesis
    Chapter 2 Background Information
        2.1 Depth Generation
        2.2 Stereoscopic View Synthesis
    Chapter 3 Survey of Related Works in the Literature
        3.1 Statistics Features
            3.1.1 First-order Statistics Features
            3.1.2 Second-order Statistics Features
        3.2 Classification and Clustering
            3.2.1 K-means Clustering
            3.2.2 K-medoids Clustering
        3.3 Depth Cues from Monocular Sequences
            3.3.1 Depth Cue from the Geometry Perspective
            3.3.2 Depth Cue from Focus or Defocus Analysis
            3.3.3 Depth Cue from Motion Information
        3.4 Depth Map Enhancement
            3.4.1 Spatial Domain
            3.4.2 Temporal Domain
            3.4.3 Inter-view Domain
    Chapter 4 Proposed Algorithms
        4.1 Block Diagram
        4.2 Moving Camera Scenario Detection
            4.2.1 Adaptive Combined Motion Vectors
            4.2.2 Co-occurrence Matrix of Motion Vectors
            4.2.3 Camera Motion Scenario Detection Based on Co-occurrence Matrix Patterns
            4.2.4 Experimental Results and Comparison
        4.3 Foreground and Background Segmentation
            4.3.1 Motion-based Coarse Foreground and Background Segmentation
            4.3.2 Color-based K-means Clustering
            4.3.3 Refined Foreground and Background Segmentation
                4.3.3.1 Merging Process
                4.3.3.2 Shadow Removal Process
        4.4 Depth Map Enhancement
            4.4.1 Unnatural Depth Removal
            4.4.2 Check Depth Level
            4.4.3 Scale Change of Objects
            4.4.4 Bi-directional Motion-compensated Infinite Impulse Response Depth Low-pass Filter
                4.4.4.1 Depth Value Extraction Combined with Color Information
                4.4.4.2 Checking the Accuracy of the Motion Vector
            4.4.5 Comparison of Bi-directional and Uni-directional Motion-compensated Infinite Impulse Response Depth Low-pass Filters
            4.4.6 Joint Bilateral Filter
    Chapter 5 Experimental Results and Comparison
        5.1 Subjective Comparison
        5.2 Objective Comparison
    Chapter 6 Conclusions and Future Work
        6.1 Conclusions
        6.2 Future Work
    References


    Full-text availability: on campus from 2024-12-31; off campus from 2024-12-31.