| Graduate student: | 黃俊豪 Huang, Chun-Hao |
|---|---|
| Thesis title: | 由既存關鍵影格生成深度圖之內插演算法 Depth Map Interpolation from Existing Pairs of Keyframes and Depth Maps for 3D Video Generation |
| Advisor: | 楊家輝 Yang, Jar-Ferr |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2010 |
| Graduation academic year: | 98 |
| Language: | English |
| Number of pages: | 82 |
| Chinese keywords: | 2D/3D轉換、視差、深度影像內插、貝氏分類器 |
| Foreign-language keywords: | 2D-to-3D conversion, Disparity, Depth interpolation, Bayes classifier |
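As autostereoscopic LCD technology matures, viewers will be able to enjoy stereoscopic images on a screen without wearing 3D glasses, and a growing body of research addresses how to present video in 3D. The most common approach first estimates depth and then synthesizes disparity from that depth to produce the corresponding left-eye and right-eye images, which create a sense of depth when viewed. However, conventional single-image depth estimation cannot keep up with high-frame-rate video. This thesis presents a fast algorithm for generating video depth maps: given the depth maps of the first and last frames of a video segment, it uses motion estimation and compensation, a Bayes classifier, and a depth adjustment mechanism to generate the depth maps of the intermediate frames efficiently, greatly reducing computation time.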
With recent improvements in 3D display technology, viewers can now perceive 3D effects on 3D display systems without wearing 3D glasses. However, because 3D content is scarce, 2D-to-3D conversion techniques are becoming increasingly important. For 2D-to-3D video conversion, the most common approach is to first generate a corresponding depth video from the original 2D video and then synthesize the desired 3D video from the generated depth video. The key problem in 2D-to-3D conversion is how to generate the corresponding depth video precisely and efficiently. This thesis proposes an efficient depth map interpolation method that works from existing pairs of key frames and depth maps. The proposed method comprises forward/backward block-based motion estimation and compensation, block refinement, object refinement, and a frame selection mechanism. Experimental results show that the proposed method generates depth maps successfully and effectively, which should be of great help for 3DTV content generation.
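The forward/backward block-based motion estimation and compensation step named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a full-search block matcher with a sum-of-absolute-differences (SAD) cost, picks per block whichever keyframe matches better, and omits block refinement, object refinement, and frame selection. All function names, the block size, and the search range are hypothetical, and frame dimensions are assumed to be multiples of the block size.

```python
def sad(a, b, ay, ax, by, bx, size):
    """Sum of absolute differences between a size x size block of a and of b."""
    return sum(abs(a[ay + i][ax + j] - b[by + i][bx + j])
               for i in range(size) for j in range(size))


def best_match(target, key, ty, tx, size, search):
    """Full search in `key` around (ty, tx); returns (cost, ky, kx)."""
    height, width = len(key), len(key[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ky, kx = ty + dy, tx + dx
            # Skip candidate blocks that fall outside the keyframe.
            if 0 <= ky <= height - size and 0 <= kx <= width - size:
                cost = sad(target, key, ty, tx, ky, kx, size)
                if best is None or cost < best[0]:
                    best = (cost, ky, kx)
    return best


def interpolate_depth(target, key_first, depth_first, key_last, depth_last,
                      size=2, search=2):
    """Estimate the depth map of `target` from two keyframe/depth pairs."""
    height, width = len(target), len(target[0])
    depth = [[0] * width for _ in range(height)]
    for ty in range(0, height, size):
        for tx in range(0, width, size):
            forward = best_match(target, key_first, ty, tx, size, search)
            backward = best_match(target, key_last, ty, tx, size, search)
            # Assumption: copy depth from whichever keyframe matches better.
            if forward[0] <= backward[0]:
                _, ky, kx = forward
                source = depth_first
            else:
                _, ky, kx = backward
                source = depth_last
            for i in range(size):
                for j in range(size):
                    depth[ty + i][tx + j] = source[ky + i][kx + j]
    return depth


def make_frame(object_col, inside, outside, height=4, width=8):
    """Toy frame: a 2x2 square at rows 0-1, columns object_col..object_col+1."""
    return [[inside if row < 2 and object_col <= col < object_col + 2 else outside
             for col in range(width)] for row in range(height)]


# A bright square moves right by two pixels per frame; depth follows the square.
key_first, depth_first = make_frame(0, 200, 10), make_frame(0, 200, 50)
key_last, depth_last = make_frame(4, 200, 10), make_frame(4, 200, 50)
target = make_frame(2, 200, 10)
result = interpolate_depth(target, key_first, depth_first, key_last, depth_last)
```

In this toy case the interpolated depth map places the near (value 200) square at columns 2 to 3, the object's position in the intermediate frame. Searching both keyframes lets a block that is missing or occluded in one keyframe still borrow depth from the other.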
On campus: publicly available from 2015-08-05