Graduate student: Lin, Chi-Kun (林基焜)
Thesis title: 3D Visualization and Super Resolution Technologies (三維視覺與超解析度技術)
Advisor: Yang, Jar-Ferr (楊家輝)
Degree: Doctor
Department: College of Electrical Engineering and Computer Science, Institute of Computer & Communication Engineering
Year of publication: 2015
Academic year of graduation: 103
Language: English
Number of pages: 65
Keywords: 3D reconstruction, Zero-displacement detection, Noise removal, Free viewpoint, Disparity and motion vector diffusion, Super resolution, Two-pass dominated-edge interpolation, Adaptive enhancement, Adaptive dithering
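  • This dissertation studies processing and reconstruction technologies for 3D visualization and super resolution display. To meet the current demand for multi-dimensional visual content and to support super resolution (4Kx2K) displays, we developed three technologies: automatic 3D reconstruction, free viewpoint television (FTV), and iterative enhanced resolution enlargement, with the aim of faster processing and better display quality.
    First, we propose two ways of generating multi-dimensional video. The first is a 3D reconstruction technology that lets the user, with a handheld device, rotate an object freely so that its shape can be reconstructed. Its core algorithms are zero-displacement detection and nearest-neighbors noise removal. While the user rotates the handheld device, zero-displacement detection pairs feature points between consecutive images and automatically removes the background of the capturing environment. To reduce capture errors caused by noise, nearest-neighbors noise removal eliminates distanced noise, large-mean-shift noise, and near-neighbor noise. Simulation results show that, without prior knowledge of the camera's intrinsic and extrinsic parameters, the proposed enhanced 3D reconstruction technology reconstructs 3D images quickly and precisely.
    The second multi-dimensional image generation technology is built on free viewpoint television decoding. Free viewpoint television (FTV) is a service that lets viewers freely change the viewing angle while watching TV programs, and it requires a multi-view video codec such as H.264/MVC. Based on the H.264/MVC bit stream, the proposed system can produce video at any viewing angle. Because the number of decoded viewpoints is limited, the decoded disparity vectors and motion vectors are diffused to produce smooth disparity fields for virtual view reconstruction. The decoded residue data under motion compensation are used as the matching criterion. The proposed system not only greatly reduces the computational burden of creating views but also improves the synthesized image quality, owing to the quarter-pixel precision of H.264.
    Finally, we propose an iterative enhanced super resolution technology based on two-pass edge-dominated interpolation, adaptive enhancement, and adaptive dithering. Super resolution upsamples video to a higher resolution. The two-pass edge-dominated interpolation is regular and simple and sharpens image quality, while adaptive enhancement and adaptive dithering provide high-frequency compensation. Experimental results show that the proposed technology achieves a better signal-to-noise ratio and better video quality: for reconstructed super resolution images, the average peak signal-to-noise ratio (PSNR) reaches 28.748 dB and the structural similarity (SSIM) reaches 0.917611. Simulations also show that the technology not only outperforms other existing methods but also has far lower computational complexity than other high-quality super resolution techniques.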

    To support super resolution (4Kx2K) displays, this dissertation, entitled “Processing and Reconstruction Technologies for Super Resolution Visualization”, develops three visualization technologies: automatic 3D reconstruction, free viewpoint television (FTV), and iterative super resolution. The aim is fast implementation and better visual quality.
    First, we propose an automatic 3D reconstruction technology in which the user arbitrarily rotates an object to expose all of its surfaces to a fixed-position camera, so that the reconstruction can be achieved from freely captured images of a handheld object. Two key algorithms, zero-displacement detection and nearest-neighbors noise removal, are proposed to make this practical. During the rotation, zero-displacement detection on the paired feature points of consecutive images effectively removes the background of the object, so the capturing environment need not be specified. To reduce the noise induced by the user’s moving hands and by capturing errors, distanced noise removal and large-mean-shift noise removal, both based on the concept of nearest neighbors, are further proposed to enhance the 3D reconstruction. Simulation results show that, without pre-estimation of the camera's intrinsic and extrinsic parameters, the proposed system with zero-displacement detection and nearest-neighbors noise removal can reconstruct the 3D object easily and precisely.
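As a rough illustration of the two ideas named above, the following Python sketch assumes that a fixed camera sees background features at nearly the same pixel position in consecutive frames, so matched pairs with near-zero displacement can be discarded, and that isolated 3D points lie unusually far from their nearest neighbours. The function names, the `eps`, `k`, and `ratio` parameters, and the median-based threshold are hypothetical choices for this sketch; they are not taken from the dissertation.

```python
import math


def remove_static_matches(matches, eps=1.0):
    """Keep matched feature pairs that moved more than eps pixels.

    matches: list of ((x0, y0), (x1, y1)) pairs from consecutive frames.
    Assumption: with a fixed camera, background features barely move.
    """
    moving = []
    for (x0, y0), (x1, y1) in matches:
        if math.hypot(x1 - x0, y1 - y0) > eps:
            moving.append(((x0, y0), (x1, y1)))
    return moving


def remove_distant_points(points, k=3, ratio=2.0):
    """Drop 3D points that lie far from their k nearest neighbours.

    A point is scored by its mean distance to its k nearest neighbours
    (brute force) and dropped when the score exceeds ratio times the
    median score. This is a generic outlier filter, used as a stand-in.
    """
    if len(points) <= 1:
        return list(points)
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest))
    median = sorted(scores)[len(scores) // 2]
    return [p for p, s in zip(points, scores) if s <= ratio * median]
```

For example, a match whose endpoints coincide is treated as background and removed, and a point far from a compact cluster is filtered out of the point cloud. A real system would replace the brute-force neighbour search with a spatial index such as a k-d tree.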
    Free viewpoint television (FTV) is a service that allows a viewer to change view angles freely while watching TV programs. FTV requires a multi-view video coder, such as H.264/MVC. In this thesis, based on the H.264/MVC bit stream, the FTV system can produce videos as perceived from any view angle. Because only a limited number of viewpoints are decoded, the decoded disparity vectors and motion vectors are diffused to produce smooth disparity fields for virtual view reconstruction. Decoded residue data under motion compensation are used as the matching criterion. The proposed system not only greatly reduces the computational burden of creating FTV but also improves the synthesized viewing quality, owing to the quarter-pixel precision of H.264.
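One simple way to picture vector diffusion is sketched below. It assumes a per-block disparity field in which blocks without a decoded disparity vector are marked `None`, and it fills them by repeatedly averaging the 4-connected neighbours that already hold a value. The function name, the averaging rule, and the iteration count are assumptions made for illustration; the dissertation's actual diffusion scheme and its residue-based matching criterion are not reproduced here.

```python
def diffuse_disparity(field, iterations=50):
    """Fill None entries of a 2D disparity field by neighbour averaging.

    field: list of rows; each entry is a float disparity or None.
    Entries that were known at the start are kept fixed.
    """
    rows, cols = len(field), len(field[0])
    known = [[v is not None for v in row] for row in field]
    cur = [list(row) for row in field]
    for _ in range(iterations):
        nxt = [list(row) for row in cur]
        for r in range(rows):
            for c in range(cols):
                if known[r][c]:
                    continue
                # Collect the 4-connected neighbours that have a value.
                nbrs = [
                    cur[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols and cur[rr][cc] is not None
                ]
                if nbrs:
                    nxt[r][c] = sum(nbrs) / len(nbrs)
        cur = nxt
    return cur
```

With decoded values at both ends of a row and a gap between them, the gap converges to a smooth transition between the two known disparities.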
    Finally, an iterative enhanced super resolution (IESR) system is proposed, based on two-pass edge-dominated interpolation, adaptive enhancement, and adaptive dithering techniques. The two-pass edge-dominated interpolation, which uses a regular and simple kernel, sharpens visual quality; the adaptive enhancement provides high-frequency compensation; and the adaptive dithering gives the result a more natural appearance. Together these allow the proposed IESR system to achieve better PSNR and better visual quality. Experimental results indicate that the proposed IESR system, which reaches an average PSNR of up to 30.266 dB and an average SSIM of up to 0.920687, performs better than the other existing methods. Simulations also show that the proposed IESR system has lower computational complexity than methods that achieve similar visual quality.
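For reference, PSNR for 8-bit images is conventionally defined as 10·log10(255² / MSE), where MSE is the mean squared error between the reference and the reconstructed image. The Python sketch below computes it for grayscale images stored as nested lists. The function name `psnr` and the data layout are illustrative assumptions; the record does not state how the dissertation averaged PSNR across its test images.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size images.

    Images are lists of rows of pixel values. Returns math.inf when the
    images are identical (zero mean squared error).
    """
    total = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for a, b in zip(ref_row, rec_row):
            total += (a - b) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)
```

A higher value means the upsampled image is closer to the reference. Because PSNR is logarithmic in the error, a gain of about 3 dB corresponds to halving the mean squared error.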

    Abstract (in Chinese)
    Abstract
    1 Introduction
      1.1 Motivation
      1.2 Background Discussion
      1.3 Organization
    2 An Automatic Image-based 3D Reconstruction System
      2.1 Computation of Point Cloud
      2.2 Nearest-neighbors Noisy Removals
      2.3 Experimental Results
    3 Free Viewpoint Video Generation
      3.1 Overview
      3.2 H.264/MVC Based Free ViewPoint Video
      3.3 Experiments
    4 An Iterative Enhanced Super Resolution System
      4.1 Overview
      4.2 Two-pass Edge-dominated Interpolation
      4.3 Image Enhancement and Dithering Algorithms
      4.4 Simulation Results
    5 Conclusion
      5.1 Conclusion
      5.2 Future Work
    References

    Full text availability: on campus, open to the public from 2020-07-20; off campus, open to the public from 2020-07-20