
Author: Lee, Arboo Kuan-Ting (李冠霆)
Title: Design and FPGA Realization of 3D Imaging Systems with Nearly 2D Compatible Formats (極兼容2D視訊訊號之立體影像傳輸設計及FPGA實現)
Advisor: Yang, Jar-Ferr (楊家輝)
Degree: Doctor
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2022
Academic Year of Graduation: 110 (2021-2022)
Language: English
Pages: 116
Keywords: 3D video, 3D broadcasting, texture and depth packing formats, DIBR, multiview rendering, deep convolutional neural network, image-guided depth enhancement, VLSI design
Abstract (Chinese):
    In recent years, three-dimensional (3D) visual technology has gradually matured, and stereoscopic presentation in cinemas brings audiences a more immersive viewing experience, yet the technology has not spread effectively at the application level. To broaden 3D viewing on the client side while remaining compatible with the needs of every user group, this dissertation completes the design and FPGA realization of a stereoscopic image transmission system that is nearly compatible with 2D video signals, advancing the application of stereoscopic video systems at every level.
    First, in the 3D video broadcasting part, this dissertation proposes a YCbCr color depth packing method based on the centralized texture-depth packing (CTDP) formats to provide effective 3D video services. With color texture and depth information, 3D video equipped with a depth-image-based rendering engine can easily support all glasses-type and glasses-free 3D displays. Experimental results show that, compared with the 2D-plus-depth packing (2DDP) format, the CTDP formats with the YCbCr color depth packing method achieve better texture and depth quality, both objectively and subjectively. The proposed formats help deliver 3D stereoscopic video simply and effectively over current 2D broadcasting systems.
    Second, in the depth-map-based multi-view generation part, this dissertation proposes a precise depth-image-based rendering system that adopts weighted fractional warping and two-pass hole filling. Weighted fractional warping warps pixels onto precise fractional positions to reduce the rounding errors introduced when fractional disparities are quantized. Two-pass hole filling exploits the similarity of background color edges to fill in the missing information and enhance the quality of the virtual views.
    Next, in the depth map refinement part, this dissertation proposes an image-guided network for depth edge enhancement. For a high-quality 3D visual experience, applying deep learning to enhance high-precision depth maps has shown promising improvements. The proposed network contains depth and image branches, combining a new set of features from the image branch with the feature maps of the depth branch. Experimental results show that the proposed system achieves better depth correction quality than existing state-of-the-art networks, and an ablation study shows that the proposed loss functions exploit image information to improve depth map accuracy effectively.
    Finally, by combining the CTDP depacking system and the multi-view generation system, a stereoscopic image transmission design that is nearly compatible with 2D video signals is completed and realized on an FPGA. By switching the output signal, the system can present different display formats and drive different 2D or 3D displays. It operates at frequencies up to 594 MHz, delivering 4K video in real time while maintaining image quality.

Abstract (English):
    In recent years, three-dimensional (3D) visual technology has gradually matured, and the presentation of stereoscopic images in cinemas gives audiences a more immersive viewing experience. However, the technology has not spread effectively into practical applications. To broaden 3D viewing on the client side while remaining compatible with the needs of various groups of users, this dissertation completes the design and FPGA realization of a 3D imaging system with nearly 2D-compatible formats, thereby extending the applications of 3D video systems to all levels.
    First, in the 3D video broadcasting part, this dissertation proposes a YCbCr color depth packing method based on the centralized texture-depth packing (CTDP) formats to deliver effective 3D video services. With texture and depth information, 3D videos with a depth-image-based rendering engine can easily support all glasses-type and glasses-free 3D displays. Simulations show that the CTDP formats with the YCbCr color depth packing method achieve better objective and subjective texture and depth quality than the 2D-plus-depth packing (2DDP) format. The proposed CTDP formats with the YCbCr color depth packing method can help deliver 3D videos over current 2D broadcasting systems simply and efficiently.
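    For concreteness, the numpy sketch below packs texture and depth into a single frame in the baseline 2DDP style used for comparison: half-width texture on the left, half-width depth carried as luma with neutral chroma on the right. This is an illustration only (function and variable names are ours); the actual CTDP arrangements, which centralize the texture and embed depth samples into the chroma planes, are defined in Chapter 3.

```python
import numpy as np

def pack_2d_plus_depth(ycbcr, depth):
    """Baseline 2DDP-style packing sketch (not the CTDP layout):
    texture on the left half, depth on the right half of one YCbCr frame.

    ycbcr : (H, W, 3) uint8 texture in YCbCr
    depth : (H, W)    uint8 depth map
    """
    h, w, _ = ycbcr.shape
    assert w % 2 == 0, "even width assumed for half-width subsampling"
    frame = np.empty((h, w, 3), dtype=np.uint8)
    frame[:, : w // 2, :] = ycbcr[:, ::2, :]   # half-width texture
    frame[:, w // 2 :, 0] = depth[:, ::2]      # depth rides in the Y plane
    frame[:, w // 2 :, 1:] = 128               # neutral Cb/Cr for depth half
    return frame
```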
    Second, in the depth-map-based multi-view generation part, this dissertation proposes a precise depth-image-based rendering system using weighted fractional warping and two-pass hole filling. The weighted fractional warping method warps pixels to precise fractional positions to reduce the rounding errors introduced when fractional disparities are quantized. The two-pass hole filling method uses the similarity of background color edges to fill in the missing regions and enhance the virtual view images.
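    The sketch below illustrates the weighted fractional warping idea on a single scanline (a simplified Python/numpy illustration with names of our own choosing; depth-ordered occlusion handling and the two-pass filling itself are omitted): each source pixel is splatted onto the two integer columns around its fractional target position with proximity weights, and unreached columns are reported as holes for the filling pass.

```python
import numpy as np

def fractional_warp_row(row, disparity, width):
    """Warp one scanline to a virtual view with weighted fractional splatting.

    row       : (W, C) float source colors
    disparity : (W,)   float per-pixel disparity (fractional pixels)
    width     : output scanline width
    Returns the warped row and a boolean hole mask.
    """
    acc = np.zeros((width, row.shape[1]))
    wgt = np.zeros(width)
    for x in range(row.shape[0]):
        t = x - disparity[x]                    # fractional target column
        x0 = int(np.floor(t))
        a = t - x0                              # offset toward right neighbor
        for xi, wi in ((x0, 1.0 - a), (x0 + 1, a)):
            if 0 <= xi < width:
                acc[xi] += wi * row[x]          # splat with proximity weight
                wgt[xi] += wi
    out = np.where(wgt[:, None] > 0,
                   acc / np.maximum(wgt, 1e-8)[:, None], 0.0)
    return out, wgt == 0                        # unreached columns are holes
```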
    Then, in the depth map refinement part, this dissertation proposes an image-guided network for depth edge enhancement. For a high-quality 3D visual experience, applying deep learning to enhance high-precision depth maps has shown promising improvements. The proposed network contains both depth and image branches, combining a new set of features from the image branch with those from the depth branch. Experimental results show that the proposed system achieves better depth correction performance than state-of-the-art networks. The ablation study reveals that the proposed loss functions exploit image information to enhance depth map accuracy effectively.
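    A minimal PyTorch sketch of the two-branch, feature-fusion idea follows; the channel widths, layer counts, and loss functions of the actual IGDE network differ and are specified in Chapter 5.

```python
import torch
import torch.nn as nn

class TwoBranchDepthRefiner(nn.Module):
    """Sketch of an image-guided depth refiner: image-branch guidance
    features are concatenated with depth-branch features, and a residual
    correction is predicted. Placeholder sizes, not the IGDE architecture."""
    def __init__(self, feat=32):
        super().__init__()
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True))
        self.depth_branch = nn.Sequential(
            nn.Conv2d(1, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True))
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, 1, 3, padding=1))

    def forward(self, depth, image):
        # Fuse depth features with image-guided features, then refine
        # the input depth map by a predicted residual correction.
        f = torch.cat([self.depth_branch(depth),
                       self.image_branch(image)], dim=1)
        return depth + self.fuse(f)
```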
    Finally, after combining the CTDP depacking system and the multi-view generation system, a 3D imaging transmission design with nearly 2D-compatible formats is completed and realized on an FPGA. By switching the output signal, the system can present different image display formats and operate with different 2D or 3D displays. The system supports operating frequencies up to 594 MHz and outputs 4K images in real time while maintaining image quality.
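    As a quick sanity check on the 594 MHz figure, assuming the standard CTA-861 timing for 3840x2160 at 60 Hz (4400x2250 total pixels per frame, including blanking intervals), the required pixel clock works out to exactly 594 MHz, which is why this operating frequency suffices for real-time 4K output:

```python
# Pixel clock for 4K (3840x2160) at 60 Hz with CTA-861 blanking:
# 4400 total columns x 2250 total rows x 60 frames/s = 594 MHz.
h_total, v_total, fps = 4400, 2250, 60
print(h_total * v_total * fps / 1e6, "MHz")  # -> 594.0 MHz
```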

Table of Contents:
    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1.1 Background and Motivation
      1.2 Organization of the Dissertation
    Chapter 2  Basic Concepts of 3D Video Systems
      2.1 Overview of 3D Video Packing Formats
      2.2 Basic Functions of DIBR System
        2.2.1 Depth-based Pixel Warping
        2.2.2 Hole Filling
      2.3 HDMI Interface for Hardware Design
      2.4 Overview of Depth Refinement
    Chapter 3  A YCbCr Color Depth Packing Method and Hardware Architecture
      3.1 Centralized Texture Depth Packing Formats
        3.1.1 Arrangements of CTDP Formats
        3.1.2 Details of Packing/Depacking CTDP Formats
      3.2 YCbCr Color Depth Packing Method
        3.2.1 YCbCr 4:2:0 Chroma Format
        3.2.2 YCbCr 4:2:2 Chroma Format
      3.3 Extended CTDP Formats
      3.4 Experimental Results
        3.4.1 Comparisons of Uncoded 2DDP and CTDP Formats
        3.4.2 Comparisons of Coded 2DDP and CTDP Formats
        3.4.3 Comparisons of CTDP-HEVC and 3D-HEVC
      3.5 Hardware Architectures of CTDP Depacking System
        3.5.1 Separation of Color and Depth Signals
        3.5.2 Temporary Storage of Color Signal Data
        3.5.3 Interpolation of Color Signal Data
        3.5.4 Temporary Storage of Depth Signal Data
        3.5.5 Depth Information Recovery of YCbCr Format
        3.5.6 Interpolation of Depth Signal Data
    Chapter 4  Precision Multi-view Generation System and Hardware Architecture
      4.1 Weighted Fractional Warping
      4.2 Two-pass Hole Filling
        4.2.1 Top-Down Hole Filling
        4.2.2 Bottom-Up Hole Filling
        4.2.3 Two-Pass Filling Order Map
      4.3 Hardware Architectures of the Multi-view Generation System
        4.3.1 Weighted Fractional Warping Module
        4.3.2 Two-pass Hole Filling Module
        4.3.3 Concave Adjustment Module
        4.3.4 Naked-eye TV Arrangement
      4.4 Experimental Results and Demonstration
        4.4.1 Simulation Results of Weighted Fractional Warping
        4.4.2 Simulation Results of Two-pass Hole Filling
        4.4.3 Demonstration of VLSI 3D Imaging System
    Chapter 5  An Image-Guided Network for Depth Edge Enhancement
      5.1 Image-Guided Depth Enhancement (IGDE) Network
        5.1.1 Architectures of IGDE Network
        5.1.2 Loss Functions
      5.2 Results and Discussions
        5.2.1 Visualization Performance of the Network
        5.2.2 Comparisons with Quality Measures
        5.2.3 Ablation Study
    Chapter 6  Conclusions and Future Work
    References
    Publication and Award List
    Biography


Full-text availability: on campus from 2027-08-01; off campus from 2027-08-01. The electronic thesis has not yet been authorized for public release; please consult the library catalog for the print copy.