
Student: Jhang, Yu-Jia (張育嘉)
Title: Glass Area Detection and Linear Interpolation for Indoor Depth Completion (基於玻璃區域檢測與線性插值的室內深度補全)
Advisor: Tsai, Pei-Hsuan (蔡佩璇)
Degree: Master
Department: MS Degree Program on Intelligent Technology Systems, Miin Wu School of Computing
Year of Publication: 2023
Graduation Academic Year: 111
Language: Chinese
Pages: 48
Keywords (Chinese): 深度補全、孔洞填充、圖像分割、插值、玻璃
Keywords (English): depth completion, hole filling, image segmentation, interpolation, glass
Access count: 131 views / 10 downloads
  • Commercial RGB-D cameras have limited depth accuracy under certain conditions, for example when object surfaces are glossy or transparent, the light source is too strong, or the distance is too great. Deep learning models rely on accurate data to learn patterns and features; if the sensor data contain errors or noise, the model may fail to learn and capture meaningful features.
    This study found that the ground-truth depth maps produced for the Matterport3D Dataset are not entirely correct. Using them directly for deep learning can bias the model and degrade its judgment; a robot trained on such data may fail to recognize glass objects, leading to accidents.
    The goal of this research is therefore to develop effective methods for repairing large regions of missing depth values on glass, improving the quality and usability of depth images. We propose the following:
    1. Use features from both the color image and the depth map to identify large glass regions, requiring no large annotated training set and avoiding model bias.
    2. Recast glass detection from a binary classification into a probability distribution, so the probability threshold can be adjusted flexibly for different scenes.
    3. Propose a method for filling large glass regions so that the completed depth map approaches the true values, improving depth-image quality.
    Experimental results show that combining color and depth features distinguishes glass from non-glass objects without preparing a large annotated dataset in advance. Moreover, the proposed filling method recovers depth values in the missing regions that approximate the true glass plane.
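    The probability-threshold idea in point 2 can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the probability map, threshold values, and function name are hypothetical.

```python
import numpy as np

def glass_mask(prob_map, threshold=0.5):
    """Binarize a per-pixel glass-probability map.

    Keeping the probability map and thresholding it late, instead of
    committing to a binary label early, lets the threshold be tuned
    per scene, as described above.
    """
    return prob_map >= threshold

# Toy 2x3 probability map; higher values mean "more glass-like".
probs = np.array([[0.1, 0.7, 0.9],
                  [0.2, 0.4, 0.8]])
strict = glass_mask(probs, threshold=0.8)   # only confident glass pixels
lenient = glass_mask(probs, threshold=0.3)  # include weaker candidates
```

    A stricter threshold suits safety-critical uses (e.g. robot navigation, where a false positive is cheaper than a missed pane), while a lenient one recovers more candidate glass pixels for later filling.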

    Commercial RGB-D cameras may have limited depth accuracy under certain conditions, such as glossy or transparent surfaces, strong light sources, or long distances. Training deep learning models relies heavily on accurate data for effective learning of patterns and features; if the sensor data contain errors or noise, the model may struggle to capture meaningful features and may become biased.
    The objective of this research is to develop effective methods for filling large areas of missing depth values caused by glass surfaces, thereby enhancing the quality and usability of depth images. The proposed approaches include leveraging both color and depth features to accurately identify glass areas, transforming the binary glass segmentation task into a probability distribution for flexible threshold adjustment in different scenarios, and introducing a method to repair large glass regions by approximating missing depth values to real-world values.
    Experimental results demonstrate the superior performance of the proposed methods in detecting glass objects. In comparison to prior approaches that relied solely on color features, the fusion of color and depth features substantially reduces the necessity for an extensive amount of labeled data to train the model. Furthermore, our approach effectively fills the missing regions with depth values that closely resemble real glass surfaces.
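    The interpolation step can be sketched in one dimension as follows, assuming missing depth is marked by zeros and each row is filled between its nearest valid neighbours. The thesis repairs whole detected glass regions so they approach the true glass plane; this simplified row-wise version, with a hypothetical function name and toy data, only illustrates the linear-interpolation principle.

```python
import numpy as np

def fill_rows_linear(depth):
    """Fill zero-valued (missing) depth pixels in each row by linear
    interpolation between the nearest valid pixels in that row."""
    out = depth.astype(float).copy()
    cols = np.arange(out.shape[1])
    for r in range(out.shape[0]):
        valid = out[r] > 0
        # Interpolate only if there are holes and at least two anchors.
        if 2 <= valid.sum() < out.shape[1]:
            out[r, ~valid] = np.interp(cols[~valid], cols[valid], out[r, valid])
    return out

# A surface at 2 m sloping to 5 m, with two missing pixels in between.
row = np.array([[2.0, 0.0, 0.0, 5.0]])
filled = fill_rows_linear(row)  # → [[2.0, 3.0, 4.0, 5.0]]
```

    Because a flat pane of glass is planar, depth across it varies linearly between its visible borders, which is why linear interpolation between valid boundary pixels approximates the true surface well.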

    Abstract / Acknowledgements / Contents / List of Tables / List of Figures
    Chapter 1 Introduction: 1.1 Motivation, 1.2 Objectives, 1.3 Thesis Organization
    Chapter 2 Literature Review: 2.1 Depth Estimation from Semantic Features in Color Images, 2.2 Depth Estimation from Multiple Features in Color Images, 2.3 Depth Estimation from Disparity, 2.4 Glass Region Detection
    Chapter 3 Methodology: 3.1 Finding Glass Regions, 3.2 Hole Filling, 3.3 Data Preprocessing, 3.4 Glass Region Filling Pipeline
    Chapter 4 Experimental Results: 4.1 Experimental Environment, 4.2 Evaluation Metrics, 4.3 Matterport3D Dataset, 4.4 Semantic-Aware Glass Surface Detection Dataset, 4.5 Collected Experimental Dataset, 4.6 Parameter Settings, 4.7 Quantitative Analysis, 4.8 Qualitative Analysis
    Chapter 5 Conclusions and Suggestions
    Chapter 6 References


    Full-text availability: on campus, open access immediately; off campus, open access immediately.