| Graduate Student: | 王瑞評 Wang, Ruei-Ping |
|---|---|
| Thesis Title: | 雙向引導擴散式神經網路應用於立體視覺影像匹配 Dual-GDNet: Dual Guided-diffusion Network for Stereo Image Dense Matching |
| Advisor: | 林昭宏 Lin, Chao-Hung |
| Degree: | Master |
| Department: | College of Engineering - Department of Geomatics |
| Year of Publication: | 2020 |
| Academic Year of Graduation: | 108 (2019-2020) |
| Language: | English |
| Number of Pages: | 71 |
| Chinese Keywords: | 立體視覺影像匹配 (stereo image matching), 深度卷積神經網路 (deep convolutional neural network) |
| Foreign Keywords: | Stereo matching, Deep Convolutional Neural Network |
In 3D reconstruction, stereo dense image matching is one of the key steps, and it remains a difficult task in photogrammetry and computer vision. Beyond window-based matching, recent machine learning research has made great progress in dense matching by using deep convolutional neural networks (DCNN). This study proposes the dual guided-diffusion network (Dual-GDNet), which not only adopts the conventional left-to-right matching architecture in the network design and training process, but also incorporates a right-to-left, left-right consistency check during training, with the aim of reducing the probability of mismatches. In addition, this study proposes suppressed regression, which estimates disparity by removing irrelevant probability information before disparity regression, avoiding estimates that belong to no peak of a multi-peak probability distribution. The left-right consistency concept in Dual-GDNet can be applied to existing DCNN models to further improve disparity estimation. To evaluate the performance of each proposed network design, GA-Net was chosen as the backbone and the main baseline for comparison, and the Scene Flow and KITTI 2015 stereo datasets were used for training and evaluation. Experimental results show that, compared with related models, the proposed model performs better in terms of end-point error (EPE), >1-pixel error rate, and top-2 error, with improvements of 2-10% on the Scene Flow dataset and 2-8% on the KITTI 2015 dataset.
Stereo dense matching, which plays a key role in 3D reconstruction, remains a challenging task in photogrammetry and computer vision. In addition to block-based matching, recent studies based on machine learning have achieved great progress in stereo dense matching by using deep convolutional neural networks (DCNN). In this study, a novel neural network called the dual guided-diffusion network (Dual-GDNet) is proposed, which utilizes not only left-to-right but also right-to-left image matching in the network design and training, together with a left-right consistency check to reduce the possibility of mismatching. In addition, suppressed regression is proposed to refine disparity estimation by removing unrelated probability information before regression, preventing ambiguous predictions on multi-peak probability distributions. The proposed Dual-GDNet design can be applied to existing DCNN models to further improve disparity estimation. To evaluate the performance, GA-Net is selected as the backbone, and the model is evaluated on the Scene Flow and KITTI 2015 stereo datasets. Experimental results demonstrate the superiority of the proposed model over related models in terms of end-point error, >1-pixel error rate, and top-2 error, with improvements of 2-10% on Scene Flow and 2-8% on KITTI 2015.
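The two mechanisms highlighted in the abstract, the left-right consistency idea and suppressed regression, can be illustrated with a minimal sketch. This is an illustration only, not the thesis's actual Dual-GDNet implementation: the function names, the fixed suppression window, the 1-pixel consistency threshold, and the use of the consistency test as a post-hoc check (rather than inside network design and training, as in the thesis) are assumptions made for the example.

```python
import numpy as np

def lr_consistency_mask(disp_left, disp_right, thresh=1.0):
    """Classic left-right consistency check on two disparity maps
    (rectified stereo; left column x matches right column x - d)."""
    h, w = disp_left.shape
    xs = np.tile(np.arange(w), (h, 1))
    xr = np.round(xs - disp_left).astype(int)  # matched column in the right image
    inside = (xr >= 0) & (xr < w)
    disp_from_right = np.take_along_axis(disp_right, np.clip(xr, 0, w - 1), axis=1)
    return inside & (np.abs(disp_left - disp_from_right) <= thresh)

def soft_argmax(prob, disparities):
    """Standard disparity regression: expectation over the full distribution."""
    return float(np.sum(prob * disparities))

def suppressed_regression(prob, disparities, window=4):
    """Zero out probability mass far from the strongest peak before regressing,
    so a multi-peak distribution cannot yield a disparity between its peaks."""
    peak = disparities[np.argmax(prob)]
    kept = np.where(np.abs(disparities - peak) <= window, prob, 0.0)
    return float(np.sum(kept / kept.sum() * disparities))

# Bimodal toy distribution: plain soft-argmax lands between the peaks,
# suppressed regression stays on the dominant one.
d = np.arange(32, dtype=np.float64)
p = np.exp(-0.5 * ((d - 8) / 1.5) ** 2) + 0.6 * np.exp(-0.5 * ((d - 24) / 1.5) ** 2)
p /= p.sum()
print(round(soft_argmax(p, d), 1), round(suppressed_regression(p, d), 1))  # ~14.0 vs ~8.0
```

The toy distribution shows why suppression matters: plain soft-argmax averages the two peaks to a disparity near 14 that belongs to neither, while regressing only over the dominant peak returns a value near 8.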
Atienza, R. (2018). Fast disparity estimation using dense networks. CoRR, abs/1805.07499. Retrieved from http://arxiv.org/abs/1805.07499
Žbontar, J., & LeCun, Y. (2015). Stereo matching by training a convolutional neural network to compare image patches. CoRR, abs/1510.05970. Retrieved from http://dblp.uni-trier.de/db/journals/corr/corr1510.html#ZbontarL15
Chang, J., & Chen, Y. (2018). Pyramid stereo matching network. CoRR, abs/1803.08669. Retrieved from http://arxiv.org/abs/1803.08669
Chen, J., & Yuan, C. (2016, 09). Convolutional neural network using multiscale information for stereo matching cost computation. In 2016 IEEE International Conference on Image Processing (ICIP) (pp. 3424–3428). doi: 10.1109/ICIP.2016.7532995
Cheng, F., He, X., & Zhang, H. (2017, 08). Learning to refine depth for robust stereo estimation. Pattern Recognition, 74. doi: 10.1016/j.patcog.2017.07.027
Cheng, X., Wang, P., & Yang, R. (2018). Learning depth with convolutional spatial propagation network. arXiv preprint arXiv:1810.02695
Godard, C., Aodha, O. M., & Brostow, G. J. (2016). Unsupervised monocular depth estimation with left-right consistency. CoRR, abs/1609.03677. Retrieved from http://arxiv.org/abs/1609.03677
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. CoRR, abs/1512.03385. Retrieved from http://arxiv.org/abs/1512.03385
Hinton, G. E., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. ArXiv, abs/1503.02531
Hirschmüller, H. (2008). Stereo processing by semi-global matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 328–341
Huang, G., Liu, Z., & Weinberger, K. Q. (2016). Densely connected convolutional networks. CoRR, abs/1608.06993. Retrieved from http://arxiv.org/abs/1608.06993
Kang, J., Chen, L., Deng, F., & Heipke, C. (2019, 09). Context pyramidal network for stereo matching regularized by disparity gradients. ISPRS Journal of Photogrammetry and Remote Sensing, 157. doi: 10.1016/j.isprsjprs.2019.09.012
Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P., Kennedy, R., Bachrach, A., & Bry, A. (2017). End-to-end learning of geometry and context for deep stereo regression. CoRR, abs/1703.04309. Retrieved from http://arxiv.org/abs/1703.04309
Kingma, D., & Ba, J. (2014, 12). Adam: A method for stochastic optimization. International Conference on Learning Representations
Liu, S., De Mello, S., Gu, J., Zhong, G., Yang, M.H., & Kautz, J. (2017). Learning affinity via spatial propagation networks. In I. Guyon et al. (Eds.), Advances in neural information processing systems 30 (pp. 1520–1530). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/6750-learning-affinity-via-spatial-propagation-networks.pdf
Luo, W., Schwing, A. G., & Urtasun, R. (2016). Efficient deep learning for stereo matching. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 5695–5703)
Mayer, N., Ilg, E., Häusser, P., Fischer, P., Cremers, D., Dosovitskiy, A., & Brox, T. (2015). A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. CoRR, abs/1512.02134. Retrieved from http://arxiv.org/abs/1512.02134
Menze, M., & Geiger, A. (2015). Object scene flow for autonomous vehicles. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3061–3070)
Mozerov, M., & Weijer, J. (2015, 01). Accurate stereo matching by two-step energy minimization. IEEE Transactions on Image Processing, 24. doi: 10.1109/TIP.2015.2395820
Pierrot-Deseilligny, M., & Paparoditis, N. (2006). A multi-resolution and optimization-based image matching approach: An application to surface reconstruction from SPOT5-HRS stereo imagery. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36(1/W41), 328–341
Scharstein, D., & Szeliski, R. (2002). A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, 47(1-3), 7–42
Seki, A., & Pollefeys, M. (2017). SGM-Nets: Semi-global matching with neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6640–6649
Wolf, P. R., & Dewitt, B. A. (2000). Elements of Photogrammetry: with applications in GIS. New York: McGraw-Hill, 30
Zhang, F., Prisacariu, V., Yang, R., & Torr, P. H. (2019). GA-Net: Guided Aggregation Net for End-to-end Stereo Matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 185–194