| Graduate student: | 黃致軒 Huang, Chih-Hsuan |
|---|---|
| Thesis title: | 單調大尺度環境下之vSLAM精準定位 Robust Large Scale Stereo Visual SLAM System against Low-texture Environments |
| Advisor: | 彭兆仲 Peng, Chao-Chung |
| Degree: | Master |
| Department: | College of Engineering - Department of Aeronautics & Astronautics |
| Year of publication: | 2025 |
| Graduation academic year: | 113 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 270 |
| Keywords (Chinese): | 雙眼立體相機, 視覺同時定位與地圖建構, 高空低紋理環境, 平面約束 |
| Keywords (English): | stereo camera, vSLAM, high-altitude low-texture environment, plane constraint |
| ORCID: | https://orcid.org/0009-0003-9479-8518 |
As unmanned aerial vehicles (UAVs) are widely adopted for high-altitude tasks such as agriculture, infrastructure inspection, and disaster assessment, visual positioning systems face severe challenges. As flight altitude increases, the texture visible on the ground surface decreases sharply, producing low-texture environments that pose fundamental difficulties for conventional visual Simultaneous Localization and Mapping (vSLAM) algorithms. This research addresses visual positioning in high-altitude low-texture environments. Starting from the theoretical foundations, it derives in full the core theory of both feature-based and direct vSLAM methods. Building on this, it proposes a plane-constraint-based improvement to the direct method: a theoretical framework that runs from plane detection to inverse depth prior constraints, integrates these as soft constraints into the optimization, and assigns them suitably designed prior weights. To validate algorithm performance, a dedicated UAV experimental platform was built, eight experimental scenarios with different altitudes and flight patterns were designed, and a complete dataset of high-altitude low-texture environments was collected. The experimental results show that the feature-based method failed in every test scenario because feature points were too sparse, confirming its fundamental limitation in low-texture environments. The direct method showed clearly better adaptability to the environment, and the plane constraint mechanism further improved the algorithm's survival time and positioning accuracy, demonstrating the technical feasibility of the proposed improvement strategy.
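The abstract describes the plane constraint only in outline, so the following is a minimal illustrative sketch of how a detected plane could yield an inverse depth prior that is then applied as a soft constraint. It is not taken from the thesis. The plane parameterization `n . X = d` in the camera frame, the function names `plane_inverse_depth` and `fuse_with_prior`, the scalar weighting scheme, and all numeric values are assumptions made for illustration.

```python
import numpy as np


def plane_inverse_depth(u, v, K, n, d):
    """Inverse depth at pixel (u, v) implied by the plane n . X = d.

    Assumes a pinhole camera with intrinsic matrix K and a plane expressed
    in the camera frame (n: unit normal, d: distance along n).
    A point on the pixel's ray is X = z * ray, so n . (z * ray) = d
    gives z = d / (n . ray) and inverse depth rho = (n . ray) / d.
    """
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    ray = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])  # point at unit depth
    return float(n @ ray) / d


def fuse_with_prior(rho_obs, var_obs, rho_prior, w_prior):
    """Closed-form minimiser of the scalar soft-constrained cost

        (rho - rho_obs)^2 / var_obs + w_prior * (rho - rho_prior)^2

    w_prior = 0 ignores the plane; a large w_prior pulls rho onto the plane.
    """
    info_obs = 1.0 / var_obs
    return (info_obs * rho_obs + w_prior * rho_prior) / (info_obs + w_prior)


# Hypothetical numbers: a downward-looking camera 50 m above flat ground.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
n = np.array([0.0, 0.0, 1.0])   # plane normal along the optical axis
d = 50.0                        # plane distance in metres

rho_prior = plane_inverse_depth(320, 240, K, n, d)
rho = fuse_with_prior(rho_obs=0.03, var_obs=1e-4,
                      rho_prior=rho_prior, w_prior=1e4)
print(rho_prior, rho)
```

Because the prior enters as a weighted residual and not as a hard equality, an inverse depth estimate that disagrees with the plane is pulled toward it only in proportion to `w_prior`. In a full direct method the same quadratic term would be added to the photometric cost inside the joint optimization; the scalar fusion here only shows the effect on a single point.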
On-campus access: open from 2030-08-18