
Graduate student: Huang, Chih-Hsuan (黃致軒)
Thesis title: 單調大尺度環境下之vSLAM精準定位 (Robust Large Scale Stereo Visual SLAM System against Low-texture Environments)
Advisor: Peng, Chao-Chung (彭兆仲)
Degree: Master's
Department: College of Engineering, Department of Aeronautics & Astronautics
Year of publication: 2025
Graduation academic year: 113 (ROC calendar)
Language: Chinese
Number of pages: 270
Chinese keywords: stereo camera, visual simultaneous localization and mapping (vSLAM), high-altitude low-texture environment, plane constraint
Foreign-language keywords: stereo camera, vSLAM, high-altitude low-texture environment, plane constraint
ORCID: https://orcid.org/0009-0003-9479-8518
  • With the widespread adoption of UAV technology in high-altitude operations such as agriculture, infrastructure inspection, and disaster assessment, visual positioning systems face severe challenges. As flight altitude increases, ground texture information drops sharply, producing low-texture environments that pose fundamental difficulties for traditional visual Simultaneous Localization and Mapping (vSLAM) algorithms. Targeting visual positioning in high-altitude low-texture environments, this research starts from first principles and fully derives the core theory of both feature-based and direct vSLAM methods. On this basis, it proposes a plane-constraint-based improvement to the direct method, establishing a theoretical framework that runs from plane detection to inverse-depth prior constraints, integrates the prior as a soft constraint into the optimization framework, and designs a reasonable prior weight. To validate algorithm performance, a dedicated UAV experimental platform was built, eight experimental scenarios with different altitudes and flight patterns were designed, and a complete high-altitude low-texture dataset was collected. Experimental results show that the feature-based method fails in all test scenarios due to sparse feature points, confirming its fundamental limitation in low-texture environments. The direct method exhibits a clear advantage in environmental adaptability, and the plane-constraint mechanism further extends the algorithm's survival time and improves positioning accuracy, demonstrating the technical feasibility of the proposed improvement strategy.

    Widespread UAV applications in agriculture, infrastructure inspection, and disaster assessment expose visual positioning systems to severe challenges in high-altitude operations: as altitude increases, surface texture diminishes, creating low-texture environments that cause fundamental difficulties for traditional visual Simultaneous Localization and Mapping (vSLAM) algorithms. This research addresses visual positioning in high-altitude low-texture environments. It first develops a comprehensive theoretical foundation for feature-based and direct vSLAM methods, and then proposes a plane-constraint-based improvement to the direct method, establishing a framework that proceeds from plane detection to inverse-depth prior constraints, integrates the prior as a soft constraint into the optimization, and assigns it an appropriate weight. A dedicated UAV platform was constructed, and eight experimental scenarios across different altitudes and flight patterns were designed to validate algorithm performance. The results demonstrate that feature-based methods fail in all scenarios due to sparse features, confirming their fundamental limitations in low-texture environments, while direct methods show superior environmental adaptability; the proposed plane-constraint mechanism further extends algorithm survival time and improves positioning accuracy, proving the technical feasibility of this improvement strategy.
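Both abstracts describe folding a plane-induced inverse-depth prior into the optimization as a weighted soft constraint. The thesis's own formulation is not reproduced here; the following minimal Python sketch (the intrinsics, pixel set, plane parameters, noise model, and the scalar weight `w_prior` are all illustrative assumptions, not values from the thesis) shows the general idea: a plane n·X = d detected in the camera frame predicts an inverse depth along each pixel's bearing ray, and that prediction enters the least-squares problem as an extra weighted residual rather than a hard constraint.

```python
import numpy as np
from scipy.optimize import least_squares

np.random.seed(0)

# Illustrative pinhole intrinsics and a few pixel locations (assumptions).
K_inv = np.linalg.inv(np.array([[500., 0., 320.],
                                [0., 500., 240.],
                                [0., 0., 1.]]))
pixels = np.array([[300., 200.], [350., 260.], [400., 220.]])
rays = (K_inv @ np.c_[pixels, np.ones(len(pixels))].T).T  # bearing vectors

# A plane n . X = d detected in the camera frame (e.g. from plane fitting).
n, d = np.array([0., 0., 1.]), 30.0  # flat ground 30 m ahead (illustrative)

# The plane predicts the depth z along each ray via n . (z * ray) = d,
# i.e. an inverse-depth prior rho = (n . ray) / d for every pixel.
rho_prior = (rays @ n) / d

# Stand-in for the noisy per-point inverse depths a direct method estimates.
rho_meas = rho_prior + 0.01 * np.random.randn(3)
w_prior = 0.5  # prior weight; tuning it trades data fit vs. planarity

def residuals(rho):
    # Data term (stand-in for the photometric error) + soft plane prior.
    return np.concatenate([rho - rho_meas,
                           w_prior * (rho - rho_prior)])

rho_hat = least_squares(residuals, rho_meas).x
# The prior pulls each estimate toward the plane prediction without
# forcing it exactly onto the plane (soft, not hard, constraint).
```

As `w_prior` tends to zero the prior vanishes and the data term dominates; a very large `w_prior` approaches a hard planarity constraint, which is precisely what the soft formulation is meant to avoid.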

    Abstract (Chinese) i
    Extended Abstract ii
    Acknowledgements xxiv
    Table of Contents xxv
    List of Tables xxviii
    List of Figures xxix
    Chapter 1  Introduction 1
      1.1 Research Motivation and Objectives 1
      1.2 Literature Review 3
    Chapter 2  Pose Representation in 3D Space 6
      2.1 Direction Cosine Matrix and Translation Vector 6
        2.1.1 Definitions of the Direction Cosine Matrix and Translation Vector 6
        2.1.2 Relationship between Euler Angles and the Direction Cosine Matrix 7
        2.1.3 3D Rigid-Body Transformation Matrix Representation 10
      2.2 Lie Group and Lie Algebra 11
        2.2.1 Poisson's Equation 13
        2.2.2 Conversion between Lie Groups and Lie Algebras 25
        2.2.3 Partial-Derivative Properties of Lie Algebras 28
    Chapter 3  Theoretical Derivation of the Feature-Based Method 38
      3.1 Pinhole Camera Model 38
        3.1.1 Theoretical Derivation 38
        3.1.2 Quantization Error Model 43
        3.1.3 Simulation Analysis of the Pinhole Camera Model and Quantization Error 46
      3.2 Distortion Model 49
        3.2.1 Theoretical Derivation 50
        3.2.2 Simulation Analysis of the Distortion Model and Quantization Error 51
      3.3 Triangulation 55
        3.3.1 Theoretical Derivation 55
        3.3.2 Simulation Analysis of Triangulation Algorithms 59
      3.4 Perspective-n-Point (PnP) 64
        3.4.1 Direct Linear Transformation (DLT PnP) 64
        3.4.2 Efficient PnP (EPnP) 69
        3.4.3 Random Sample Consensus PnP (RANSAC PnP) 86
        3.4.4 Simulation Tests of the PnP Problem in a Virtual Environment 87
      3.5 Bundle Adjustment (BA) 101
        3.5.1 Deriving Bundle Adjustment from Maximum A Posteriori Estimation 101
        3.5.2 Jacobian Derivation for Bundle Adjustment 106
        3.5.3 Outlier Rejection 111
        3.5.4 Accelerating Computation via Matrix Sparsity 114
        3.5.5 Motion-only Bundle Adjustment 117
        3.5.6 Simulation and Analysis of Bundle Adjustment 119
    Chapter 4  Validation and Analysis of the Feature-Based Visual Localization Algorithm 129
      4.1 Experimental Platform Construction and Data Acquisition 129
        4.1.1 UAV Hardware System Architecture 130
        4.1.2 Ground-Truth Pose Acquisition 131
      4.2 Algorithm Architecture 132
      4.3 Open-Source Dataset Tests 134
      4.4 Low-Texture Environment Tests 137
        4.4.1 Experimental Site Description 137
        4.4.2 Flight Path Presentation 139
        4.4.3 Evaluation Methodology 142
        4.4.4 Experimental Results and Analysis 145
    Chapter 5  Theoretical Derivation and Refinement of the Direct Method 147
      5.1 Image Formation Model 147
      5.2 Minimizing Photometric Error 149
      5.3 Jacobian Matrix of the Direct Method 162
      5.4 Introducing Plane Fitting 184
      5.5 Adding Plane Parameters as Prior Constraints in the Optimization Framework 194
    Chapter 6  Validation and Analysis of the Direct-Method Visual Localization Algorithm 210
      6.1 Algorithm Architecture 210
      6.2 Open-Source Dataset Tests 212
      6.3 Low-Texture Environment Tests 213
      6.4 Comparison of Algorithms 218
    Chapter 7  Conclusions and Future Directions 220
      7.1 Conclusions 220
      7.2 Contributions 220
      7.3 Future Directions 221
    References 222
    Appendix A  Pose Graph Optimization (PGO) 225
      A.1 1D Pose Graph Optimization 225
      A.2 2D Pose Graph Optimization 228
      A.3 3D Pose Graph Optimization 233


    Full text not available for download.
    On-campus access: public from 2030-08-18.
    Off-campus access: public from 2030-08-18.
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.