
Graduate Student: Hsiao, Wen-Yuan (蕭文遠)
Thesis Title: Efficient Correspondence-quality-aware Keypoint Selection for Real-time ICP-based Visual Odometry
Advisor: Shieh, Ming-Der (謝明得)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2018
Graduation Academic Year: 106 (ROC calendar)
Language: English
Number of Pages: 62
Keywords: Visual odometry, Simultaneous localization and mapping (SLAM), Iterative closest point (ICP), Augmented reality (AR), Keypoint selection
    In augmented reality (AR) applications, visual odometry and simultaneous localization and mapping are the key algorithms for delivering a good user experience. For users to perceive virtual objects as real, localization must maintain sufficient accuracy and speed so that virtual objects overlay the real world naturally. The central challenge is therefore to achieve real-time computation while preserving reliable localization accuracy.
    This thesis uses the iterative closest point (ICP) algorithm to track the camera pose. ICP requires a scene with sufficient geometric structure and a low-noise point cloud to estimate the optimal camera pose, yet the low-complexity depth cameras commonly used in AR are typically noisy. Existing point selection strategies account for structure and reliability, but computing a normal vector for every point is expensive, and depth cameras exhibit especially large errors in certain regions; both issues should be taken into account. This thesis proposes a keypoint selection method that finds reliable, low-noise keypoints. Taking the error characteristics of depth cameras into account, it samples points from the near-edge region inside a defined region of interest (ROI). The ROI avoids the noisiest edge regions based on the camera velocity, the depth of each point, and statistics gathered from the previous frame. For keypoint evaluation, the thesis proposes a new method that uses correspondence quality to analyze how highly reliable, low-noise keypoints are distributed under different conditions. Finally, based on this correspondence quality analysis, a weighting model is built over camera velocity and depth; with this model, non-maximum suppression selects the most reliable and representative point in each region under a point-count budget. The proposed visual odometry algorithm is evaluated on public datasets, achieving lower overall trajectory error than competing methods while running in real time on a single thread; its reliability is further assessed through the user experience of virtual object rendering and model reconstruction, where it is competitive with existing products.
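
    The near-edge ROI described above can be pictured with a short sketch. The Python snippet below is a hypothetical illustration rather than the thesis pipeline: edge_thresh, skip_radius, and roi_radius are assumed values, a simple 4-neighbour depth jump stands in for the thesis's fast edge detector, and a fixed skip band replaces its motion-, depth-, and statistics-driven skipping region.

        import numpy as np
        from scipy.ndimage import binary_dilation

        def near_edge_roi(depth, edge_thresh=0.05, skip_radius=2, roi_radius=6):
            """Hypothetical near-edge ROI mask for a depth image (in meters).

            Keeps pixels close to a depth discontinuity (where keypoints are
            salient) while skipping the band right on the edge, where
            commodity depth sensors are noisiest. All thresholds are
            illustrative, not the thesis values.
            """
            # Depth discontinuity: large jump to the left/top neighbour.
            dx = np.abs(np.diff(depth, axis=1, prepend=depth[:, :1]))
            dy = np.abs(np.diff(depth, axis=0, prepend=depth[:1, :]))
            edge = np.maximum(dx, dy) > edge_thresh

            # Thin "skip" band on the edge vs. a wider "near-edge" band.
            skip = binary_dilation(edge, iterations=skip_radius)
            near = binary_dilation(edge, iterations=roi_radius)

            # ROI = near an edge but outside the noisy skip band.
            return near & ~skip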

    Camera pose estimated by visual odometry (VO) or simultaneous localization and mapping (SLAM) plays an important role in the user experience of augmented reality (AR) applications. To bring components of the digital world into a person's perception of the real world, the main challenge for the algorithm is to keep the tracking error small enough to reduce the discrepancy between virtual and real objects, while keeping up with the camera frame rate. The proposed visual odometry algorithm therefore pursues two main targets: real-time operation and robustness.
    The proposed VO algorithm is based on the iterative closest point (ICP) algorithm, which is widely employed to minimize the difference between two point clouds. The performance of ICP estimation depends heavily on the geometric structure and the quality of the input point clouds. Commodity depth sensors such as the Kinect, widely used in AR applications, are usually noisy. Several methods have been proposed to sample reliable points that are well distributed over the scene, but estimating the normal vector of the whole frame is computationally expensive. Moreover, depth-sensor noise, which makes certain points highly unreliable, should be accounted for by the algorithm. In this thesis, we propose a keypoint selection method that selects reliable points from the near-edge region while accounting for depth-sensor noise: the noisy edge region can be skipped based on motion, depth, and statistics from the previous frame.
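
    As context for the ICP step, a common error metric in depth-based ICP is the point-to-plane distance. The numpy sketch below shows one linearized Gauss-Newton update under that metric, assuming the correspondences (src, dst, normals) have already been matched; it illustrates the standard formulation, not the thesis implementation.

        import numpy as np

        def point_to_plane_icp_step(src, dst, normals):
            """One Gauss-Newton step of point-to-plane ICP.

            src, dst : (N, 3) matched source/destination points
            normals  : (N, 3) unit normals at the destination points
            Returns a twist (rx, ry, rz, tx, ty, tz) minimizing the
            linearized residual r_i = n_i . (R s_i + t - d_i).
            """
            # Residuals under the identity transform.
            r = np.einsum('ij,ij->i', normals, src - dst)        # (N,)
            # Jacobian rows for small rotations: [s_i x n_i, n_i].
            J = np.hstack([np.cross(src, normals), normals])     # (N, 6)
            # Solve the normal equations J xi = -r in least squares.
            xi, *_ = np.linalg.lstsq(J, -r, rcond=None)
            return xi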
    Second, this work introduces a new keypoint evaluation method, correspondence quality (CQ) analysis, to observe the distribution of highly reliable points under different conditions. Points with high correspondence quality exhibit two characteristics: they are noiseless and salient.
    Finally, we model a weighting function for each point based on the CQ analysis, conditioned on camera motion and depth value. According to this weighting function, the proposed non-maximum suppression (NMS) selects the most relevant and reliable points in each region under a limited point budget.
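
    The selection step can be illustrated with a grid-based weighted NMS. The sketch below is a hypothetical reading of the description above: cell and budget are assumed parameters, and weights stands in for the CQ-aware weighting function.

        import numpy as np

        def grid_nms(points, weights, cell=16, budget=300):
            """Hypothetical grid-based weighted non-maximum suppression.

            points  : (N, 2) pixel coordinates of candidate keypoints
            weights : (N,)   per-point scores, e.g. CQ-aware weights
            Keeps at most one point per cell-by-cell pixel block, then the
            `budget` highest-weighted survivors overall; returns indices.
            """
            cells = (points // cell).astype(int)
            best = {}
            for i, c in enumerate(map(tuple, cells)):
                # Keep the highest-weighted candidate in each grid cell.
                if c not in best or weights[i] > weights[best[c]]:
                    best[c] = i
            keep = sorted(best.values(), key=lambda i: -weights[i])
            return np.array(keep[:budget])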
    The proposed VO system is evaluated on publicly available benchmark datasets, and also visually through virtual object rendering and 3D model reconstruction. Compared with other VO algorithms, the proposed method exhibits competitive performance and achieves its real-time target using only a single CPU thread.

    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1. Introduction
        1.1 Motivation
        1.2 Related work
        1.3 Thesis organization
    Chapter 2. Background
        2.1 Visual odometry and SLAM system
            2.1.1 Visual odometry
            2.1.2 SLAM
        2.2 Preliminaries
            2.2.1 Camera model
            2.2.2 Noise of depth image
            2.2.3 Normal estimation
        2.3 Iterative closest point (ICP) algorithm
            2.3.1 Selection of points
            2.3.2 Matching
            2.3.3 Weighting of pairs
            2.3.4 Rejecting pairs
            2.3.5 Error metric and minimization
        2.4 Keypoint selection
            2.4.1 Interest point extraction
            2.4.2 Descriptor for matching
            2.4.3 Non-maximum suppression
    Chapter 3. Proposed Efficient Keypoint Selection for Real-time ICP-based Visual Odometry System
        3.1 System overview
        3.2 Region of interest
            3.2.1 Fast edge detection
            3.2.2 Near-edge skipping region
            3.2.3 Near-edge region
        3.3 Correspondence-quality-aware weighting function
            3.3.1 Correspondence quality analysis
            3.3.2 Correspondence-quality-aware weighting function
            3.3.3 Non-maximum suppression
    Chapter 4. Simulation Results and Performance Evaluation
        4.1 Evaluation of visual odometry system
            4.1.1 Evaluation of proposed ROI region
            4.1.2 Evaluation of proposed CQ-aware weighting function
            4.1.3 Evaluation of proposed keypoint selection method
            4.1.4 Performance comparison with other point selection strategies
            4.1.5 System profiling
        4.2 Performance comparison with state-of-the-art VO
            4.2.1 Frame-to-frame tracking error
            4.2.2 Frame-to-keyframe tracking error
        4.3 Performance evaluation in real world
            4.3.1 Virtual object rendering
            4.3.2 3D model reconstruction
            4.3.3 Performance comparison with other methods
    Chapter 5. Conclusion and Future work
        5.1 Conclusion
        5.2 Future work
    References


    Full text availability: on campus from 2024-07-23; off campus: not available. The electronic thesis has not been authorized for public release; please consult the library catalog for the print copy.