| Graduate Student: | 蕭文遠 Hsiao, Wen-Yuan |
|---|---|
| Thesis Title: | 考量對應點品質的高效關鍵點選擇策略應用於迭代最近點演算法之即時視覺里程計系統 / Efficient Correspondence-quality-aware Keypoint Selection for Real-time ICP-based Visual Odometry |
| Advisor: | 謝明得 Shieh, Ming-Der |
| Degree: | 碩士 Master |
| Department: | 電機資訊學院 - 電機工程學系 Department of Electrical Engineering |
| Year of Publication: | 2018 |
| Graduation Academic Year: | 106 (ROC calendar, AY 2017-2018) |
| Language: | 英文 English |
| Pages: | 62 |
| Chinese Keywords: | 視覺里程計、同步定位及地圖建構、擴增實境、關鍵點選擇策略 |
| English Keywords: | Visual odometry, Simultaneous localization and mapping (SLAM), Iterative closest point (ICP), Augmented Reality (AR), Keypoint selection |
In Augmented Reality (AR) applications, visual odometry and simultaneous localization and mapping are the key algorithms for delivering a convincing user experience. For a user to accept a virtual object as physically present, localization must stay both accurate and fast so that virtual objects blend naturally into the real world. The central challenge is therefore to reach real-time computation speed while maintaining reliable localization accuracy.
This thesis tracks the camera pose with the iterative closest point (ICP) algorithm, which requires sufficient geometric structure in the scene and low-error point clouds to estimate the optimal camera pose. However, the low-cost depth cameras commonly used in AR usually exhibit larger errors. Point-selection strategies that consider structure and reliability already exist, but computing a normal vector for every point is expensive, and depth-camera error is especially severe in certain regions; both issues need to be addressed.

This thesis proposes a keypoint selection method that finds reliable, low-error keypoints: accounting for the error characteristics of the depth camera, it samples points from the near-edge region within a defined region of interest (ROI). The ROI avoids high-error edge regions based mainly on the camera velocity, the depth of each point, and statistics collected from the previous frame. For keypoint evaluation, this thesis introduces a new method that uses correspondence quality to analyze how highly reliable, low-error keypoints are distributed under different conditions. Finally, based on the correspondence-quality analysis, a weighting model is built under different camera velocities and depth values. Guided by this weighting model, non-maximum suppression selects the most reliable and most representative points in each region under a point-count constraint.

The proposed visual odometry algorithm is evaluated on public datasets: its overall trajectory error is smaller than that of competing methods while it reaches real-time speed on a single thread. Its reliability is further assessed through the user experience of virtual-object rendering and model reconstruction, where it is competitive with existing products.
The camera pose estimated by visual odometry (VO) or simultaneous localization and mapping (SLAM) plays an important role in the user experience of Augmented Reality (AR) applications. To bring components of the digital world into a person's perception of the real world, one of the main challenges is to keep the tracking error small enough to remove any visible discrepancy between virtual and real objects, while also keeping up with the camera frame rate. The proposed visual odometry algorithm therefore pursues two main targets: real-time operation and robustness.
The proposed VO algorithm is based on the iterative closest point (ICP) algorithm, which is widely employed to minimize the difference between two point clouds. The performance of ICP estimation depends strongly on the geometric structure and the quality of the input point clouds. Commodity depth sensors such as the Kinect, which are widely used in AR applications, are usually noisy. Several methods have been proposed to sample reliable points that are well distributed over the scene; however, estimating the normal vector over the whole frame is computationally heavy. Moreover, depth-sensor noise, which renders certain points highly unreliable, should be factored into the algorithm. In this thesis, we propose a keypoint selection method that selects reliable points from the near-edge region while accounting for depth-sensor noise: noisy edge regions are skipped based on motion, depth, and statistical information from the previous frame.
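The thesis's own ROI logic combines camera velocity, per-point depth, and previous-frame statistics; none of that is reproduced here. As a rough illustration of the near-edge idea only, the Python sketch below keeps a band of pixels beside depth edges and drops far, noisy measurements. The function name and both thresholds (`depth_max`, `band`) are assumptions, not values from the thesis.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def near_edge_candidates(depth, edge_mask, depth_max=3.5, band=3):
    """Keep candidate pixels in a thin band next to depth edges."""
    # Dilate the edge mask, then exclude the edge pixels themselves:
    # depth measured exactly on an edge is the least reliable.
    near_edge = binary_dilation(edge_mask, iterations=band) & ~edge_mask
    # Commodity depth sensors grow noisier with range; 0 marks invalid depth.
    valid = (depth > 0) & (depth < depth_max)
    ys, xs = np.nonzero(near_edge & valid)
    return np.stack([xs, ys], axis=1)  # (N, 2) pixel coordinates (x, y)
```

Sampling beside the edge rather than on it keeps the geometric saliency of the edge while sidestepping the flying-pixel artifacts that depth cameras produce there.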
Second, this work introduces a new keypoint evaluation method, correspondence quality (CQ) analysis, to observe how highly reliable points are distributed under different conditions. Points with high correspondence quality share two characteristics: they are noiseless and salient.
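The exact CQ definition belongs to the thesis; as a stand-in that captures the two stated characteristics, the sketch below rates a point as "noiseless" via a published Kinect axial-noise fit (sigma(z) is roughly 0.0012 + 0.0019 (z - 0.4)^2 meters) and as "salient" via the local depth-gradient magnitude. The function name and the way the two terms are combined are assumptions.

```python
import numpy as np

def cq_proxy(depth, pts, sigma0=0.0012, k=0.0019):
    """Hypothetical proxy for correspondence quality: salient / noisy."""
    gy, gx = np.gradient(depth)             # local depth structure
    saliency = np.hypot(gx, gy)[pts[:, 1], pts[:, 0]]
    z = depth[pts[:, 1], pts[:, 0]]
    noise = sigma0 + k * (z - 0.4) ** 2     # expected axial noise (m)
    return saliency / (noise + 1e-9)        # high CQ: salient and quiet
```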
Finally, we model a weighting function for each point based on the CQ analysis, conditioned on camera motion and depth value. Guided by this weighting function, the proposed non-maximum suppression (NMS) selects the most relevant and reliable points in each region under a limited point budget.
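A minimal sketch of weight-driven NMS under a point budget, assuming per-point weights from the model above: greedily keep the heaviest candidate, suppress everything within a fixed pixel radius, and stop once the budget is filled. `radius` and `budget` are placeholder values rather than the thesis's tuned parameters.

```python
import numpy as np

def weighted_nms(pts, weights, radius=10.0, budget=500):
    """Greedy non-maximum suppression driven by per-point weights."""
    kept = []
    for i in np.argsort(weights)[::-1]:     # heaviest candidates first
        p = pts[i].astype(float)
        # Keep p only if no already-kept point lies within `radius` pixels.
        if all(np.hypot(*(p - q)) > radius for q in kept):
            kept.append(p)
            if len(kept) >= budget:         # hard cap on the point count
                break
    return np.array(kept)
```

The greedy pass is O(n * budget) in the worst case, which is acceptable for a few thousand candidates; a grid-bucketed variant would be the usual optimization if this step ever dominated the frame time.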
The proposed VO system is evaluated on a publicly available benchmark dataset, and also through visual inspection of virtual-object rendering and 3D model reconstruction. Compared with other VO algorithms, the proposed method exhibits competitive performance and achieves its real-time target using only a single CPU thread.
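The customary metric behind "trajectory error" on RGB-D benchmarks is the absolute trajectory error (ATE) RMSE: rigidly align the estimated camera positions to ground truth with a closed-form fit, then take the RMS of the residual translations. A minimal sketch, assuming two (N, 3) arrays of time-matched positions:

```python
import numpy as np

def ate_rmse(est, gt):
    """ATE RMSE after a Kabsch-style rigid alignment of est onto gt."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    U, _, Vt = np.linalg.svd((est - mu_e).T @ (gt - mu_g))
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_g - R @ mu_e
    residual = gt - (est @ R.T + t)         # error left after alignment
    return np.sqrt((residual ** 2).sum(axis=1).mean())
```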
On-campus access: available from 2024-07-23.