
Author: Huang, Yi-Lun (黃翊倫)
Title: Implementation of Kinect-based 3D Vision and Cognition Learning System Based Object Grasping Control for Home Service Robots
Advisor: Li, Tzuu-Hseng (李祖聖)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2014
Graduation academic year: 102 (ROC calendar; 2013–2014)
Language: English
Number of pages: 69
Keywords: SURF algorithm, BRISK algorithm, TLD algorithm, point cloud
    This thesis presents a Kinect-based 3D vision system and a cognition learning system that together allow a home service robot to adjust the posture of its hand when grasping an object. The vision system comprises object recognition and tracking. Feature detection in the recognition system uses the Speeded-Up Robust Features (SURF) algorithm, while feature description uses Binary Robust Invariant Scalable Keypoints (BRISK). The tracking system adopts the Tracking-Learning-Detection (TLD) algorithm, which not only tracks the target object but also learns from the image data and updates the database in real time. For the grasping posture of the robotic arm, the 2D image obtained from the Kinect is combined with its infrared depth measurements to produce a 3D representation expressed as a point cloud, from which the spatial information of the object is obtained. Several points are then planned as grasp candidates, and an appropriate posture is selected according to the constraints of the palm and whether an obstacle lies on the closing path of the fingers. To let the robot adjust its grasping posture by itself, the cognition learning system is designed around the two modes of human thought proposed by the psychologist Daniel Kahneman in his book Thinking, Fast and Slow: the fast, intuitive System 1 and the slow, rational System 2. Finally, the proposed method is applied to a home service robot, and the experimental results demonstrate its feasibility.
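
    The abstract pairs a SURF detector with a BRISK descriptor for object recognition. As a rough illustration of that pairing (a minimal sketch, not the thesis's actual implementation), the code below uses OpenCV's Python bindings; SURF lives in the contrib build (opencv-contrib-python), and the image file names are hypothetical.

```python
import cv2

def detect_and_describe(path):
    """Detect keypoints with SURF, then describe them with BRISK."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # SURF detection (Hessian-matrix blob detector, cf. Section 2.2.2).
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    keypoints = surf.detect(img, None)
    # BRISK description (binary descriptor, cf. Section 2.2.3).
    brisk = cv2.BRISK_create()
    keypoints, descriptors = brisk.compute(img, keypoints)
    return keypoints, descriptors

# Match a stored object template against a camera frame; Hamming distance
# is the appropriate metric for binary descriptors such as BRISK.
kp1, des1 = detect_and_describe("object_template.png")  # hypothetical files
kp2, des2 = detect_and_describe("camera_frame.png")
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
```

    A detect-with-one/describe-with-another split like this trades SURF's scale-space localization for BRISK's cheap binary matching, which fits the real-time constraint the abstract implies.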

    Abstract I
    Acknowledgement III
    Contents IV
    List of Figures VI
    List of Tables IX
    Chapter 1. Introduction 1
      1.1 Motivation 1
      1.2 Related Work 2
      1.3 Hardware and Software 3
      1.4 Thesis Organization 9
    Chapter 2. Design of Visual System 10
      2.1 Introduction 10
      2.2 SURF (Speeded-Up Robust Features) 11
        2.2.1 Integral Image 11
        2.2.2 Feature Detection for Hessian Matrix 13
        2.2.3 BRISK (Binary Robust Invariant Scalable Keypoints) 19
      2.3 Tracking-Learning-Detection System 21
        2.3.1 Tracking System 22
        2.3.2 Detection System 25
        2.3.3 Learning System 26
    Chapter 3. Grasping Selection 28
      3.1 Introduction 28
      3.2 Design 3D Vision 30
        3.2.1 Point Cloud 31
        3.2.2 Simplified Point Cloud Data 32
      3.3 Grasping Posture Selection 35
        3.3.1 Analysis of the Pose of Objects 36
        3.3.2 Design of Selecting Grasping Posture 38
    Chapter 4. Design of Cognition Learning System 42
      4.1 Introduction 42
      4.2 Cognition Learning System 43
        4.2.1 System 1 and System 2 44
        4.2.2 Adjustment of the Robotic Arm Posture 47
      4.3 Strategy System for Object Grasping Control 51
    Chapter 5. Experiments 54
      5.1 Introduction 54
      5.2 Experiments for the Posture Selection 55
      5.3 Experiments for Cognition Learning System 61
    Chapter 6. Conclusion and Future Work 65
      6.1 Conclusion 65
      6.2 Future Work 66
    References 67
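
    The abstract and Chapter 3 describe building the 3D view by combining the Kinect's 2D image with its infrared depth data into a point cloud. Below is a minimal sketch of that depth-to-point-cloud step, assuming the standard pinhole camera model; the intrinsic values are nominal Kinect-for-Windows figures used only for illustration, not parameters taken from the thesis.

```python
import numpy as np

# Assumed pinhole intrinsics (nominal Kinect v1 values, illustration only).
FX, FY = 525.0, 525.0  # focal lengths in pixels
CX, CY = 319.5, 239.5  # principal point

def depth_to_point_cloud(depth_mm):
    """Map each pixel (u, v) with depth z to camera-frame (x, y, z) in metres."""
    h, w = depth_mm.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_mm.astype(np.float32) / 1000.0  # millimetres -> metres
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    points = np.stack((x, y, z), axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]           # discard pixels with no depth return

# Example with a synthetic 640x480 frame; a real one would come from the Kinect SDK.
cloud = depth_to_point_cloud(np.full((480, 640), 800, dtype=np.uint16))
print(cloud.shape)  # (307200, 3) here, since every synthetic pixel has depth
```

    Section 3.2.2 ("Simplified Point Cloud Data") suggests the cloud is reduced before grasp-candidate points are evaluated; uniform or voxel-grid subsampling of the array above would serve that purpose.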

    [1] HONDA ASIMO. Available:
    http://www.honda-taiwan.com.tw/technology_ASIMO_new.html
    [2] Y. Sakagami, R. Watanabe, C. Aoyama, S. Matsunaga, N. Higaki, and K. Fujimura, “The intelligent ASIMO: System overview and integration,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 3, pp. 2478–2483, 2002.
    [3] PR2. Available: http://www.willowgarage.com/pages/pr2/overview
    [4] S. Chitta, E. G. Jones, M. Ciocarlie, and K. Hsiao, “Perception, planning, and execution for mobile manipulation in unstructured environments,” IEEE Robotics and Automation Magazine, Special Issue on Mobile Manipulation, vol. 19, 2012.
    [5] K. Kaneko, F. Kanehiro, M. Morisawa, K. Miura, S. Nakaoka, and S. Kajita, “Cybernetic human HRP-4C,” in Proceedings of the 9th IEEE-RAS International Conference on Humanoid Robots, pp. 7–14, 2009.
    [6] HRP-4C. Available: http://en.wikipedia.org/wiki/HRP-4C
    [7] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
    [8] H. Bay, T. Tuytelaars, and L. V. Gool, “SURF: Speeded-up robust features,” in Proceedings of the European Conference on Computer Vision, pp. 404–417, 2006.
    [9] C. G. Harris and M. J. Stephens, “A combined corner and edge detector,” in Proceedings of the 4th Alvey Vision Conference, pp. 147–151, 1988.
    [10] E. Rosten and T. Drummond, “Faster and better: A machine learning approach to corner detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 1, pp. 105–119, 2010.
    [11] M. Z. Zia, M. Stark, and K. Schindler, “Explicit occlusion modeling for 3D object class representations,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3326–3333, 2013.
    [12] M. Z. Zia, M. Stark, B. Schiele, and K. Schindler, “Detailed 3D representations for object recognition and modeling,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2608–2623, 2013.
    [13] M. Xu, T. Ellis, S. Godsill, and G. Jones, “Visual tracking of partially observable targets with suboptimal filtering,” IET Computer Vision, vol. 5, pp. 1–13, 2011.
    [14] O. S. Gedik and A. A. Alatan, “3-D rigid body tracking using vision and depth sensors,” IEEE Transactions on Cybernetics, vol. 43, no. 5, pp. 1395–1405, 2013.
    [15] H. Grabner, M. Grabner, and H. Bischof, “Real-time tracking via on-line boosting,” in Proceedings of the British Machine Vision Conference, vol. 1, pp. 47–56, 2006.
    [16] H. Grabner, C. Leistner, and H. Bischof, “Semi-supervised online boosting for robust tracking,” in Proceedings of the European Conference on Computer Vision, 2008.
    [17] Q. Wang, F. Chen, J. Yang, W. Xu, and M. Yang, “Transferring visual prior for online object tracking,” IEEE Transactions on Image Processing, vol. 21, pp. 3296–3305, 2012.
    [18] Z. Kalal, J. Matas, and K. Mikolajczyk, “Tracking-learning-detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 7, pp. 1409–1422, 2012.
    [19] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
    [20] Wikipedia, “Artificial neural network.” Available:
    http://en.wikipedia.org/wiki/Artificial_neural_network
    [21] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, pp. 1942–1948, 1995.
    [22] D. Kahneman, Thinking, Fast and Slow. Penguin Group UK, 2012.
    [23] ROBOTIS. Available: http://www.robotis.com/xe
    [24] RODE. Available: http://www.rodemic.com/mics/videomicpro
    [25] Kinect for Windows. Available:
    http://www.microsoft.com/en-us/kinectforwindowsdev/default.aspx
    [26] SICK. Available:
    http://www.sick.com/group/EN/home/products/product_portfolio/laser_measurement_systems/Pages/indoor_laser_measurement_technology.aspx
    [27] S. Leutenegger, M. Chli, and R. Siegwart, “BRISK: Binary robust invariant scalable keypoints,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 2548–2555, 2011.
    [28] M. Calonder, V. Lepetit, and P. Fua, “BRIEF: Binary robust independent elementary features,” in Proceedings of the European Conference on Computer Vision, 2010.
    [29] Z. Kalal, K. Mikolajczyk, and J. Matas, “Forward-backward error: Automatic detection of tracking failures,” in Proceedings of the International Conference on Pattern Recognition, pp. 23–26, 2010.
    [30] B. D. Lucas and T. Kanade, “An iterative image registration technique with an application to stereo vision,” in Proceedings of the 7th International Joint Conference on Artificial Intelligence, pp. 674–679, 1981.
    [31] J. P. Lewis, “Fast normalized cross-correlation,” Vision Interface, vol. 10, no. 1, pp. 120–123, 1995.
    [32] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
    [33] Z. Kalal, J. Matas, and K. Mikolajczyk, “P-N learning: Bootstrapping binary classifiers by structural constraints,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2010.
    [34] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. I-511–I-518, 2001.
    [35] Point Cloud Image. Available:
    http://www.3dvia.com/studio/documentation/user-manual/3d-world/point-cloud-2
    [36] F. Li, R. Tang, C. Liu, and H. Yu, “A method for object reconstruction based on point-cloud data via 3D scanning,” in Proceedings of the 2010 International Conference on Audio, Language and Image Processing, pp. 302–306, 2010.
    [37] R. B. Rusu and S. Cousins, “3D is here: Point Cloud Library (PCL),” in Proceedings of the IEEE International Conference on Robotics and Automation, May 2011.
    [38] Viola–Jones object detection framework. Available: http://en.wikipedia.org/wiki/Viola%E2%80%93Jones_object_detection_framework

    Full-text availability: on campus from 2017-08-14; off campus: not available.
    The electronic thesis has not been authorized for public release; for the print copy, please consult the library catalog.