
Graduate Student: 陳質岩 (Chen, Chih-Yen)
Thesis Title (Chinese): 基於寬度學習之視線追蹤與手勢分類應用於機器人遙控介面
Thesis Title (English): Broad Learning System-Based Gaze and Gesture Classification for Telerobotics Interface
Advisor: 李祖聖 (Li, Tzuu-Hseng S.)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2018
Academic Year of Graduation: 106
Language: English
Number of Pages: 84
Keywords (Chinese): 寬度學習 (broad learning), 手勢辨識 (gesture recognition), 視覺追蹤 (gaze tracking), 人機互動 (human-robot interaction), 遙控機器人 (telerobotics)
Keywords (English): Broad Learning System, Gesture Recognition, Gaze Tracking, Human-Robot Interaction, Telerobotics
Abstract (Chinese, translated): This thesis designs a telerobotics interface that allows a user in a home environment to control a remote robot through hand gestures and gaze position. A camera mounted in front of the user captures the user's image for gesture recognition and gaze tracking, and the recognized gestures and gaze are mapped to different robot behaviors to control the robot's moving direction, arm motion, and grasping actions. To recognize the user's gestures accurately, a two-stage hybrid model is proposed: a convolutional-neural-network-style detector first locates the hand, and a Broad Learning System then classifies the different gestures. The Broad Learning System has a single-layer structure whose training is completed quickly through matrix operations; moreover, when the network structure needs to be expanded, nodes can be added without retraining. Because the user's gaze varies with the head rotation angle, facial landmarks are first located to construct a face model, from which the face normal vector and the eye regions are computed. The eye centers are then found using color gradients and the maximum accumulated inner product, the eye positions and gaze motion vectors are extracted, and another Broad Learning System learns the user's gaze fixation positions. Simulation results show that the Broad Learning System performs well compared with other algorithms. Through gestures and gaze position, the user can remotely control the robot base to move in the environment, control the robot's head rotation to change the viewing angle of the robot's camera and see the robot's field of view, and control the robot arm to move and grasp objects. Experimental results show that, by integrating these control functions, the robot can assist the user in completing daily tasks in an indoor environment.
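The eye-center computation described above (color gradients combined with a maximum accumulated inner product) follows the gradient-alignment objective of Timm and Barth: the center is the point whose displacement vectors to the surrounding pixels line up best with the image gradients. Below is a minimal grayscale sketch of that objective, not the thesis implementation; the strong-gradient threshold, the brute-force search over candidate centers, and the function name are illustrative assumptions, and the thesis additionally uses color information.

```python
import numpy as np

def eye_center_by_gradients(eye_gray):
    """Estimate the eye center as the point whose displacement directions
    to all strong-gradient pixels best align with those gradients
    (the "maximum accumulated inner product" idea)."""
    img = eye_gray.astype(float)
    gy, gx = np.gradient(img)               # image gradients (rows, cols)
    mag = np.hypot(gx, gy)
    mask = mag > mag.mean()                 # keep only strong gradients
    ys, xs = np.nonzero(mask)
    gxn = gx[mask] / mag[mask]              # normalized gradient directions
    gyn = gy[mask] / mag[mask]

    h, w = img.shape
    best_score, best_center = -1.0, (h // 2, w // 2)
    for cy in range(h):                     # brute-force search over candidates
        for cx in range(w):
            dx, dy = xs - cx, ys - cy
            norm = np.hypot(dx, dy)
            norm[norm == 0] = 1.0
            # inner product between displacement direction and gradient direction
            dot = (dx / norm) * gxn + (dy / norm) * gyn
            score = np.mean(np.maximum(dot, 0.0) ** 2)
            if score > best_score:
                best_score, best_center = score, (cy, cx)
    return best_center                      # (row, col) of the estimated center
```

In the full pipeline, such a search would be applied to the eye regions cropped out using the detected facial landmarks, and the resulting centers would feed the gaze-feature extraction.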

Abstract (English): This thesis proposes a telerobotics interface that allows a person to remotely control a service robot to accomplish daily tasks. The interface controls the robot by detecting and recognizing the user's gestures and gaze direction. A camera is placed in front of the person, and the robot's view is shown on a screen so that the person can control the robot intuitively without wearing any equipment. For gesture recognition, a two-stage hybrid learning system is proposed: in the first stage, the location of the hand is detected by a YOLO detector; in the second stage, the hand image is classified into different gestures by a Broad Learning System (BLS). For gaze-direction recognition, the facial landmarks are detected first, a 3D face model is built, and the head pose is estimated. Next, the location of the eye center is calculated and used to train another Broad Learning System. The BLS is a single-layer network whose weights are obtained by matrix operations, so its training time is very short. Simulations compare the performance of the BLS with other algorithms and show that the BLS achieves high accuracy. In addition to the simulations, three experimental scenarios were constructed to evaluate the effectiveness of the proposed telerobotics interface. The experimental results show that the robot can be easily remote-controlled by gesture and gaze direction and can successfully accomplish several daily tasks using this interface.
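The speed claim above (weights obtained "by matrix operations") can be illustrated with a minimal sketch of a Broad Learning System classifier, assuming NumPy and a one-hot label matrix Y. The node counts, tanh activations, Gaussian random weights, and ridge parameter lam are illustrative assumptions, not the thesis settings; the full BLS of Chen and Liu also fine-tunes the feature-node weights with a sparse autoencoder and grows nodes incrementally by updating the pseudoinverse rather than retraining.

```python
import numpy as np

def bls_train(X, Y, n_feature=100, n_enhance=500, lam=1e-3, seed=0):
    """Train a minimal Broad Learning System: random feature nodes,
    nonlinear enhancement nodes, and output weights obtained in closed
    form by ridge regression (one matrix solve, no back-propagation)."""
    rng = np.random.default_rng(seed)
    Wf = 0.1 * rng.standard_normal((X.shape[1], n_feature))
    Z = np.tanh(X @ Wf)                       # feature nodes
    We = 0.1 * rng.standard_normal((n_feature, n_enhance))
    H = np.tanh(Z @ We)                       # enhancement nodes
    A = np.hstack([Z, H])                     # the single broad layer
    # Ridge-regression solution for the output weights.
    W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
    return Wf, We, W

def bls_predict(X, Wf, We, W):
    Z = np.tanh(X @ Wf)
    H = np.tanh(Z @ We)
    return np.hstack([Z, H]) @ W              # class scores per sample
```

For gesture classification, X would correspond to the preprocessed hand crops produced by the YOLO stage (flattened to feature vectors); for gaze classification, to the extracted eye-position and gaze-motion features.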

Contents
Abstract I
Acknowledgment III
Contents IV
List of Figures VI
List of Tables VIII
List of Variables IX
Chapter 1. Introduction 1
  1.1 Motivation 1
  1.2 Related Work 2
  1.3 Thesis Organization 4
Chapter 2. System Overview 6
  2.1 Introduction 6
  2.2 System Overview of Telerobotics Interface for HRI 7
  2.3 The Process of the Gesture-Control HRI System 11
  2.4 The Process of the Gaze-Control HRI System 13
  2.5 Summary 15
Chapter 3. Method of Feature Extraction 16
  3.1 Introduction 16
  3.2 Feature Extraction for Hand 17
    3.2.1 Hand Detection Using YOLO 17
    3.2.2 Preprocessing of Hand Features 22
  3.3 Feature Extraction for Eye 24
    3.3.1 Face Feature Detection 25
    3.3.2 Head Pose Estimation 28
    3.3.3 Eye Center Location 30
    3.3.4 Feature Extraction and Tracking 33
    3.3.5 Architecture of Gaze and Head HRI System 35
  3.4 Summary 36
Chapter 4. BLS-Based Gaze and Gesture Classification 38
  4.1 Introduction 38
  4.2 Broad Learning System 39
  4.3 Results of Simulation and Comparisons 47
  4.4 Broad Learning System-Based HRI System 50
    4.4.1 BLS-Based Gaze Tracker Classification 50
    4.4.2 BLS-Based Gesture Classification 53
  4.5 Experimental Results of the BLS-Based HRI System 58
  4.6 Summary 59
Chapter 5. Experiments and Scenarios 61
  5.1 Introduction 61
  5.2 Experimental Environment Setting 62
  5.3 Experimental Results: Basic Functions 64
  5.4 Experimental Results: Indoor Scenarios 67
    5.4.1 Indoor Scenario One 68
    5.4.2 Indoor Scenario Two 71
  5.5 Experimental Results: Outdoor Scenarios 74
  5.6 Summary 76
Chapter 6. Conclusions and Future Works 77
  6.1 Conclusions 77
  6.2 Future Works 79
References 80


Full-text access: available on campus from 2023-06-01; not available off campus.
The electronic thesis has not been authorized for public release; please consult the library catalog for the print copy.