
Author: 陳瑋泰 (Chen, Wei-Tai)
Title: Development of a Gesture Recognition Algorithm for Therapeutic Holding Robot (開發一套用於手持治療機器人之手勢辨識演算法)
Advisor: Su, Fong-Chin (蘇芳慶)
Degree: Master
Department: Department of BioMedical Engineering, College of Engineering
Year of Publication: 2020
Academic Year: 108 (2019-2020)
Language: English
Pages: 85
Keywords: Machine Learning, Deep Learning, Hand Gesture Pattern, Gesture Recognition

    According to the World Health Organization's report, the percentage of older adults has increased significantly in recent years, and maintaining their physical and mental health is essential for future policy planning. Many recent studies have found that companion robots have beneficial effects on the physical and mental health of older adults, because such robots can offer various responses to stimulation from humans. Recognizing these stimulations is thus the first step toward human-robot interaction. One example is tactile interaction, which is regarded as the preferred channel for communicating intimate emotions. This study therefore focused on recognizing hand gestures of social touch in humans using machine learning and deep learning.
    Machine learning and deep learning, powerful tools for classification, have been widely applied to recognizing types of social touch gestures. In this study, five algorithms were compared on hand gesture recognition: support vector machine (SVM), random forest (RF), and three convolutional neural networks, one-dimensional (1D-CNN), two-dimensional (2D-CNN), and three-dimensional (3D-CNN). These models recognized six types of social touch gestures: pat, stroke, grab, poke, scratch, and no touch. The dataset included 17,716 samples of these six gestures. All gestures were performed in two postures, stationary and holding, on a pressure-mapping sensor mat attached to a cylinder-shaped companion robot simulator.
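    To make the input representation concrete, the sketch below builds a minimal 3D-CNN of the kind named above with TensorFlow/Keras (both cited in the thesis's references), treating each gesture sample as a spatiotemporal volume of pressure frames. The 32-frame window, 8x8 sensor grid, and layer choices are illustrative assumptions, not the architecture reported in the thesis.

        # Minimal 3D-CNN sketch for touch gesture classification.
        # Dimensions below are assumptions for illustration only.
        import tensorflow as tf
        from tensorflow.keras import layers

        N_FRAMES, GRID_H, GRID_W, N_CLASSES = 32, 8, 8, 6  # assumed window, grid, 6 gestures

        inputs = tf.keras.Input(shape=(N_FRAMES, GRID_H, GRID_W, 1))   # frames x H x W x 1
        x = layers.Conv3D(16, kernel_size=3, padding="same", activation="relu")(inputs)
        x = layers.MaxPooling3D(pool_size=2)(x)                        # downsample time and space
        x = layers.Conv3D(32, kernel_size=3, padding="same", activation="relu")(x)
        x = layers.GlobalAveragePooling3D()(x)                         # collapse volume to a vector
        outputs = layers.Dense(N_CLASSES, activation="softmax")(x)     # one score per gesture
        model = tf.keras.Model(inputs, outputs)
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])

    A 1D-CNN or 2D-CNN variant would consume the same data at reduced dimensionality, for example convolving along time over flattened frames (1D) or spatially over individual frames (2D).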
    Ten-fold cross-validation was used to evaluate the performance of all models. The final accuracies for SVM, RF, 1D-CNN, 2D-CNN, and 3D-CNN were 16.76%, 49.37%, 70.51%, 70.46%, and 75.78%, respectively. The results indicate that these models can classify hand gestures from pressure data. Future work is needed to increase accuracy, either by enlarging the dataset or by using higher-resolution pressure sensors. Furthermore, the relationship between hand gestures and emotional states should also be investigated.
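    The evaluation protocol can be sketched with scikit-learn (also cited in the thesis's references); the stratified splitter, random forest, and placeholder data here are stand-ins for illustration, and the reported figure corresponds to the mean accuracy over the ten held-out folds.

        # Ten-fold cross-validation sketch on placeholder data.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import StratifiedKFold

        rng = np.random.default_rng(0)
        X = rng.random((600, 32 * 8 * 8))   # placeholder flattened pressure windows
        y = rng.integers(0, 6, size=600)    # six gesture labels

        fold_acc = []
        cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
        for train_idx, test_idx in cv.split(X, y):
            clf = RandomForestClassifier(n_estimators=100, random_state=0)
            clf.fit(X[train_idx], y[train_idx])
            fold_acc.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))

        print(f"mean 10-fold accuracy: {np.mean(fold_acc):.4f}")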

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1.1 Aging Population Problems
      1.2 Animal-Assisted Therapy and Activities
      1.3 Robotic Pets in Healthcare
      1.4 How Robots Interact with Humans
      1.5 State-of-the-Art on Gesture Recognition
        1.5.1 Gesture Recognition with Machine Learning
        1.5.2 Gesture Recognition with Deep Learning
      1.6 Motivation
      1.7 Research Questions and Hypotheses
    Chapter 2  Materials and Methods
      2.1 Pressure Sensor and Microcontroller Board
      2.2 Data Acquisition
        2.2.1 Experiment Setup
        2.2.2 Subjects and Gestures
      2.3 Pre-processing and Feature Extraction
      2.4 Machine Learning Models
        2.4.1 Support Vector Machine (SVM)
        2.4.2 Decision Tree and Random Forest
      2.5 Deep Learning Models
        2.5.1 Convolutional Neural Networks
        2.5.2 Model Architecture
      2.6 Training Strategy
      2.7 K-Fold Cross-Validation
    Chapter 3  Results
      3.1 Model Performance for Different Postures
        3.1.1 Recognition Accuracy
        3.1.2 Loss and Accuracy Curves
        3.1.3 Confusion Matrices
      3.2 Inter-Subjects' Model Performance
        3.2.1 Recognition Accuracy
        3.2.2 Loss and Accuracy Curves
        3.2.3 Confusion Matrices
    Chapter 4  Discussion
      4.1 Influence of Different Postures on Model's Performance
      4.2 Gesture Movement Patterns
      4.3 Robust Algorithms
      4.4 Model Comparison
        4.4.1 Machine Learning Models
        4.4.2 Deep Learning Models
        4.4.3 Model Comparison
        4.4.4 Comparison of Algorithms and Humans
    Chapter 5  Conclusion
    References

    [1] United Nations, "World Population Prospects 2019," 2019.
    [2] World Health Organization, World report on ageing and health. World Health Organization, 2015.
    [3] L. Grenade and D. Boldy, "Social isolation and loneliness among older people: issues and future challenges in community and residential settings," Australian Health Review, vol. 32, no. 3, pp. 468-478, 2008.
    [4] U.S. Department of Health and Human Services. [Online]. Available: https://www.womenshealth.gov/mental-health/good-mental-health/good-mental-health-every-age.
    [5] T. F. Garrity, L. F. Stallones, M. B. Marx, and T. P. Johnson, "Pet ownership and attachment as supportive factors in the health of the elderly," Anthrozoös, vol. 3, no. 1, pp. 35-44, 1989.
    [6] D. Lago, M. Delaney, M. Miller, and C. Grill, "Companion animals, attitudes toward pets, and health outcomes among the elderly: A long-term follow-up," Anthrozoös, vol. 3, no. 1, pp. 25-34, 1989.
    [7] J. Gammonley and J. Yates, "Pet projects: Animal assisted therapy in nursing homes," Journal of Gerontological Nursing, vol. 17, no. 1, pp. 12-15, 1991.
    [8] Delta Society, "Standards of practice for animal assisted activities and animal assisted therapy," Delta Society, Renton, WA, 1996.
    [9] D. Feil-Seifer and M. J. Matarić, "Socially assistive robotics," IEEE Robotics & Automation Magazine, vol. 18, no. 1, pp. 24-31, 2011.
    [10] T. Shibata, "An overview of human interactive robots for psychological enrichment," Proceedings of the IEEE, vol. 92, no. 11, pp. 1749-1758, 2004.
    [11] S. T. Hansen, H. J. Andersen, and T. Bak, "Practical evaluation of robots for elderly in Denmark—an overview," in 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2010: IEEE, pp. 149-150.
    [12] L. Odetti et al., "Preliminary experiments on the acceptability of animaloid companion robots by older people with early dementia," in 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2007: IEEE, pp. 1816-1819.
    [13] K. Wada, T. Shibata, K. Sakamoto, and K. Tanie, "Quantitative analysis of utterance of elderly people in long-term robot assisted activity," in ROMAN 2005. IEEE International Workshop on Robot and Human Interactive Communication, 2005., 2005: IEEE, pp. 267-272.
    [14] G. Huisman, "Social touch technology: a survey of haptic technology for social touch," IEEE Transactions on Haptics, vol. 10, no. 3, pp. 391-408, 2017.
    [15] M. J. Hertenstein, J. M. Verkamp, A. M. Kerestes, and R. M. Holmes, "The communicative functions of touch in humans, nonhuman primates, and rats: a review and synthesis of the empirical research," Genetic, Social, and General Psychology Monographs, vol. 132, no. 1, pp. 5-94, 2006.
    [16] D. Silvera-Tawil, D. Rye, and M. Velonaki, "Artificial skin and tactile sensing for socially interactive robots: A review," Robotics and Autonomous Systems, vol. 63, pp. 230-243, 2015.
    [17] B. App, D. N. McIntosh, C. L. Reed, and M. J. Hertenstein, "Nonverbal channel use in communication of emotion: How may depend on why," Emotion, vol. 11, no. 3, p. 603, 2011.
    [18] N.-E. Ayat, M. Cheriet, L. Remaki, and C. Y. Suen, "KMOD-A new support vector machine kernel with moderate decreasing for pattern recognition. Application to digit image recognition," in Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001: IEEE, pp. 1215-1219.
    [19] L. Deng and X. Li, "Machine learning paradigms for speech recognition: An overview," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 5, pp. 1060-1089, 2013.
    [20] Y.-M. Kim, S.-Y. Koo, J. G. Lim, and D.-S. Kwon, "A robust online touch pattern recognition for dynamic human-robot interaction," IEEE Transactions on Consumer Electronics, vol. 56, no. 3, pp. 1979-1987, 2010.
    [21] K. Altun and K. E. MacLean, "Recognizing affect in human touch of a robot," Pattern Recognition Letters, vol. 66, pp. 31-40, 2015.
    [22] A. Flagg and K. MacLean, "Affective touch gesture recognition for a furry zoomorphic machine," in Proceedings of the 7th International Conference on Tangible, Embedded and Embodied Interaction, 2013, pp. 25-32.
    [23] S. Albawi, O. Bayat, S. Al-Azawi, and O. N. Ucan, "Social touch gesture recognition using convolutional neural network," Computational Intelligence and Neuroscience, vol. 2018, 2018.
    [24] D. Hughes, A. Krauthammer, and N. Correll, "Recognizing social touch gestures using recurrent and convolutional neural networks," in 2017 IEEE International Conference on Robotics and Automation (ICRA), 2017: IEEE, pp. 2315-2321.
    [25] J. Sun, S. Redyuk, E. Billing, D. Högberg, and P. Hemeren, "Tactile interaction and social touch: Classifying human touch using a soft tactile sensor," in Proceedings of the 5th International Conference on Human Agent Interaction, 2017, pp. 523-526.
    [26] D. Singh, E. Merdivan, S. Hanke, J. Kropf, M. Geist, and A. Holzinger, "Convolutional and recurrent neural networks for activity recognition in smart environment," in Towards integrative machine learning and knowledge extraction: Springer, 2017, pp. 194-205.
    [27] J.-H. Park, J.-H. Seo, Y.-H. Nho, and D.-S. Kwon, "Touch Gesture Recognition System based on 1D Convolutional Neural Network with Two Touch Sensor Orientation Settings," in 2019 16th International Conference on Ubiquitous Robots (UR), 2019: IEEE, pp. 65-70.
    [28] H. J. Ku, J. J. Choi, S. Jang, W. Do, S. Lee, and S. Seok, "Online Social Touch Pattern Recognition with Multi-modal-sensing Modular Tactile Interface," in 2019 16th International Conference on Ubiquitous Robots (UR), 2019: IEEE, pp. 271-277.
    [29] N. Zhou and J. Du, "Recognition of social touch gestures using 3D convolutional neural networks," in Chinese Conference on Pattern Recognition, 2016: Springer, pp. 164-173.
    [30] R. Pascanu, T. Mikolov, and Y. Bengio, "On the difficulty of training recurrent neural networks," in International Conference on Machine Learning, 2013, pp. 1310-1318.
    [31] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, "Learning spatiotemporal features with 3D convolutional networks," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 4489-4497.
    [32] B. Shi, X. Bai, and C. Yao, "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition," arXiv preprint arXiv:1507.05717, 2015.
    [33] H. Iwata and S. Sugano, "Human-robot-contact-state identification based on tactile recognition," IEEE Transactions on Industrial Electronics, vol. 52, no. 6, pp. 1468-1477, 2005.
    [34] M. Kaboli, A. Long, and G. Cheng, "Humanoids learn touch modalities identification via multi-modal robotic skin and robust tactile descriptors," Advanced Robotics, vol. 29, no. 21, pp. 1411-1425, 2015.
    [35] F. Naya, J. Yamato, and K. Shinozawa, "Recognizing human touching behaviors using a haptic interface for a pet-robot," in IEEE SMC'99 Conference Proceedings. 1999 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No. 99CH37028), 1999, vol. 2: IEEE, pp. 1030-1034.
    [36] S. van Wingerden, T. J. Uebbing, M. M. Jung, and M. Poel, "A neural network based approach to social touch classification," in Proceedings of the 2014 workshop on Emotion Representation and Modelling in Human-Computer-Interaction-Systems, 2014, pp. 7-12.
    [37] B. Zhou et al., "Textile pressure mapping sensor for emotional touch detection in human-robot interaction," Sensors, vol. 17, no. 11, p. 2585, 2017.
    [38] M. D. Cooney, S. Nishio, and H. Ishiguro, "Recognizing affection for a touch-based interaction with a humanoid robot," in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012: IEEE, pp. 1420-1427.
    [39] M. M. Jung, "Towards social touch intelligence: developing a robust system for automatic touch recognition," in Proceedings of the 16th International Conference on Multimodal Interaction, 2014, pp. 344-348.
    [40] M. M. Jung, M. Poel, R. Poppe, and D. K. Heylen, "Automatic recognition of touch gestures in the corpus of social touch," Journal on Multimodal User Interfaces, vol. 11, no. 1, pp. 81-96, 2017.
    [41] M. M. Jung, R. Poppe, M. Poel, and D. K. Heylen, "Touching the void – introducing CoST: corpus of social touch," in Proceedings of the 16th International Conference on Multimodal Interaction, 2014, pp. 120-127.
    [42] W. D. Stiehl and C. Breazeal, "Affective touch for robotic companions," in International Conference on Affective Computing and Intelligent Interaction, 2005: Springer, pp. 747-754.
    [43] S. Yohanan and K. E. MacLean, "The role of affective touch in human-robot interaction: Human intent and expectations in touching the haptic creature," International Journal of Social Robotics, vol. 4, no. 2, pp. 163-180, 2012.
    [44] M. J. Hertenstein, R. Holmes, M. McCullough, and D. Keltner, "The communication of emotion via touch," Emotion, vol. 9, no. 4, p. 566, 2009.
    [45] F. Pedregosa et al., "Scikit-learn: Machine learning in Python," Journal of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.
    [46] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, "A practical guide to support vector classification," Technical Report, Department of Computer Science, National Taiwan University, Taipei, 2003.
    [47] Wikipedia contributors. (22 March 2020). Decision tree learning [Online]. Available: https://en.wikipedia.org/w/index.php?title=Decision_tree_learning&oldid=940671669.
    [48] F. Chollet, Deep Learning mit Python und Keras: Das Praxis-Handbuch vom Entwickler der Keras-Bibliothek. MITP-Verlags GmbH & Co. KG, 2018.
    [49] T. Huang. Convolutional Neural Networks (CNN): Convolution and Pooling Operations (in Chinese) [Online]. Available: https://medium.com/@chih.sheng.huang821/%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF-convolutional-neural-network-cnn-%E5%8D%B7%E7%A9%8D%E9%81%8B%E7%AE%97-%E6%B1%A0%E5%8C%96%E9%81%8B%E7%AE%97-856330c2b703.
    [50] S. Verma. Understanding 1D and 3D Convolution Neural Network | Keras [Online]. Available: https://towardsdatascience.com/understanding-1d-and-3d-convolution-neural-network-keras-9d8f76e29610.
    [51] M. Lin, Q. Chen, and S. Yan, "Network in network," arXiv preprint arXiv:1312.4400, 2013.
    [52] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," arXiv preprint arXiv:1502.03167, 2015.
    [53] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
    [54] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1-9.
    [55] H. Li, Z. Xu, G. Taylor, C. Studer, and T. Goldstein, "Visualizing the loss landscape of neural nets," in Advances in Neural Information Processing Systems, 2018, pp. 6389-6399.
    [56] A. Mohan. Loss Visualization [Online]. Available: http://www.telesens.co/loss-landscape-viz/viewer.html.
    [57] V. Sze, Y.-H. Chen, T.-J. Yang, and J. S. Emer, "Efficient processing of deep neural networks: A tutorial and survey," Proceedings of the IEEE, vol. 105, no. 12, pp. 2295-2329, 2017.
    [58] M. Abadi et al., "TensorFlow: A system for large-scale machine learning," in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265-283.
    [59] L. Liu et al., "On the variance of the adaptive learning rate and beyond," arXiv preprint arXiv:1908.03265, 2019.
    [60] M. Zhang, J. Lucas, J. Ba, and G. E. Hinton, "Lookahead Optimizer: k steps forward, 1 step back," in Advances in Neural Information Processing Systems, 2019, pp. 9593-9604.
    [61] P. Ramachandran, B. Zoph, and Q. V. Le, "Searching for activation functions," arXiv preprint arXiv:1710.05941, 2017.
    [62] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1026-1034.
    [63] T. Huang. (2018). Cross-Validation (CV) (in Chinese) [Online]. Available: https://medium.com/@chih.sheng.huang821/%E4%BA%A4%E5%8F%89%E9%A9%97%E8%AD%89-cross-validation-cv-3b2c714b18db.
    [64] R. Zhang, "Making convolutional networks shift-invariant again," arXiv preprint arXiv:1904.11486, 2019.
    [65] G. Cheng, P. Zhou, and J. Han, "RIFD-CNN: Rotation-invariant and Fisher discriminative convolutional neural networks for object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2884-2893.
    [66] G. Cheng, J. Han, P. Zhou, and D. Xu, "Learning rotation-invariant and fisher discriminative convolutional neural networks for object detection," IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 265-278, 2018.
    [67] H. A. Elfenbein and N. Ambady, "On the universality and cultural specificity of emotion recognition: a meta-analysis," Psychological Bulletin, vol. 128, no. 2, p. 203, 2002.
    [68] K. R. Scherer, T. Johnstone, and G. Klasmeyer, Vocal expression of emotion. Oxford University Press, 2003.

    Full-text release: on campus 2022-10-12; off campus 2022-10-12.