Graduate Student: 蔡丞益 (Tsai, Cheng-Yi)
Thesis Title: 基於手勢與視覺之人機互動與教導研究 (Study on Human-Robot Interaction and Teaching based on Hand Gesture and Vision)
Advisor: 鄭銘揚 (Cheng, Ming-Yang)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science (電機資訊學院電機工程學系)
Year of Publication: 2024
Academic Year of Graduation: 112
Language: Chinese
Number of Pages: 102
Keywords (Chinese): 動態手勢辨識、深度學習、示範教導、手眼校正、機械手臂
Keywords (English): Dynamic Gesture Recognition, Deep Learning, Learning from Demonstration, Hand-Eye Calibration, Robotic Arm
各國的製造業者正逐步將工廠生產線自動化,在智慧製造的領域與「工業5.0」中,重新將人引入工廠中以整合人與機器人彼此的優勢,因此人機協作逐漸成為重要的議題,然而此舉同樣會增加操作員與機器人發生碰撞的可能性。有鑑於此,本論文發展基於電腦視覺技術辨識操作員的動態手勢,並設計一套針對機械手臂控制的動態手勢數據集,實現機械手臂的直覺性非接觸式教導任務之功能。本論文總共設計18種動態手勢,實現本論文所設計的三種教導任務:手眼校正、點對點運動、軌跡模仿,並基於教導任務定義機械手臂的平移與旋轉運動、紀錄特徵點、執行任務的對應手勢。本論文針對不同模型與影像前處理進行分析與探討,並對本論文的教導系統的各項任務進行實驗,而實驗結果證實了本論文所提出之動態手勢辨識系統與教導系統組合成的視覺教導系統的有效性、可行性與應用價值。
Manufacturers in various countries are gradually automating their factory production lines. In the fields of smart manufacturing and "Industry 5.0," humans are being brought back into the factory to combine the respective strengths of humans and robots, so human-robot collaboration has become an important issue. However, this also increases the possibility of collisions between operators and robots. In view of this risk, this thesis develops a computer-vision-based system that recognizes operators' dynamic hand gestures and designs a dynamic gesture dataset tailored to robotic arm control, realizing intuitive, non-contact teaching of a robotic arm.
A total of 18 dynamic gestures are designed to accomplish the three teaching tasks proposed in this thesis: hand-eye calibration, point-to-point motion, and trajectory imitation. Based on these teaching tasks, gestures are defined for the translation and rotation of the robotic arm, the recording of feature points, and the execution of each task. Different models and image preprocessing methods for the dynamic gesture recognition system are analyzed and discussed, and experiments are conducted on the various tasks of the proposed teaching system. The experimental results confirm the effectiveness, feasibility, and practical value of the visual teaching system composed of the dynamic gesture recognition system and the teaching system.
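Hand-eye calibration, the first of the three teaching tasks, amounts to finding the rigid transform that relates points measured in the camera frame to the same points expressed in the robot's base frame. As a rough sketch of one common way to compute such a transform (the SVD-based least-squares fit of Arun, Huang, and Blostein), and not necessarily the procedure used in this thesis, the Python example below estimates a rotation and translation from corresponding 3-D point pairs; the function name and the synthetic test data are assumptions introduced only for illustration.

```python
import numpy as np

def estimate_rigid_transform(cam_pts, robot_pts):
    """Least-squares rigid transform (R, t) mapping camera-frame points to
    robot-frame points via SVD (Arun-style fit of two 3-D point sets).
    cam_pts, robot_pts: (N, 3) arrays of corresponding 3-D points."""
    cam_pts = np.asarray(cam_pts, dtype=float)
    robot_pts = np.asarray(robot_pts, dtype=float)

    # Center both point sets on their centroids.
    c_cam = cam_pts.mean(axis=0)
    c_rob = robot_pts.mean(axis=0)
    P = cam_pts - c_cam
    Q = robot_pts - c_rob

    # The SVD of the cross-covariance matrix gives the optimal rotation.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    # Guard against a reflection (det = -1) in degenerate configurations.
    if np.linalg.det(R) < 0:
        Vt[-1, :] *= -1
        R = Vt.T @ U.T

    t = c_rob - R @ c_cam
    return R, t

if __name__ == "__main__":
    # Synthetic check: recover a known rotation about z and a translation.
    rng = np.random.default_rng(0)
    pts = rng.uniform(-0.3, 0.3, size=(10, 3))      # camera-frame points (m)
    theta = np.deg2rad(30.0)
    R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                       [np.sin(theta),  np.cos(theta), 0.0],
                       [0.0, 0.0, 1.0]])
    t_true = np.array([0.5, -0.2, 0.1])
    robot = pts @ R_true.T + t_true                  # robot-frame points
    R_est, t_est = estimate_rigid_transform(pts, robot)
    print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

In practice the corresponding point pairs would come from feature points recorded during the teaching procedure (for example, a marker seen by the camera while the end-effector visits known poses); with noiseless synthetic data as above, the recovered rotation and translation match the ground truth to numerical precision.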