
Graduate Student: Tsai, Cheng-Yi (蔡丞益)
Thesis Title: Study on Human-Robot Interaction and Teaching based on Hand Gesture and Vision (基於手勢與視覺之人機互動與教導研究)
Advisor: Cheng, Ming-Yang (鄭銘揚)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2024
Graduation Academic Year: 112
Language: Chinese
Number of Pages: 102
Keywords: Dynamic Gesture Recognition, Deep Learning, Learning from Demonstration, Hand-Eye Calibration, Robotic Arm (動態手勢辨識、深度學習、示範教導、手眼校正、機械手臂)
  • Manufacturers around the world are gradually automating their factory production lines. In smart manufacturing and under "Industry 5.0," humans are being brought back into the factory so that the respective strengths of humans and robots can be combined, which makes human-robot collaboration an increasingly important topic; however, it also raises the possibility of collisions between operators and robots. In view of this, this thesis develops computer-vision-based recognition of operators' dynamic hand gestures and designs a dynamic gesture dataset tailored to robotic arm control, realizing intuitive, non-contact teaching of a robotic arm. A total of 18 dynamic gestures are designed to carry out the three teaching tasks proposed in this thesis: hand-eye calibration, point-to-point motion, and trajectory imitation. Based on these teaching tasks, the gestures corresponding to the translational and rotational motion of the robotic arm, the recording of feature points, and task execution are defined. Different models and image preprocessing methods are analyzed and compared, and experiments are conducted on each task of the teaching system; the results confirm the effectiveness, feasibility, and practical value of the visual teaching system formed by combining the proposed dynamic gesture recognition system and teaching system.
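
    The abstract above describes a pipeline in which operators' dynamic hand gestures are recognized from a video stream and then translated into robotic arm teaching commands. As a rough illustration of the recognition side only, the following is a minimal sketch of a sliding-window recognition loop with a simple motion-based trigger; the classifier stub, window length, and trigger threshold are hypothetical placeholders, not the thesis's actual models or parameters.

```python
# Minimal sketch of a sliding-window dynamic gesture recognition loop.
# GestureClassifier, the 18-class output, and TRIGGER_THRESHOLD are illustrative
# placeholders, not the thesis's actual implementation.
from collections import deque
import numpy as np

CLIP_LEN = 16              # frames fed to the classifier per prediction (assumed)
TRIGGER_THRESHOLD = 12.0   # mean absolute frame difference that "starts" recognition (assumed)

class GestureClassifier:
    """Stand-in for a trained CNN/RNN/Transformer gesture model (18 classes)."""
    def predict(self, clip: np.ndarray) -> int:
        # clip shape: (CLIP_LEN, H, W); a real model would return the class index.
        return int(clip.mean()) % 18   # dummy output for illustration only

def motion_energy(prev: np.ndarray, curr: np.ndarray) -> float:
    """Cheap trigger signal: mean absolute difference between consecutive frames."""
    return float(np.abs(curr.astype(np.int16) - prev.astype(np.int16)).mean())

def run(frames, model: GestureClassifier):
    """Consume a stream of grayscale frames and yield gesture labels."""
    buffer: deque = deque(maxlen=CLIP_LEN)
    prev = None
    for frame in frames:
        buffer.append(frame)
        triggered = prev is not None and motion_energy(prev, frame) > TRIGGER_THRESHOLD
        prev = frame
        # Only classify once the window is full and the trigger detects hand motion.
        if triggered and len(buffer) == CLIP_LEN:
            clip = np.stack(buffer)          # (CLIP_LEN, H, W)
            yield model.predict(clip)

if __name__ == "__main__":
    # Synthetic 64x64 grayscale stream standing in for a camera feed.
    stream = (np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(100))
    for label in run(stream, GestureClassifier()):
        print("recognized gesture id:", label)
```

    In the thesis itself, the classifier corresponds to one of the CNN, RNN, or Transformer models discussed in Chapter 3, and the trigger corresponds to the real-time recognition trigger of Section 2.3.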

    Manufacturers in various countries are gradually automating factory production lines. In the field of smart manufacturing and "Industry 5.0," humans are being welcomed back into factories so that the strengths of both humans and robots can be integrated; human-robot collaboration has therefore become an important issue. However, this approach also increases the possibility of collisions between operators and robots. In view of this risk, this thesis develops a system based on computer vision technology to recognize operators' dynamic hand gestures. It also designs a dynamic gesture dataset for robotic arm control, realizing intuitive, non-contact teaching tasks for a robotic arm.
    This thesis designs a total of 18 dynamic gestures to achieve three teaching tasks: hand-eye calibration, point-to-point motion, and trajectory imitation. Based on these teaching tasks, the gestures corresponding to the translational and rotational motion of the robotic arm, the recording of feature points, and task execution are defined. Different models and image preprocessing methods for the dynamic gesture recognition system are analyzed and discussed. Experiments are conducted on the various tasks of the proposed teaching system, and the results confirm the effectiveness, feasibility, and application value of the visual teaching system composed of the dynamic gesture recognition system and the teaching system proposed in this thesis.
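
    One of the three teaching tasks named above is hand-eye calibration, and the reference list cites Arun et al. [47] on least-squares fitting of two 3-D point sets. The sketch below shows that general SVD-based rigid-transform fit, which is one common way to relate corresponding points measured in a camera frame and a robot base frame; the synthetic data and frame roles are illustrative assumptions, not the exact calibration procedure of Section 4.2.

```python
# Minimal sketch of least-squares rigid-transform estimation between two 3-D point
# sets (the SVD method of Arun et al. [47]), as commonly used for hand-eye style
# calibration between a camera frame and a robot base frame. The sample points and
# frame roles are illustrative only.
import numpy as np

def fit_rigid_transform(P: np.ndarray, Q: np.ndarray):
    """Return rotation R and translation t minimizing ||R @ P_i + t - Q_i||^2.

    P, Q: (N, 3) arrays of corresponding points (e.g., camera frame vs. robot frame).
    """
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    # Reflection fix: ensure a proper rotation (det = +1).
    if np.linalg.det(R) < 0:
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = q_mean - R @ p_mean
    return R, t

if __name__ == "__main__":
    # Synthetic check: random points transformed by a known rotation and translation.
    rng = np.random.default_rng(0)
    P = rng.uniform(-1, 1, (10, 3))
    angle = np.deg2rad(30)
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                       [np.sin(angle),  np.cos(angle), 0],
                       [0, 0, 1]])
    t_true = np.array([0.1, -0.2, 0.3])
    Q = P @ R_true.T + t_true
    R_est, t_est = fit_rigid_transform(P, Q)
    print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```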

    Chinese Abstract; Extended Abstract; Acknowledgments; Table of Contents; List of Tables; List of Figures
    Chapter 1 Introduction: 1.1 Research Motivation and Objectives; 1.2 Literature Review; 1.3 Thesis Organization and Contributions
    Chapter 2 Dynamic Gesture Recognition System Architecture: 2.1 Dynamic Gesture Recognition System Framework; 2.2 Image Preprocessing; 2.2.1 Stacked Grayscale 3-channel Image; 2.2.2 Background Subtraction Image; 2.2.3 3D Fusion Image; 2.3 Real-Time Recognition Trigger; 2.4 Chapter Summary
    Chapter 3 Dynamic Gesture Recognition Models: 3.1 Dynamic Gesture Recognition with CNN Models; 3.1.1 2D CNN Architecture; 3.2 Dynamic Gesture Recognition with RNN Models; 3.2.1 Gated Recurrent Unit (GRU) Architecture; 3.2.2 Convolutional Recurrent Neural Network Architecture; 3.3 Dynamic Gesture Recognition with Transformer Models; 3.3.1 Vision Transformer Architecture; 3.3.2 Shifted Window (Swin) Transformer Architecture; 3.4 Chapter Summary
    Chapter 4 Teaching System Architecture: 4.1 Kinematic Model of the Robotic Arm; 4.1.1 Forward Kinematics [43]; 4.1.2 Inverse Kinematics [45][46]; 4.2 Hand-Eye Calibration; 4.3 Trajectory Imitation; 4.4 Dynamic Gesture Dataset and Teaching System Architecture; 4.4.1 Dynamic Gesture Design for Human-Robot Interaction; 4.4.2 Teaching System; 4.5 Chapter Summary
    Chapter 5 Experimental Results and Analysis: 5.1 Experimental Setup; 5.1.1 Experimental Equipment; 5.1.2 Experimental Scene; 5.2 Experimental Results; 5.2.1 Experiment 1: Recognition Results of Different Models on the Dynamic Gesture Dataset; 5.2.2 Experiment 2: Effect of Image Preprocessing on Model Performance; 5.2.3 Experiment 3: Real-Time Recognition Trigger Experiment; 5.2.4 Experiment 4: Robotic Arm Motion Experiment; 5.2.5 Experiment 5: Robotic Arm Trajectory Imitation Experiment; 5.3 Chapter Summary
    Chapter 6 Conclusions and Recommendations: 6.1 Conclusions; 6.2 Future Recommendations and Outlook
    References
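
    Section 2.2.1 above lists a "Stacked Grayscale 3-channel Image" preprocessing step. As a hedged illustration of that general idea (placing temporally spaced grayscale frames in the three channels of one image so a 2D CNN can see short-term motion, in the spirit of [24]), here is a minimal sketch; the frame stride and clip size are assumptions, not the thesis's exact settings.

```python
# Minimal sketch of a "stacked grayscale 3-channel image": three temporally spaced
# grayscale frames placed in the R/G/B channels of one image so that a 2D CNN can
# pick up short-term motion. The frame stride and image size are assumptions.
import numpy as np

def stack_grayscale_3channel(frames: np.ndarray, center: int, stride: int = 2) -> np.ndarray:
    """Build an (H, W, 3) image from grayscale frames at center-stride, center, center+stride.

    frames: (T, H, W) uint8 grayscale video clip.
    """
    t0 = max(center - stride, 0)
    t2 = min(center + stride, frames.shape[0] - 1)
    return np.stack([frames[t0], frames[center], frames[t2]], axis=-1)

if __name__ == "__main__":
    clip = np.random.randint(0, 256, (16, 112, 112), dtype=np.uint8)  # synthetic clip
    sg3i = stack_grayscale_3channel(clip, center=8)
    print(sg3i.shape)  # (112, 112, 3) -> can be fed to a standard 2D CNN
```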

    1. J. Leng, W. Sha, B. Wang, P. Zheng, C. Zhuang, Q. Liu, T. Wuest, D. Mourtzis, and L. Wang, “Industry 5.0: Prospect and retrospect,” Journal of Manufacturing Systems, vol. 65, pp. 279-295, Oct. 2022.
    2. H. Kaur and J. Rani, “A review: Study of various techniques of hand gesture recognition,” in Proceedings of 2016 IEEE 1st International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES), Delhi, India, 2016, pp. 1-5.
    3. A. Mohanty, S. Rambhatla, and R. Sahay, “Deep Gesture: Static Hand Gesture Recognition Using CNN,” in Proceedings of International Conference on Computer Vision and Image Processing: CVIP 2016, Uttarakhand, India, 2016, pp. 449-461.
    4. X. Yin and X. Zhu, “Hand Posture Recognition in Gesture-Based Human-Robot Interaction,” in Proceedings of 2006 1st IEEE Conference on Industrial Electronics and Applications, Singapore, 2006, pp. 1-6.
    5. N. Dardas and N. Georganas, “Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques,” IEEE Transactions on Instrumentation and Measurement, vol. 60, no. 11, pp. 3592-3607, Aug. 2011.
    6. M. Elmezain, A. Al-Hamadi, J. Appenrodt, and B. Michaelis, “A Hidden Markov Model-based Continuous Gesture Recognition System for Hand Motion Trajectory,” in Proceedings of 2008 19th International Conference on Pattern Recognition, FL, USA, 2008, pp. 1-4.
    7. G. Plouffe and A. Cretu, “Static and dynamic hand gesture recognition in depth data using dynamic time warping,” IEEE Transactions on Instrumentation and Measurement, vol. 65, no. 2, pp. 305-316, Feb. 2015.
    8. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, “Learning spatiotemporal features with 3D convolutional networks,” in Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 2015, pp. 4489-4497.
    9. 劉展憲, Study on Hand Tracking and Dynamic Hand Gesture Recognition Based on Convolutional Neural Networks and Recurrent Neural Networks, National Cheng Kung University, Department of Electrical Engineering, 2019 (in Chinese).
    10. K. Lai and S. Yanushkevich, “CNN+RNN depth and skeleton based dynamic hand gesture recognition,” in Proceedings of 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 2018, pp. 3451-3456.
    11. A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
    12. 林佩宜, Applying Image Preprocessing to Improve the Performance of Human Action Recognition Networks, Master's thesis, National Taiwan University, Department of Engineering Science and Ocean Engineering, 2022 (in Chinese).
    13. D. Sarikaya and J. Pierre, “Surgical gesture recognition with optical flow only,” arXiv preprint arXiv:1904.01143, 2019.
    14. Q. Gao, Y. Chen, Z. Ju, and Y. Liang, “Dynamic Hand Gesture Recognition Based on 3D Hand Pose Estimation for Human-Robot Interaction,” IEEE Sensors Journal, vol. 22, no. 18, pp. 17421-17430, Sep. 2022.
    15. O. Mazhar, B. Navarro, S. Ramdani, R. Passama, and A. Cherubini, “A real-time human-robot interaction framework with robust background invariant hand gesture detection,” Robotics and Computer-Integrated Manufacturing, vol. 60, pp. 34-48, Dec. 2019.
    16. V. Moysiadis, D. Katikaridis, L. Benos, P. Busato, A. Anagnostis, D. Kateris, S. Pearson, and D. Bochtis, “An integrated real-time hand gesture recognition framework for human-robot interaction in agriculture,” Applied Sciences, vol. 12, no. 16, p. 8160, Aug. 2022.
    17. D. Yongda, F. Li, and X. Huang, “Research on multimodal human-robot interaction based on speech and gesture,” Computers & Electrical Engineering, vol. 72, pp. 443-454, Nov. 2018.
    18. 張書菡, Hand Gesture-Based Motion Control of a Six-Axis Robotic Arm, Master's thesis, National Taiwan University of Science and Technology, Department of Mechanical Engineering, 2015 (in Chinese).
    19. 謝佩哲, Implementation of Collaborative Tasks for a Tracked Robot Equipped with a Robotic Arm, Master's thesis, National Taiwan Normal University, Department of Electrical Engineering, 2023 (in Chinese).
    20. 沈鈺琦, A Transfer Learning-Based Hand Gesture and Posture Recognition System for Human-Robot Collaboration, Master's thesis, National Chung Cheng University, Department of Mechanical Engineering, 2023 (in Chinese).
    21. O. Köpüklü, A. Gunduz, N. Kose, and G. Rigoll, “Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks,” in Proceedings of 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition, Lille, France, 2019, pp. 1-8.
    22. 林致遠, Real-Time Human Action Recognition Based on an Improved Convolutional LSTM Network and BERT, Master's thesis, National Taiwan University of Science and Technology, Department of Electrical Engineering, 2023 (in Chinese).
    23. K. Simonyan and A. Zisserman, “Two-stream Convolutional Networks for Action Recognition in Videos,” in Proceedings of Advances in Neural Information Processing Systems, Montréal, Canada, 2014.
    24. J.-H. Kim and C. S. Won, “Action Recognition in Videos Using Pre-Trained 2D Convolutional Neural Networks,” IEEE Access, vol. 8, pp. 60179-60188, 2020.
    25. A. Elgammal, R. Duraiswami, D. Harwood, and L. S. Davis, “Background and Foreground Modeling Using Nonparametric Kernel Density Estimation for Visual Surveillance,” Proceedings of the IEEE, vol. 90, no. 7, pp. 1151-1163, July 2002.
    26. 楊士賢, Study on Obstacle Avoidance for Industrial Robotic Arms Based on Computer Vision and Hazardous Energy Fields, National Cheng Kung University, Department of Electrical Engineering, 2020 (in Chinese).
    27. O. Köpüklü, N. Kose, and G. Rigoll, “Motion fused frames: Data level fusion strategy for hand gesture recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018.
    28. C. Lugaresi, J. Tang, H. Nash, C. McClanahan, E. Uboweja, M. Hays, F. Zhang, C. Chang, M. Yong, J. Lee, W. Chang, W. Hua, M. Georg, and M. Grundmann, “MediaPipe: A framework for perceiving and processing reality,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, 2019.
    29. F. Zhang, V. Bazarevsky, A. Vakunov, A. Tkachenka, G. Sung, C. Chang, and M. Grundmann, “MediaPipe Hands: On-device real-time hand tracking,” arXiv preprint arXiv:2006.10214, 2020.
    30. V. Bazarevsky, Y. Kartynnik, A. Vakunov, K. Raveendran, and M. Grundmann, “BlazeFace: Sub-millisecond neural face detection on mobile GPUs,” arXiv preprint arXiv:1907.05047, 2019.
    31. M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. Chen, “MobileNetV2: Inverted residuals and linear bottlenecks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018, pp. 4510-4520.
    32. D. McNeill and E. Levy, “Conceptual representations in language activity and gesture,” in Speech, Place, and Action, 1982, pp. 271-295.
    33. 胡瑞麟, Study on Multi-Camera Driver Distraction Behavior Recognition Based on Deep Learning and Temporal Localization, Master's thesis, National Yang Ming Chiao Tung University, Institute of Multimedia Engineering, 2023 (in Chinese).
    34. Y. Kong and Y. Fu, “Human action recognition and prediction: A survey,” International Journal of Computer Vision, vol. 130, no. 5, pp. 1366-1401, 2022.
    35. J. Lin, C. Gan, and S. Han, “TSM: Temporal shift module for efficient video understanding,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 2019, pp. 7083-7093.
    36. B. Shi, X. Bai, and C. Yao, “An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 11, pp. 2298-2304, Nov. 2016.
    37. Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin Transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021, pp. 10012-10022.
    38. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016, pp. 770-778.
    39. J. Elman, “Finding structure in time,” Cognitive Science, vol. 14, no. 2, pp. 179-211, Mar. 1990.
    40. J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” arXiv preprint arXiv:1412.3555, 2014.
    41. A. D’Eusanio, A. Simoni, S. Pini, G. Borghi, R. Vezzani, and R. Cucchiara, “A transformer-based network for dynamic hand gesture recognition,” in Proceedings of 2020 International Conference on 3D Vision (3DV), Fukuoka, Japan, 2020, pp. 623-632.
    42. Z. Liu, J. Ning, Y. Cao, Y. Wei, Z. Zhang, S. Lin, and H. Hu, “Video Swin Transformer,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022, pp. 3202-3211.
    43. M. Spong, S. Hutchinson, and M. Vidyasagar, “Forward kinematics: The Denavit-Hartenberg convention,” in Robot Dynamics and Control, USA: Wiley, 2004, pp. 57-82.
    44. M. Spong, S. Hutchinson, and M. Vidyasagar, Robot Modeling and Control. USA: Wiley, 2006.
    45. C. Lee and M. Ziegler, “Geometric approach in solving inverse kinematics of PUMA robots,” IEEE Transactions on Aerospace and Electronic Systems, vol. 20, no. 6, pp. 695-706, Nov. 1984.
    46. B. Siciliano, L. Sciavicco, L. Villani, and G. Oriolo, “Differential kinematics and statics,” in Robotics: Modelling, Planning and Control, UK: Springer London, 2009, pp. 105-160.
    47. K. Arun, T. Huang, and S. Blostein, “Least-squares fitting of two 3-D point sets,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, no. 5, pp. 698-700, Sep. 1987.
    48. S. Schaal, “Dynamic movement primitives - a framework for motor control in humans and humanoid robotics,” in Adaptive Motion of Animals and Machines, H. Kimura, K. Tsuchiya, A. Ishiguro, and H. Witte, Eds., Japan: Springer Tokyo, 2006, pp. 261-280.
    49. J. Wang, Y. Wang, W. Chen, and Q. Diao, “6D trajectories planning based on improved dynamic movement primitives,” Control Theory and Applications, vol. 39, no. 5, pp. 809-818, May 2022.
    50. H. Hoffmann, P. Pastor, D. Park, and S. Schaal, “Biologically-inspired dynamical systems for movement generation: Automatic real-time goal adaptation and obstacle avoidance,” in Proceedings of the 2009 IEEE international conference on robotics and automation, Kobe, Japan, 2009, pp. 2587-2592.
    51. S. Schaal and C. Atkeson, “Constructive incremental learning from only local information,” Neural Computation, vol. 10, no. 8, pp. 2047-2084, Nov. 1998.
    52. G. Du, M. Chen, C. Liu, B. Zhang, and P. Zhang, “Online robot teaching with natural human-robot interaction,” IEEE Transactions on Industrial Electronics, vol. 65, no. 12, pp. 9571-9581, Apr. 2018.
    53. A. Vysocký, T. Poštulka, J. Chlebek, T. Kot, J. Maslowski, and S. Grushko, “Hand Gesture Interface for Robot Path Definition in Collaborative Applications: Implementation and Comparative Study,” Sensors, vol. 23, no. 9, p. 4219, Apr. 2023.

    On campus: open access from 2029-07-05
    Off campus: open access from 2029-07-05
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.