
Student: Chang, Chung-Yun (張中昀)
Title: Study on Vision-Based Deep Reinforcement Learning for Object Pick-and-Place Task with Manipulator (基於視覺之深度增強式學習應用於機械手臂物件吸取任務研究)
Advisor: Cheng, Ming-Yang (鄭銘揚)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2024
Academic Year of Graduation: 112 (ROC calendar)
Language: Chinese
Number of Pages: 133
Keywords: YOLOv7, K-Means, Soft Actor-Critic (SAC), Collaborative Robot Manipulators, Object Grasping

    With the arrival of the automation era, intelligent robots have become increasingly integrated into daily life. Many industry sectors have applied machine learning technologies in collaborative robots to perform tasks with high efficiency and operational flexibility. In light of such demands, this thesis develops a robotic object grasping system that is trained in a simulated environment using deep reinforcement learning. The system integrates computer vision-based object detection, object localization, and data preprocessing. It is trained on the CoppeliaSim simulation platform to autonomously learn object grasping, and the trained model is then applied in real-world scenarios. Data preprocessing consists of three steps: YOLOv7 is used for object detection and localization, the K-Means algorithm segments objects of interest from their background, and a threshold is set on the angle between the normal vector of each point on the object's surface and the Z-axis. This threshold filters the point cloud, retaining the points most amenable to successful suction as input to the deep reinforcement learning algorithm. The Soft Actor-Critic (SAC) method is employed to determine the suction point on the object. The input state is either the object's depth image or its 3D point cloud, with features extracted by a CNN or by a PointNet++-based network, respectively, and the resulting training performance is compared. Finally, the trained model is deployed on a UR5 collaborative six-axis robot manipulator, enabling it to achieve high suction success rates in multi-object pick-and-place and classification tasks. The results validate the effectiveness of the proposed method.
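The preprocessing described above ends with a geometric filter: only points whose surface normal lies within a threshold angle of the Z-axis are kept as candidate suction targets. A minimal NumPy sketch of that step is shown below, assuming unit normals have already been estimated (e.g., with a library such as Open3D, which appears in the references); the function name and the 30-degree threshold are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np

def filter_by_normal_angle(points, normals, max_angle_deg=30.0):
    """Keep points whose surface normal is within max_angle_deg of the
    world Z-axis, i.e., nearly upward-facing patches that a vertical
    suction cup is more likely to seal against."""
    # Normalize the normals, then take their dot product with the Z-axis.
    normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    cos_angle = normals @ np.array([0.0, 0.0, 1.0])
    # cos is monotonically decreasing on [0, pi], so angle <= threshold
    # is equivalent to cos(angle) >= cos(threshold).
    return points[cos_angle >= np.cos(np.deg2rad(max_angle_deg))]

# Two sample points: one facing straight up, one tilted 60 degrees.
pts = np.array([[0.0, 0.0, 0.1], [0.1, 0.0, 0.1]])
nrm = np.array([[0.0, 0.0, 1.0],
                [np.sin(np.deg2rad(60.0)), 0.0, np.cos(np.deg2rad(60.0))]])
kept = filter_by_normal_angle(pts, nrm)
# Only the upward-facing point passes the 30-degree threshold.
```

In the pipeline described in the abstract, the points surviving this filter would then form the state passed to the SAC decision network.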

    Chinese Abstract
    Extended Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    Symbols
    Abbreviations
    Chapter 1: Introduction
      1.1 Research Motivation and Objectives
      1.2 Literature Review
      1.3 Thesis Organization and Contributions
    Chapter 2: Data Preprocessing Based on YOLOv7 Recognition
      2.1 Object Detection and Localization
        2.1.1 The YOLOv7 Algorithm
      2.2 Foreground-Background Segmentation
        2.2.1 K-Means
        2.2.2 DBSCAN
        2.2.3 HDBSCAN
      2.3 Filtering by Object Surface Normal Vectors
      2.4 Chapter Summary
    Chapter 3: Object Suction Strategy Based on Deep Reinforcement Learning
      3.1 Deep Reinforcement Learning
      3.2 The SAC Algorithm
        3.2.1 Entropy Maximization
        3.2.2 Soft Policy Iteration
        3.2.3 Network Architecture
        3.2.4 Training Procedure
      3.3 Suction Decision-Making Design Based on Deep Reinforcement Learning
        3.3.1 2D Image Feature Extraction
        3.3.2 3D Feature Extraction
        3.3.3 SAC-Based Suction Decision Network Architecture Design
      3.4 Chapter Summary
    Chapter 4: Six-Axis Manipulator Kinematics and Hand-Eye Calibration
      4.1 Forward Kinematics
      4.2 Inverse Kinematics
      4.3 Hand-Eye Calibration
      4.4 Chapter Summary
    Chapter 5: Experimental Results and Analysis
      5.1 Experimental Setup
        5.1.1 Simulation Platform and Scene Configuration
        5.1.2 Experimental Equipment
        5.1.3 Real-World Experimental Scene
      5.2 Experimental Results
        5.2.1 YOLOv7 Training Results
        5.2.2 Foreground-Background Segmentation with Different Clustering Algorithms
        5.2.3 Normal Vector Threshold Filtering Results
        5.2.4 Training Methods and Results of the SAC-Based Suction Decision Model
        5.2.5 Hardware Experiment 1: Single-Object Suction
        5.2.6 Hardware Experiment 2: Specified-Object Suction
        5.2.7 Hardware Experiment 3: Multi-Object Classification
      5.3 Chapter Summary
    Chapter 6: Conclusions and Suggestions
      6.1 Conclusions
      6.2 Future Suggestions and Prospects
    References

    2023/2024 Industrial Technology White Paper, Department of Industrial Technology, Ministry of Economic Affairs, Taiwan, 2023.
    Bakery Kung, "Will AI Be Able to 'Predict Natural Disasters'? Six Key Takeaways from Jensen Huang's NTU Speech: The Future of Robotics Looks Promising!" Retrieved Jun. 19, 2024, from https://www.gq.com.tw/article/%E9%BB%83%E4%BB%81%E5%8B%B3-%E5%8F%B0%E5%A4%A7-ai%E6%87%89%E7%94%A8.
    吳凱中, "Techman Robot, a Subsidiary of Quanta Storage, Debuts NVIDIA Isaac Technology at COMPUTEX." Retrieved Jun. 19, 2024, from https://money.udn.com/money/story/5612/8006745.
    R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, Jun. 2014, pp. 580-587.
    R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, Sep. 2015, pp. 1440-1448.
    S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," Advances in Neural Information Processing Systems, pp. 91-99, 2015.
    W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot Multibox Detector," in Proceedings of the European Conference on Computer Vision, Cham, Switzerland, 2016, pp. 21-37.
    J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Jun. 2016, pp. 779-788.
    J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Jul. 2017, pp. 7263-7271.
    J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv:1804.02767, 2018.
    A. Bochkovskiy, C. Y. Wang, and H. Y. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv:2004.10934, 2020.
    C. Y. Wang, A. Bochkovskiy, and H. Y. M. Liao, "YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 2023, pp. 7464-7475.
    A. Ückermann, R. Haschke, and H. Ritter, "Real-Time 3D Segmentation of Cluttered Scenes for Robot Grasping," in Proceedings of the IEEE-RAS International Conference on Humanoid Robots, Osaka, Japan, 2012, pp. 198-203.
    Y. Yu, Z. Cao, S. Liang, Z. Liu, J. Yu, and X. Chen, "A Grasping CNN with Image Segmentation for Mobile Manipulating Robot," in Proceedings of the IEEE International Conference on Robotics and Biomimetics, Dali, China, 2019, pp. 1688-1692.
    S. M. Ahmed and C. M. Chew, "Density-Based Clustering for 3D Object Detection in Point Clouds," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 2020, pp. 10608-10617.
    Z. Ling, Y. Yao, X. Li, and H. Su, "On the Efficacy of 3D Point Cloud Reinforcement Learning," arXiv:2306.06799, 2023.
    林潔君, Study on Vision-Based Object Grasping for Industrial Robot Manipulators, Master's thesis, Department of Electrical Engineering, National Cheng Kung University, Taiwan, R.O.C., 2015.
    J. Mahler, F. T. Pokorny, B. Hou, M. Roderick, M. Laskey, M. Aubry, K. Kohlhoff, T. Kroger, J. Kuffner, and K. Goldberg, "Dex-Net 1.0: A Cloud-Based Network of 3D Objects for Robust Grasp Planning Using a Multi-Armed Bandit Model with Correlated Rewards," in Proceedings of the IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 2016, pp. 1957-1964.
    J. Mahler, J. Liang, S. Niyaz, M. Laskey, R. Doan, X. Liu, J. A. Ojea, and K. Goldberg, "Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics," arXiv:1703.09312, 2017.
    J. Mahler, M. Matl, X. Liu, A. Li, D. Gealy, and K. Goldberg, "Dex-Net 3.0: Computing Robust Vacuum Suction Grasp Targets in Point Clouds Using a New Analytic Model and Deep Learning," in Proceedings of the IEEE International Conference on Robotics and Automation, Brisbane, QLD, Australia, 2018, pp. 5620-5627.
    J. Mahler, M. Matl, V. Satish, M. Danielczuk, B. DeRose, S. McKinley, and K. Goldberg, "Learning Ambidextrous Robot Grasping Policies," Science Robotics, vol. 4, no. 26, 2019, Art. no. eaau4984.
    H. S. Fang, C. Wang, M. Gou, and C. Lu, "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 2020, pp. 11444-11453.
    C. C. Beltran-Hernandez, D. Petit, I. G. Ramirez-Alpizar, and K. Harada, "Variable Compliance Control for Robotic Peg-in-Hole Assembly: A Deep-Reinforcement-Learning Approach," Applied Sciences, vol. 10, no. 19, pp. 1-17, 2020.
    許譯云, Study on Path Planning and Autonomous Navigation for Mobile Robots Based on Deep Reinforcement Learning and the B-RRT* Algorithm, Master's thesis, Department of Electrical Engineering, National Cheng Kung University, Taiwan, R.O.C., 2022.
    D. Quillen, E. Jang, O. Nachum, C. Finn, J. Ibarz, and S. Levine, "Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods," in Proceedings of the IEEE International Conference on Robotics and Automation, Brisbane, Australia, 2018, pp. 6284-6291.
    A. A. Shahid, L. Roveda, D. Piga, and F. Braghin, "Learning Continuous Control Actions for Robotic Grasping with Reinforcement Learning," in Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, Toronto, ON, Canada, 2020, pp. 4066-4072.
    K. Rao, C. Harris, A. Irpan, S. Levine, J. Ibarz, and M. Khansari, "RL-CycleGAN: Reinforcement Learning Aware Simulation-to-Real," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11157-11166.
    D. Kalashnikov, J. Varley, Y. Chebotar, B. Swanson, R. Jonschkowski, C. Finn, S. Levine, and K. Hausman, "MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale," arXiv:2104.08212, 2021.
    Y. L. Chen, Y. R. Cai, and M. Y. Cheng, "Vision-Based Robotic Object Grasping—A Deep Reinforcement Learning Approach," Machines, vol. 11, no. 2, 2023, Art. no. 275.
    J. Li and D. J. Cappelleri, "Sim-Suction: Learning a Suction Grasp Policy for Cluttered Environments Using a Synthetic Benchmark," IEEE Transactions on Robotics, vol. 40, pp. 316-331, 2024.
    J. A. Hartigan and M. A. Wong, "Algorithm AS 136: A K-Means Clustering Algorithm," Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 28, no. 1, pp. 100-108, 1979.
    M. Ester, H. P. Kriegel, J. Sander, and X. W. Xu, "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise," in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, vol. 96, 1996, pp. 226-231.
    L. McInnes, J. Healy, and S. Astels, "HDBSCAN: Hierarchical Density Based Clustering," Journal of Open Source Software, vol. 2, no. 11, pp. 205, 2017.
    Q. Y. Zhou, J. Park, and V. Koltun, "Open3D: A Modern Library for 3D Data Processing," arXiv:1801.09847, 2018.
    L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement Learning: A Survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.
    T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor," in Proceedings of the International Conference on Machine Learning, 2018, pp. 1861-1870.
    T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel, and S. Levine, "Soft Actor-Critic Algorithms and Applications," arXiv:1812.05905, 2018.
    V. R. Konda and J. N. Tsitsiklis, "Actor-Critic Algorithms," Advances in Neural Information Processing Systems, vol. 12, pp. 1008-1014, 2000.
    J. Peters and S. Schaal, "Natural Actor-Critic," Neurocomputing, vol. 71, no. 7-9, pp. 1180-1190, 2008.
    T. Haarnoja, H. Tang, P. Abbeel, and S. Levine, "Reinforcement Learning with Deep Energy-Based Policies," in Proceedings of the 34th International Conference on Machine Learning, vol. 70, 2017, pp. 1352-1361.
    S. Fujimoto, H. Hoof, and D. Meger, "Addressing Function Approximation Error in Actor-Critic Methods," in Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 2018, pp. 1587-1596.
    Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-Based Learning Applied to Document Recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
    C. R. Qi, L. Yi, H. Su, and L. J. Guibas, "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space," Advances in Neural Information Processing Systems, vol. 30, pp. 5105-5114, 2017.
    R. Keating and N. J. Cowan, "UR5 Inverse Kinematics," unpublished, Johns Hopkins University, 2016. https://tianyusong.com/wp-content/uploads/2017/12/ur5_inverse_kinematics.pdf.
    K. Arun, T. Huang, and S. Blostein, "Least-Squares Fitting of Two 3-D Point Sets," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, no. 5, pp. 698-700, 1987.
    E. Rohmer, S. P. Singh, and M. Freese, "V-REP: A Versatile and Scalable Robot Simulation Framework," in Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 2013, pp. 1321-1326.
    陳亞伶, Study on Object Pick-and-Place Tasks for Industrial Robot Manipulators Based on Computer Vision and Deep Reinforcement Learning, Master's thesis, Department of Electrical Engineering, National Cheng Kung University, Taiwan, R.O.C., 2021.

    Full-text access: immediate open access, both on campus and off campus.