
Student: Lin, Chao (林超)
Title: Object Classification and Pick Angle Estimation Using Deep Convolutional Neural Networks for Robot Arm Operation
Advisor: Lien, Jenn-Jier James
Co-Advisor: Guo, Shu-Mei
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2018
Graduation Academic Year: 106 (2017-2018)
Language: English
Pages: 87
Keywords: Manipulator, Manipulator Operation, Deep Convolutional Neural Network, Cascaded Deep Convolutional Neural Network, Visual Positioning, Visual Classification
Abstract: In the field of robotics, object manipulation is a classic problem. Manipulating an object correctly requires determining the object's position, a suitable grasp angle, and the category the object belongs to. Traditional computer-vision algorithms suffer from poor generality, low robustness, and high computational cost, and cannot meet the demands of manipulating diverse, complex objects. To address these problems, this thesis proposes a general pick-and-place framework based on deep convolutional neural networks. The framework comprises three parts: automatic data collection, network training, and network testing, and completes object-manipulation tasks relatively efficiently. Data collection for a single object takes about 2.5 hours, and network training takes about 6 hours. In testing, the trained model achieves an overall classification accuracy of 100% and a grasp accuracy of 94.8%, and both can improve further as the amount of automatically collected data grows. In addition, for the test stage the thesis proposes a cascaded network architecture that accelerates inference, reducing the time of one effective operation from 1.53 seconds with a single network to 0.65 seconds without degrading grasp or classification accuracy.
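To make the cascade idea concrete, here is a minimal PyTorch sketch of two-stage inference, assuming the first stage is a lightweight MobileNetV2-style screener that rejects unpromising grasp candidates and the second stage is the heavier VGG16-based network that scores only the survivors. The 18 angle bins, the 0.5 keep-threshold, and the candidate-patch interface are illustrative assumptions, not the thesis's exact design (Chapter 4 defines the actual cascade).

```python
import torch
from torchvision import models

# Hypothetical two-stage cascade: a cheap screener rejects most candidate
# grasp patches so the expensive network runs on only a few survivors.
fast_net = models.mobilenet_v2(num_classes=2)   # graspable vs. not (assumed head)
slow_net = models.vgg16(num_classes=18)         # 18 grasp-angle bins (assumed head)

@torch.no_grad()
def cascade_predict(patches, keep_threshold=0.5):
    """patches: (N, 3, 224, 224) tensor of candidate grasp patches."""
    fast_net.eval()
    slow_net.eval()
    # Stage 1: one cheap forward pass over every candidate.
    keep_prob = torch.softmax(fast_net(patches), dim=1)[:, 1]
    survivors = patches[keep_prob > keep_threshold]
    if survivors.shape[0] == 0:
        return None  # nothing survived the screening stage
    # Stage 2: the heavy network scores only the surviving patches.
    return slow_net(survivors).argmax(dim=1)  # best angle bin per survivor
```

Because the screener is far cheaper per patch than the full network, running the heavy model on only a small fraction of candidates is what makes a reduction on the order of 1.53 s to 0.65 s plausible without changing the predictions for patches that reach the second stage.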

Table of Contents:
Abstract (Chinese) IV
Abstract (English) V
Acknowledgements VI
List of Figures X
List of Tables XIII
Chapter 1 Introduction 1
  1.1 Motivation 1
  1.2 Related Works 2
  1.3 Contribution 5
  2.1 System Setup and Framework 8
  2.2 CNN-Based Visual Sub-System 11
  2.3 Robot Arm Sub-System 16
  2.4 Self-Calibration between Robot Arm and Visual 19
  2.5 Data Collection and Data Augmentation 21
Chapter 3 Object Pick and Classification Using Self-Supervised Learning Network 26
  3.1 Self-Supervised Network Architecture by Using VGG16 28
  3.2 Modified VGG16 Training Process Using Gradient Descent 34
  3.3 Modified VGG16 Test Process for Pick and Classification 41
Chapter 4 Speedup of Self-Supervised Learning Network by Cascaded MobileNetV2 45
  4.1 MobileNetV2 Network Architecture 48
  4.2 MobileNetV2 Training Process Using Gradient Descent 58
  4.3 Test Process for Cascaded Network 61
Chapter 5 Experimental Results 64
  5.1 Object Pick and Classification 64
  5.2 Object Pick and Classification by Cascaded Network 75
Chapter 6 Conclusion and Future Work 83
References 85
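Chapter 3 couples grasp-angle estimation with object classification in a single modified VGG16. As a rough illustration only, the sketch below attaches two output heads (an angle-bin head and a class head) to a shared VGG16 backbone; the head widths, bin count, and class count are assumptions rather than the thesis's actual modification.

```python
import torch.nn as nn
from torchvision import models

class PickAndClassifyNet(nn.Module):
    """Illustrative modified VGG16: a shared convolutional backbone feeding
    two heads, one for the grasp-angle bin and one for the object class."""
    def __init__(self, num_angle_bins=18, num_classes=10):
        super().__init__()
        self.features = models.vgg16().features       # shared conv backbone
        self.pool = nn.AdaptiveAvgPool2d((7, 7))
        self.trunk = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(inplace=True),
        )
        self.angle_head = nn.Linear(4096, num_angle_bins)  # pick-angle bin
        self.class_head = nn.Linear(4096, num_classes)     # object category

    def forward(self, x):
        h = self.trunk(self.pool(self.features(x)))
        return self.angle_head(h), self.class_head(h)
```

Training such a network with gradient descent (Sections 3.2 and 4.2) would typically sum a cross-entropy loss over the two heads; "self-supervised" here means the angle labels come from the robot's own trial grasps rather than human annotation, in the spirit of Pinto and Gupta [12].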

References:
[1] B. Bidanda, S. Motavalli, and K. Harding, "Reverse Engineering: An Evaluation of Prospective Non-Contact Technologies and Applications in Manufacturing Systems," International Journal of Computer Integrated Manufacturing, Vol. 4, No. 3, pp. 145-156, 1991.
[2] G. Bradski, "Computer Vision Face Tracking for Use in a Perceptual User Interface," IEEE Workshop on Applications of Computer Vision, pp. 790-799, 1998.
    [3] R. Girshick, "Fast R-CNN," Proceedings of the IEEE International Conference on Computer Vision, pp. 1440-1448, 2015.
[4] R. Girshick et al., "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580-587, 2014.
[5] K. He et al., "Deep Residual Learning for Image Recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
[6] K. He et al., "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," Proceedings of the IEEE International Conference on Computer Vision, 2015.
[7] K. He et al., "Mask R-CNN," Proceedings of the IEEE International Conference on Computer Vision, pp. 2980-2988, 2017.
[8] A. G. Howard et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv preprint arXiv:1704.04861, 2017.
[9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[10] Y. LeCun et al., "Gradient-Based Learning Applied to Document Recognition," Proceedings of the IEEE, pp. 2278-2324, 1998.
[11] M. Lin, Q. Chen, and S. Yan, "Network in Network," arXiv preprint arXiv:1312.4400, 2013.
[12] L. Pinto and A. Gupta, "Supersizing Self-Supervision: Learning to Grasp from 50K Tries and 700 Robot Hours," Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 3406-3413, 2016.
[13] S. Ren et al., "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," Advances in Neural Information Processing Systems, pp. 91-99, 2015.
    [14] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning Internal Representations by Error Propagation," California Univ San Diego La Jolla Inst for Cognitive Science, No. ICS-8506, 1985.
[15] M. Sandler et al., "MobileNetV2: Inverted Residuals and Linear Bottlenecks," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
    [16] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv preprint arXiv:1409.1556, 2014.
[17] C. Szegedy et al., "Going Deeper with Convolutions," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.
[18] A. T. Miller and P. K. Allen, "GraspIt! A Versatile Simulator for Robotic Grasping," IEEE Robotics & Automation Magazine, Vol. 11, No. 4, pp. 110-122, 2004.
    [19] YASKAWA, "FS100 Instructions," pp. 8.21-8.42, 2012.
    [20] YASKAWA, "Motoman MH5LF Robot," 2013.
    [21] YASKAWA, "FS100 Operator’s Manual," pp. 2.5-2.15, 2014.
    [22] YASKAWA, "FS100 Options Instructions," pp. 4.1-4.6, 2014.
[23] Z. Zhang, "A Flexible New Technique for Camera Calibration," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 11, pp. 1330-1334, 2000.
[24] Z. Zhang, "Flexible Camera Calibration by Viewing a Plane from Unknown Orientations," International Conference on Computer Vision, pp. 666-673, 1999.

Full-Text Availability: On campus from 2023-01-01; off campus from 2023-01-01