| Graduate student: | 宋慶煌 Song, Qing-Huang |
|---|---|
| Thesis title: | 基於卷積神經網路與平行串級PID控制器實現人臉識別與追蹤控制 Face recognition and tracking control based on convolutional neural network and parallel-cascade PID controller |
| Advisor: | 廖德祿 Liao, Teh-Lu |
| Degree: | Master |
| Department: | College of Engineering - Department of Engineering Science |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 79 |
| Keywords (Chinese): | 卷積神經網路, 平行串級PID控制器, 麥克納姆輪 |
| Keywords (English): | Convolutional neural network, Parallel-cascade PID controller, Mecanum wheels |
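
In recent years, advances in artificial intelligence have produced breakthroughs in many fields and made it a focus of wide attention. Driven by everyday and industrial needs, using artificial intelligence to give sensors decision-making capability has led to edge computing, which has broadened practical applications. Traditional unmanned aerial vehicle systems mostly rely on the cloud to receive and process data, then return the computed result for control; this introduces data-transmission latency and indirectly degrades control stability. With advances in artificial intelligence techniques and hardware performance, data can instead be processed locally on the mobile vehicle, which avoids the transmission bottleneck. Building on this idea, this thesis proposes an object-tracking algorithm based on convolutional neural networks to realize face recognition, and uses the recognition output with a triangle-geometry ranging method to compute the distance to the object. The distance information is combined with a Mecanum-wheel omnidirectional vehicle and the parallel-cascade PID controller architecture proposed in this thesis to realize fixed-distance linear tracking control. Through algorithm optimization, controller parameter tuning, and system integration, an edge computing system with recognition and tracking functions is completed. Finally, the experimental results verify that the system has good stability and robustness in actual operation.
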
This thesis proposes a face recognition algorithm based on convolutional neural networks to solve the object tracking control problem on a Mecanum-wheel omnidirectional vehicle. To achieve good control performance, the thesis derives a parallel-cascade PID controller design that realizes linear tracking control while maintaining a fixed, desired distance between the person and the vehicle. An edge computing system with recognition and tracking functions is realized through algorithm optimization, controller parameter adjustment, and system integration. Finally, experimental results demonstrate that the system is stable and robust in actual operation.
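
The abstract names three building blocks without giving formulas: a geometric range estimate, a cascaded PID loop, and a Mecanum-wheel base. The sketch below shows one common textbook form of each and is not the thesis's implementation. The function and class names, the pinhole-camera (similar-triangles) range model, the outer-distance/inner-speed loop arrangement, and the X-configuration roller layout (x forward, y left, counter-clockwise yaw positive) are all assumptions made for illustration.

```python
from dataclasses import dataclass


def distance_from_face_width(focal_px: float, real_width_m: float,
                             box_width_px: float) -> float:
    """Similar-triangles range estimate (assumed pinhole model): Z = f * W / w."""
    if box_width_px <= 0:
        raise ValueError("bounding-box width must be positive")
    return focal_px * real_width_m / box_width_px


@dataclass
class PID:
    """Plain discrete PID; gains and state are illustrative, not from the thesis."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def cascade_step(outer: PID, inner: PID, target_dist: float,
                 measured_dist: float, measured_speed: float, dt: float) -> float:
    """Assumed cascade: the outer loop turns distance error into a speed
    setpoint, the inner loop turns speed error into a velocity command."""
    speed_setpoint = outer.step(measured_dist - target_dist, dt)
    return inner.step(speed_setpoint - measured_speed, dt)


def mecanum_wheel_speeds(vx: float, vy: float, wz: float, half_length: float,
                         half_width: float, wheel_radius: float):
    """Inverse kinematics for a four-wheel Mecanum base (assumed X layout).
    Returns wheel angular speeds in rad/s as
    (front-left, front-right, rear-left, rear-right)."""
    k = half_length + half_width
    fl = (vx - vy - k * wz) / wheel_radius
    fr = (vx + vy + k * wz) / wheel_radius
    rl = (vx + vy - k * wz) / wheel_radius
    rr = (vx - vy + k * wz) / wheel_radius
    return fl, fr, rl, rr


if __name__ == "__main__":
    # Hypothetical numbers: 600 px focal length, 0.16 m face width, 96 px box.
    dist = distance_from_face_width(600.0, 0.16, 96.0)
    vx = cascade_step(PID(1.0, 0.0, 0.0), PID(2.0, 0.0, 0.0),
                      target_dist=0.8, measured_dist=dist,
                      measured_speed=0.0, dt=0.1)
    print(mecanum_wheel_speeds(vx, 0.0, 0.0, 0.15, 0.15, 0.05))
```

One reading of the "parallel" part is that several such cascades run side by side, for example one per motion axis, with their outputs mixed into wheel speeds by the kinematics function; the thesis itself should be consulted for the actual structure.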