| Graduate Student: | 李兆文 (Lee, Chow-Wen) |
|---|---|
| Thesis Title: | 基於自動駕駛之影像物件偵測技術於自定義資料集之實現與分析 (Implementation and Analysis of Customized Dataset for Object Detection and Autonomous Driving) |
| Advisor: | 莊智清 (Juang, Jyh-Ching) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of Publication: | 2018 |
| Graduating Academic Year: | 106 |
| Language: | Chinese |
| Pages: | 100 |
| Keywords: | Deep Learning, Computer Vision, Object Detection, SSD, DetectNet |
This thesis applies deep-learning image object-detection algorithms to a redefined and re-edited dataset in order to realize vehicle-mounted object detection for autonomous driving in the environment surrounding the campus, deploying the result on a personal computer or an automotive AI computing platform. The detector targets objects around the vehicle, including pedestrians, motorcycles, bicycles, cars, traffic lights, and trucks. To this end, the research focuses on re-compiling, collecting, and labeling image data of the relevant objects, then performing single-class analysis with both the SSD and DetectNet deep-learning algorithms and multi-class analysis with SSD. Applying these two analyses to vehicle-mounted image detection in real scenes around Tainan City and the NCKU campus constitutes the core of this thesis.
To train the deep-learning algorithms more effectively, the study uses the DGX-1 deep-learning server platform for cloud GPU computation, which rapidly improves the quality of the training results and reduces the time required for training. Finally, the trained model is used for detection on a personal computer or an automotive AI computing platform.
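The collection-and-labeling step described above can be sketched in code. The sketch below assumes annotations are stored as Pascal VOC XML (the format produced by common labeling tools such as LabelImg) and converts them to the KITTI-style text labels that NVIDIA DIGITS expects for DetectNet training; it is an illustrative sketch under those assumptions, not the thesis's actual tooling.

```python
# Sketch: convert one Pascal VOC XML annotation into KITTI-format label lines.
# Assumes VOC-style XML input (e.g. from LabelImg); 3D KITTI fields are
# zero-filled because they are unused for pure 2D object detection.
import xml.etree.ElementTree as ET

def voc_to_kitti(xml_string):
    """Return KITTI-format label lines for a single VOC annotation."""
    root = ET.fromstring(xml_string)
    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        x1, y1, x2, y2 = (float(box.findtext(k))
                          for k in ("xmin", "ymin", "xmax", "ymax"))
        # KITTI fields: type truncated occluded alpha x1 y1 x2 y2 h w l x y z ry
        lines.append(
            f"{name} 0.0 0 0.0 {x1:.2f} {y1:.2f} {x2:.2f} {y2:.2f} "
            "0.0 0.0 0.0 0.0 0.0 0.0 0.0"
        )
    return "\n".join(lines)
```

One such text file is written per training image, alongside the image itself, before the dataset is imported into DIGITS.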
The objective of this research is to apply deep-learning algorithms for object detection and to customize a dataset so as to detect objects in traffic scenes and around the vehicle in the campus and local environment. Various traffic participants, such as pedestrians, motorbikes, bicycles, cars, traffic lights, and trucks, are considered. The thesis builds on two well-known object-detection models, SSD and DetectNet, and focuses on implementing and analyzing these two detection methods.
Vision sensors provide image information about the vehicle's surroundings and traffic scenarios, and object detection on this imagery can improve safety. In addition, to enable the deep-learning object-detection algorithms to achieve better training results, the NVIDIA DGX-1 system is used to train the models in the cloud. Finally, the trained models are deployed on a personal computer and an automotive AI computer to assess their performance.
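Detection performance in analyses such as these is conventionally assessed by the intersection-over-union (IoU) between predicted and ground-truth bounding boxes, the overlap criterion underlying precision, recall, and mAP figures. A minimal sketch, assuming boxes given as (x1, y1, x2, y2) corner coordinates:

```python
# Sketch: intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2).
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width/height clamp to zero when boxes are disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0
```

A prediction is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.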