| Graduate Student: | 李兆文 (Lee, Chow-Wen) |
|---|---|
| Thesis Title: | 基於自動駕駛之影像物件偵測技術於自定義資料集之實現與分析 (Implementation and Analysis of Customized Dataset for Object Detection and Autonomous Driving) |
| Advisor: | 莊智清 (Juang, Jyh-Ching) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of Publication: | 2018 |
| Graduating Academic Year: | 106 |
| Language: | Chinese |
| Pages: | 100 |
| Keywords: | Deep Learning, Computer Vision, Object Detection, SSD, DetectNet |
This thesis applies deep-learning image object-detection algorithms to a redefined and re-edited dataset in order to realize vehicle-mounted object detection for autonomous driving in the environment surrounding the campus, deploying the result on a personal computer or an automotive AI computing platform. The detector targets objects around the vehicle, including pedestrians, motorcycles, bicycles, cars, traffic lights, and trucks. To this end, the research focuses on re-compiling, collecting, and labeling image data of the relevant objects, then performing single-class analysis with both the SSD and DetectNet deep-learning algorithms and multi-class analysis with SSD. Applying these two analyses to vehicle-mounted image detection in real scenes around Tainan City and the NCKU campus constitutes the core of this thesis.
To train the deep-learning algorithms more effectively, the study uses the DGX-1 deep-learning server platform for cloud GPU computation, which rapidly improves the quality of the training results and reduces the time required for training. Finally, the trained model is used for detection on a personal computer or an automotive AI computing platform.
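The collection-and-labeling step described above can be sketched in code. The sketch below assumes annotations are stored as Pascal VOC XML (the format produced by common labeling tools such as LabelImg) and converts them to the KITTI-style text labels that NVIDIA DIGITS expects for DetectNet training; it is an illustrative sketch under those assumptions, not the thesis's actual tooling.

```python
# Sketch: convert one Pascal VOC XML annotation into KITTI-format label lines.
# Assumes VOC-style XML input (e.g. from LabelImg); 3D KITTI fields are
# zero-filled because they are unused for pure 2D object detection.
import xml.etree.ElementTree as ET

def voc_to_kitti(xml_string):
    """Return KITTI-format label lines for a single VOC annotation."""
    root = ET.fromstring(xml_string)
    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        x1, y1, x2, y2 = (float(box.findtext(k))
                          for k in ("xmin", "ymin", "xmax", "ymax"))
        # KITTI fields: type truncated occluded alpha x1 y1 x2 y2 h w l x y z ry
        lines.append(
            f"{name} 0.0 0 0.0 {x1:.2f} {y1:.2f} {x2:.2f} {y2:.2f} "
            "0.0 0.0 0.0 0.0 0.0 0.0 0.0"
        )
    return "\n".join(lines)
```

One such text file is written per training image, alongside the image itself, before the dataset is imported into DIGITS.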
The objective of this research is to apply deep-learning algorithms for object detection and to customize a dataset so as to detect objects in traffic scenes and around the vehicle in the campus and local environment. Various traffic participants, such as pedestrians, motorbikes, bicycles, cars, traffic lights, and trucks, are considered. The thesis builds on two well-known object-detection models, SSD and DetectNet, and focuses on implementing and analyzing these two detection methods.
Vision sensors provide image information about the vehicle's surroundings and traffic scenarios, and object detection on this imagery can improve safety. In addition, to enable the deep-learning object-detection algorithms to achieve better training results, the NVIDIA DGX-1 system is used to train the models in the cloud. Finally, the trained models are deployed on a personal computer and an automotive AI computer to assess their performance.
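Detection performance in analyses such as these is conventionally assessed by the intersection-over-union (IoU) between predicted and ground-truth bounding boxes, the overlap criterion underlying precision, recall, and mAP figures. A minimal sketch, assuming boxes given as (x1, y1, x2, y2) corner coordinates:

```python
# Sketch: intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2).
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width/height clamp to zero when boxes are disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0
```

A prediction is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.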