| Student: | 廖婉汝 Liow, Wan-Ju |
|---|---|
| Thesis Title: | 應用於自駕技術之物件偵測與追蹤網路 Object Detection and Tracking Networks for Autonomous Driving |
| Advisor: | 楊家輝 Yang, Jar-Ferr |
| Degree: | Master |
| Department: | 電機資訊學院 - 電腦與通信工程研究所 Institute of Computer & Communication Engineering |
| Year of Publication: | 2021 |
| Academic Year: | 109 |
| Language: | English |
| Pages: | 55 |
| Chinese Keywords: | 深度學習, 自動駕駛系統, 物件追蹤, 對抗訓練, 穩健性, 物件偵測 |
| English Keywords: | Deep Learning, Autonomous Driving, Object Detection, Object Tracking, Adversarial Training, Robustness |
| Views / Downloads: | 139 / 0 |
In recent years, autonomous driving has become a popular topic in both everyday applications and research. A self-driving car can perceive its environment and operate the vehicle without driver involvement. Without question, safety is the top priority in the autonomous driving field, so accurate object tracking is an essential component of advanced driver assistance systems. For deployment in self-driving cars, beyond the accuracy of the system, how to design a network that balances precision against model running time is also a very important research direction. In this thesis, we propose an object tracking network that considers temporal information; it not only detects multiple vehicles on Taiwanese streets but also greatly reduces the running time. The speedup is achieved by interleaving a main detection network, applied to each whole single image, with a tracking detector that only needs to refine the regions of interest. At the same time, the acquired temporal information is used to recover objects missed by the main detector. In addition, to handle various weather conditions such as rain, fog, and snow, we further use fault-tolerant adversarial training and the concept of knowledge distillation to strengthen the robustness of the main detection network toward safe autonomous driving. According to the experimental results, the proposed method achieves good precision and, in terms of running speed, obtains up to a relative 30% improvement over the original main detector.
In recent years, autonomous driving has been a hot topic in the vehicle industry. An autonomous vehicle is capable of sensing its environment and operating without human involvement. Without a doubt, self-driving cars always put safety first, so robust object detection and tracking technologies are essential parts of advanced driver assistance systems (ADAS). Improving detection accuracy while reducing latency is therefore an important research direction. In this thesis, we propose an object detection and tracking network that exploits temporal information to detect multiple vehicles on the streets while reducing computation time. By interleaving main and updating detectors, where the main detector processes the whole image while the updating detector only refines the regions of interest, we save a large amount of computation through the updating detector. To cover various weather conditions, such as rain, fog, and snow, we also suggest adversarial noisy learning and use the concept of knowledge distillation to enhance the robustness of the main detector and achieve safe autonomous driving. The experimental results show that the proposed system achieves good accuracy and speeds up processing by up to 26% relative to using the main detector alone.
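The interleaving scheme described in the abstract can be sketched as a simple frame schedule: run the expensive full-frame detector periodically and, on the frames in between, refine only the boxes carried over from the previous frame. The following is a minimal toy sketch with stand-in detectors (all function names and the period value are hypothetical, not the thesis's actual implementation); the counters just illustrate where the computation savings come from.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h) region of interest

# Counters to illustrate how often each detector runs.
calls = {"main": 0, "update": 0}

def main_detector(frame) -> List[Box]:
    # Stand-in for a full-frame CNN detector: expensive, scans everything.
    calls["main"] += 1
    return [(10, 10, 40, 40)]  # pretend one vehicle was found

def updating_detector(frame, rois: List[Box]) -> List[Box]:
    # Stand-in for the lightweight refiner: only revisits previous boxes.
    calls["update"] += 1
    return [(x + 1, y, w, h) for (x, y, w, h) in rois]  # boxes drift right

def track_video(frames, period: int = 4) -> List[List[Box]]:
    """Interleave: full detection every `period` frames, cheap ROI
    refinement (reusing temporal information) on the frames in between."""
    results, boxes = [], []
    for t, frame in enumerate(frames):
        if t % period == 0:
            boxes = main_detector(frame)
        else:
            boxes = updating_detector(frame, boxes)
        results.append(boxes)
    return results

out = track_video(frames=range(8), period=4)
# With 8 frames and period 4, the heavy detector runs only twice;
# the other six frames pay only for ROI refinement.
```

The ratio of cheap to expensive frames is what the period controls; a longer period saves more computation but relies more heavily on the refiner's boxes staying accurate.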
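The adversarial noisy learning mentioned above builds on a standard idea: perturb an input in the direction that most increases the model's loss (as in FGSM, the fast gradient sign method) and train on both the clean and perturbed copies. The sketch below demonstrates only the perturbation step, on a tiny hand-rolled logistic model rather than a detection network; all weights and values are made up for illustration.

```python
import math

# FGSM-style adversarial example on a tiny logistic "model".
w = [0.5, -1.2, 0.8, 0.3]   # fixed model weights
x = [1.0, 0.4, -0.7, 0.2]   # a clean input sample
y = 1.0                     # its label ("vehicle present")

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(sample):
    # Binary cross-entropy of sigmoid(w . sample) against label y.
    p = sigmoid(dot(w, sample))
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def sign(v):
    return (v > 0) - (v < 0)

# Gradient of the loss w.r.t. the INPUT: dL/dx_i = (p - y) * w_i.
p = sigmoid(dot(w, x))
grad_x = [(p - y) * wi for wi in w]

# FGSM step: nudge each input coordinate by eps in the gradient's sign.
eps = 0.1
x_adv = [xi + eps * sign(gi) for xi, gi in zip(x, grad_x)]

# x_adv is a slightly noisier input that the model finds harder;
# training on such copies is what hardens a detector against
# rain/fog/snow-like perturbations.
```

By construction the perturbed sample has a higher loss than the clean one, which is exactly the "worst-case noise" a robust detector must learn to tolerate.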
On-campus access: available from 2026-08-02.