| Graduate student: | 姚松伯 Yao, Song-Bo |
|---|---|
| Thesis title: | 基於光達之金字塔多尺度三維物件偵測網路 Pyramid Multi-scale RCNN for 3D LiDAR-based Object Detection |
| Advisor: | 楊家輝 Yang, Jar-Ferr |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2023 |
| Academic year of graduation: | 111 |
| Language: | English |
| Number of pages: | 39 |
| Keywords (Chinese): | 深度學習、卷積類神經網路、稀疏卷積、點雲、三維物件偵測 |
| Keywords (English): | deep learning, convolutional neural networks, sparse convolution, point cloud, 3D object detection |
LiDAR-based object detection systems play a pivotal role in the development of self-driving vehicles. As LiDAR sensors have fallen sharply in price and become more widely used, and as assisted driving gains traction, point cloud data is becoming increasingly important. However, the sparsity of LiDAR point clouds poses challenges for object detection, which has motivated advances in neural networks and the introduction of sparse convolutional networks. Because the refinement networks of existing 3D object detection networks lack a multi-scale feature fusion mechanism, we propose a novel refinement network architecture based on convolutional neural networks. This architecture aims to improve the network's ability to recognize objects at different scales. In addition, we strengthen the training strategy by refining existing data augmentation techniques, which enables the trained network to achieve better results. Experimental results demonstrate the effectiveness of the proposed detection system on three categories of the KITTI dataset. These improvements address limitations of current approaches and highlight the superior performance of the proposed system.
On campus: publicly available from 2028-07-31