Author: Yao, Song-Bo (姚松伯)
Thesis title: Pyramid Multi-scale RCNN for 3D LiDAR-based Object Detection (基於光達之金字塔多尺度三維物件偵測網路)
Advisor: Yang, Jar-Ferr (楊家輝)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Institute of Computer & Communication Engineering
Year of publication: 2023
Graduation academic year: 111 (ROC calendar)
Language: English
Pages: 39
Keywords (Chinese): 深度學習, 卷積類神經網路, 稀疏卷積, 點雲, 三維物件偵測
Keywords: deep learning, convolutional neural networks, sparse convolution, point cloud, 3D object detection
    LiDAR-based object detection systems play a pivotal role in the development of self-driving vehicles. As LiDAR sensors have fallen sharply in price and come into wider use, and as assisted driving has gained attention, point cloud data has become increasingly important. However, the sparsity of LiDAR point clouds makes object detection difficult, which has motivated advances in neural networks and the introduction of sparse convolutional networks. Because the refinement networks of existing 3D object detectors lack a multi-scale feature fusion mechanism, we propose a novel refinement network architecture based on convolutional neural networks. This architecture aims to improve the network's ability to recognize objects at different scales. In addition, we strengthen the training strategy by refining existing data augmentation techniques, so that the trained network achieves better results. Experimental results demonstrate the effectiveness of the proposed detection system on three categories of the KITTI dataset. These improvements address limitations of current approaches and highlight the superior performance of the proposed system.

    Abstract (in Chinese)
    Abstract
    Acknowledgments
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Background
      1.2 Motivations
      1.3 Thesis Organization
    Chapter 2 Related Work
      2.1 PointNet and PointNet++
      2.2 Point-based 3D Object Detection
      2.3 Voxel-based 3D Object Detection
      2.4 Two-stage 3D Object Detection
    Chapter 3 The Proposed 3D Object Detection System
      3.1 Overview of the Proposed Network
      3.2 Data Retrieval Stage
      3.3 Pyramid Decision Stage
      3.4 Realistic-like Data Augmentation
      3.5 Training Loss
    Chapter 4 Experimental Results
      4.1 Environment Settings and Dataset
      4.2 Training Details
      4.3 Experimental Results
        4.3.1 Evaluation Metrics
        4.3.2 Verification of Proposed Method
      4.4 Ablation Study
    Chapter 5 Conclusions
    Chapter 6 Future Work
    References

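    Full-text availability: on campus from 2028-07-31; off campus from 2028-07-31.
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.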