| Graduate Student: | 姚亦鴻 Yao, Yi-Hong |
|---|---|
| Thesis Title: | 基於旋轉、尺度不變性之光達點雲車輛檢測網路研究與開發 (Development of LiDAR Point Cloud Vehicle Detection Method Based on Deep Learning of Rotation and Scale Invariant Features) |
| Advisor: | 江佩如 Chiang, Pei-Ju |
| Degree: | Master |
| Department: | College of Engineering, Department of Systems and Naval Mechatronic Engineering |
| Year of Publication: | 2022 |
| Academic Year of Graduation: | 110 |
| Language: | Chinese |
| Number of Pages: | 61 |
| Keywords: | Vehicle detection, rotation invariance, scale invariance |
Rotation, translation, and scale invariance are essential properties for a detection network. Traditional invariance-based point cloud features, however, are extracted from the point cloud of a single object, which makes them difficult to apply to a large-scale vehicle point cloud scene. Inspired by PointPillars, we therefore organize the large point cloud space into pillars and, within the selected pillars, introduce the computation of moment invariants. Using the three-dimensional moment invariant formulas, features with rotation, translation, and scale invariance are successfully extracted from the point cloud space, enriching the information carried by each pillar and adding a new option for point cloud feature selection. Finally, this study incorporates the feature pyramid enhancement module, originally used in the field of image segmentation, into the back end of the network architecture; by repeatedly refining the feature maps, it achieves a better feature description. Experiments confirm that this module also improves the point cloud detection network, further increasing the network's accuracy.
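To illustrate the pillar-plus-invariant-feature idea described above, the sketch below groups a LiDAR cloud into x-y pillars and attaches the three second-order 3D moment invariants of Sadjadi and Hall [22] to each pillar. This is a minimal sketch under stated assumptions: the function names, the pillar size, the minimum point count, and the use of the classical μ000 scale normalization are illustrative choices, not the thesis's actual implementation.

```python
import numpy as np

def moment_invariants_3d(points: np.ndarray) -> np.ndarray:
    """Second-order 3D moment invariants (J1, J2, J3) of a point set [22].

    Central moments remove translation; J1-J3 are the characteristic-polynomial
    coefficients of the second-order moment tensor, hence rotation invariant.
    Scale is normalized with the classical mu_000 exponent, which for a discrete
    cloud (mu_000 = point count) is only an approximation.
    """
    pts = points - points.mean(axis=0)                 # translation invariance
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]

    # second-order central moments
    mu200, mu020, mu002 = (x * x).sum(), (y * y).sum(), (z * z).sum()
    mu110, mu101, mu011 = (x * y).sum(), (x * z).sum(), (y * z).sum()

    # scale normalization: eta_pqr = mu_pqr / mu_000 ** ((p + q + r)/3 + 1)
    s = float(len(pts)) ** (2.0 / 3.0 + 1.0)
    e200, e020, e002 = mu200 / s, mu020 / s, mu002 / s
    e110, e101, e011 = mu110 / s, mu101 / s, mu011 / s

    j1 = e200 + e020 + e002
    j2 = (e200 * e020 + e200 * e002 + e020 * e002
          - e110 ** 2 - e101 ** 2 - e011 ** 2)
    j3 = (e200 * e020 * e002 + 2.0 * e110 * e101 * e011
          - e002 * e110 ** 2 - e020 * e101 ** 2 - e200 * e011 ** 2)
    return np.array([j1, j2, j3])

def pillar_moment_features(cloud: np.ndarray, pillar_size: float = 0.16,
                           min_points: int = 3) -> dict:
    """Group a cloud (N x 3+) into x-y pillars and attach J1-J3 to each pillar.

    pillar_size, min_points and the dict layout are illustrative assumptions,
    not the thesis's actual pipeline.
    """
    keys = np.floor(cloud[:, :2] / pillar_size).astype(np.int64)
    features = {}
    for key in np.unique(keys, axis=0):
        mask = np.all(keys == key, axis=1)
        if mask.sum() >= min_points:                   # skip near-empty pillars
            features[tuple(key)] = moment_invariants_3d(cloud[mask, :3])
    return features
```

Because the central moments remove translation and J1-J3 depend only on the eigenvalues of the second-order moment tensor, the three values are unchanged by any rigid rotation of a pillar's points; the scale normalization follows the continuous-density convention of [22] and is therefore only approximate for sparsely sampled LiDAR returns.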
[1] Lang, A. H., Vora, S., Caesar, H., Zhou, L., Yang, J., & Beijbom, O. Pointpillars: Fast encoders for object detection from point clouds. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 12697-12705), 2019.
[2] Wang, W., Xie, E., Song, X., Zang, Y., Wang, W., Lu, T., ... & Shen, C. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 8440-8449), 2019.
[3] Chabot, F., Chaouch, M., Rabarisoa, J., Teuliere, C., & Chateau, T. Deep manta: A coarse-to-fine many-task network for joint 2d and 3d vehicle analysis from monocular image. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2040-2049), 2017.
[4] Chen, X., Kundu, K., Zhu, Y., Berneshawi, A. G., Ma, H., Fidler, S., & Urtasun, R. 3d object proposals for accurate object class detection. Advances in neural information processing systems, 28, 2015.
[5] Girshick, R. Fast r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 1440-1448), 2015.
[6] Mousavian, A., Anguelov, D., Flynn, J., & Kosecka, J. 3d bounding box estimation using deep learning and geometry. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (pp. 7074-7082), 2017.
[7] Li, B., Ouyang, W., Sheng, L., Zeng, X., & Wang, X. Gs3d: An efficient 3d object detection framework for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 1019-1028), 2019.
[8] Weng, X., & Kitani, K. Monocular 3d object detection with pseudo-lidar point cloud. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2019.
[9] Qi, C. R., Su, H., Mo, K., & Guibas, L. J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 652-660), 2017.
[10] Zhou, Y., & Tuzel, O. Voxelnet: End-to-end learning for point cloud based 3d object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4490-4499), 2018.
[11] Yang, B., Luo, W., & Urtasun, R. Pixor: Real-time 3d object detection from point clouds. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (pp. 7652-7660), 2018.
[12] Simony, M., Milzy, S., Amendey, K., & Gross, H. M. Complex-yolo: An euler-region-proposal for real-time 3d object detection on point clouds. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018.
[13] Redmon, J., & Farhadi, A. YOLO9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7263-7271), 2017.
[14] Yan, Y., Mao, Y., & Li, B. Second: Sparsely embedded convolutional detection. Sensors, 18(10), 3337, 2018.
[15] Chen, X., Ma, H., Wan, J., Li, B., & Xia, T. Multi-view 3d object detection network for autonomous driving. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (pp. 1907-1915), 2017.
[16] Ku, J., Mozifian, M., Lee, J., Harakeh, A., & Waslander, S. L. Joint 3d proposal generation and object detection from view aggregation. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 1-8), 2018.
[17] Liang, M., Yang, B., Wang, S., & Urtasun, R. Deep continuous fusion for multi-sensor 3d object detection. In Proceedings of the European conference on computer vision (ECCV) (pp. 641-656), 2018.
[18] Yoo, J. H., Kim, Y., Kim, J., & Choi, J. W. 3d-cvf: Generating joint camera and lidar features using cross-view spatial feature fusion for 3d object detection. In European Conference on Computer Vision (pp. 720-736), 2020.
[19] Geiger, A., Lenz, P., & Urtasun, R. Are we ready for autonomous driving? the kitti vision benchmark suite. In 2012 IEEE conference on computer vision and pattern recognition (pp. 3354-3361), 2012.
[20] Jiang, M., Wu, Y., Zhao, T., Zhao, Z., & Lu, C. Pointsift: A sift-like network module for 3d point cloud semantic segmentation. arXiv preprint arXiv:1807.00652, 2018.
[21] Hu, M. K. Visual pattern recognition by moment invariants. IRE Transactions on Information Theory, 8(2), 179-187, 1962.
[22] Sadjadi, F. A., & Hall, E. L. Three-dimensional moment invariants. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-2(2), 127-136, 1980.
[23] Ng, P. C., & Henikoff, S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Research, 31(13), 3812-3814, 2003.
[24] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. Ssd: Single shot multibox detector. In European conference on computer vision (pp. 21-37), 2016.
[25] Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (pp. 2980-2988), 2017.
[26] Song, X., Zhan, W., Che, X., Jiang, H., & Yang, B. Scale-aware attention-based pillarsnet (sapn) based 3d object detection for point cloud. Mathematical Problems in Engineering, 1-12, 2020.
[27] Limaye, A., Mathew, M., Nagori, S., Swami, P. K., Maji, D., & Desappan, K. SS3D: single shot 3D object detector. arXiv preprint arXiv:2004.14674, 2020.
[28] Ren, S., He, K., Girshick, R., & Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, 28, 91-99, 2015.
[29] Chen, X., Kundu, K., Zhang, Z., Ma, H., Fidler, S., & Urtasun, R. Monocular 3d object detection for autonomous driving. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2147-2156), 2016.
[30] Li, P., Chen, X., & Shen, S. Stereo r-cnn based 3d object detection for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 7644-7652), 2019.
[31] Tung, F., & Little, J. J. MF3D: Model-free 3D semantic scene parsing. In 2017 IEEE International Conference on Robotics and Automation (ICRA) (pp. 4596-4603), 2017.
[32] Kuang, H., Wang, B., An, J., Zhang, M., & Zhang, Z. Voxel-FPN: Multi-scale voxel feature aggregation for 3D object detection from LIDAR point clouds. Sensors, 20(3), 704, 2020.
[33] Ye, M., Xu, S., & Cao, T. Hvnet: Hybrid voxel network for lidar based 3d object detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 1631-1640), 2020.
[34] Shi, S., Wang, X., & Li, H. Pointrcnn: 3d object proposal generation and detection from point cloud. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 770-779), 2019.
[35] Yang, Z., Sun, Y., Liu, S., & Jia, J. 3dssd: Point-based 3d single stage object detector. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 11040-11048), 2020.
[36] Shi, W., & Rajkumar, R. Point-gnn: Graph neural network for 3d object detection in a point cloud. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 1711-1719), 2020.
[37] Chen, Y., Liu, S., Shen, X., & Jia, J. Fast point r-cnn. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 9775-9784), 2019.
[38] Yang, Z., Sun, Y., Liu, S., Shen, X., & Jia, J. Std: Sparse-to-dense 3d object detector for point cloud. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 1951-1960), 2019.
[39] Shi, S., Guo, C., Jiang, L., Wang, Z., Shi, J., Wang, X., & Li, H. Pv-rcnn: Point-voxel feature set abstraction for 3d object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10529-10538), 2020.
[40] He, C., Zeng, H., Huang, J., Hua, X. S., & Zhang, L. Structure aware single-stage 3d object detection from point cloud. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 11873-11882), 2020.
[41] Vora, S., Lang, A. H., Helou, B., & Beijbom, O. Pointpainting: Sequential fusion for 3d object detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4604-4612), 2020.
[42] Xie, L., Xiang, C., Yu, Z., Xu, G., Yang, Z., Cai, D., & He, X. PI-RCNN: An efficient multi-sensor 3D object detector with point-based attentive cont-conv fusion module. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 12460-12467), 2020.
[43] Huang, T., Liu, Z., Chen, X., & Bai, X. Epnet: Enhancing point features with image semantics for 3d object detection. In European Conference on Computer Vision (pp. 35-52), 2020.
[44] Pang, S., Morris, D., & Radha, H. CLOCs: Camera-LiDAR object candidates fusion for 3D object detection. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 10386-10393), 2020.
[45] Bhabatosh, C. Digital image processing and analysis. PHI Learning Pvt. Ltd., 1977.
[46] Simonyan, K., & Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
On-campus access: available from 2027-08-26.