
Author: 鍾昕燁 (Jhong, Sin-Ye)
Title: 多模態感測器融合物件偵測技術於自動駕駛之研究 / Research on Multi-modal Sensor Fusion for Object Detection in Autonomous Driving
Advisor: 賴槿峰 (Lai, Chin-Feng)
Co-advisor: 夏至賢 (Hsia, Chih-Hsien)
Degree: Doctor
Department: Department of Engineering Science, College of Engineering
Year of Publication: 2024
Graduation Academic Year: 113 (ROC calendar)
Language: English
Number of Pages: 102
Keywords (Chinese): 自動駕駛系統、物件偵測、異質感測器融合、多模態學習、語意指導、視覺-語言指導
Keywords (English): Autonomous driving systems, Object detection, Heterogeneous sensor fusion, Multimodal learning, Semantic guidance, Vision-language guidance
ORCID: https://orcid.org/0000-0003-4481-1633
Abstract (Chinese):
    自動駕駛系統在實際應用中面臨諸多挑戰,包括光線不足、惡劣天氣,以及複雜的交通情境。如何克服這些挑戰,並實現安全、便利且穩定的自駕技術,是當前重要的課題。物件偵測作為系統中關鍵的技術之一,在環境感知中扮演重要的角色,因為準確的物件偵測是保障行車安全的基礎。然而,目前許多偵測方法依賴單一感測器資訊,導致在動態且不可預測的場景中難以穩定運作。為了解決這一問題,基於多模態感測器融合的物件偵測技術備受關注,期待透過不同感測器資訊的整合來提升系統的感知能力。
    在本論文中,我們提出了兩個新的多模態感測器融合框架,來解決當前自駕系統中物件偵測技術的問題。第一個框架專注於二維物件偵測任務,透過融合可見光影像及熱影像感測器資訊,提升系統在不同環境條件下的表現。第二個框架針對三維物件偵測任務,結合影像及點雲模態中的語意與深度資訊,強化對遠距離、小型及遮擋物件的偵測能力。此外,本研究建置一台實驗車,除了收集真實行駛資料外,也將技術整合並實現於車載平台,以驗證所提出框架的可行性。最後,藉由在多個具代表性的公開資料集及實際應用場景上進行的全面實驗證實,所提出的解決方案相較於現有方法具備更高的偵測性能與即時的執行速度,使其能更有效應對現實駕駛環境中的各種挑戰,為未來更安全及可靠的自動駕駛系統開發奠定基礎。

Abstract (English):
    Autonomous driving systems face numerous challenges in real-world applications, including low-light conditions, adverse weather, and complex traffic scenarios. Overcoming these challenges to achieve safe, convenient, and stable autonomous driving technology is a critical issue. Object detection, one of the key technologies in autonomous systems, plays a crucial role in environmental perception, as accurate detection is essential for ensuring driving safety. However, many current detection methods rely on single-sensor information, making them less stable in dynamic and unpredictable environments. To address this problem, multi-modal sensor fusion for object detection has gained attention, aiming to improve system perception by integrating data from various sensors.
    In this dissertation, we propose two new multi-modal sensor fusion frameworks to address current challenges in object detection for autonomous driving systems. The first framework focuses on 2D object detection, enhancing performance under varying environmental conditions by fusing data from visible and thermal sensors. The second framework targets 3D object detection, combining semantic and depth information from the image and point cloud modalities to improve detection of distant, small, and occluded objects. Furthermore, we built an experimental vehicle, both to collect real-world driving data and to deploy the proposed techniques on an in-vehicle platform, validating the feasibility of the frameworks. Finally, comprehensive experiments on multiple representative public datasets and in real-world scenarios demonstrate that the proposed solutions outperform existing methods in detection accuracy while sustaining real-time execution, laying a foundation for the development of safer and more reliable autonomous driving systems.
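
As a rough illustration of the kind of two-stream fusion the first framework builds on, the sketch below gates concatenated visible and thermal feature maps with squeeze-and-excitation-style channel attention [52]. This is a minimal sketch, not the dissertation's VL-ACFDet modules: PyTorch is an assumed choice, and the module and argument names (ChannelAttentionFusion, rgb_feat, thermal_feat) are hypothetical.

```python
import torch
import torch.nn as nn

class ChannelAttentionFusion(nn.Module):
    """Hypothetical visible-thermal fusion block (illustrative only).

    Concatenates the two feature streams, reweights channels with an
    SE-style gate [52], and projects back to a single stream.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        fused = 2 * channels  # RGB stream + thermal stream, concatenated
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                  # global spatial squeeze
            nn.Conv2d(fused, fused // reduction, 1),  # channel bottleneck
            nn.ReLU(inplace=True),
            nn.Conv2d(fused // reduction, fused, 1),  # channel excitation
            nn.Sigmoid(),                             # per-channel weights in (0, 1)
        )
        self.project = nn.Conv2d(fused, channels, 1)  # back to one stream

    def forward(self, rgb_feat: torch.Tensor, thermal_feat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb_feat, thermal_feat], dim=1)  # (B, 2C, H, W)
        x = x * self.gate(x)                            # modality-aware reweighting
        return self.project(x)                          # fused (B, C, H, W)

if __name__ == "__main__":
    fuse = ChannelAttentionFusion(channels=64)
    rgb = torch.randn(1, 64, 80, 96)  # backbone features from the visible image
    thr = torch.randn(1, 64, 80, 96)  # backbone features from the thermal image
    print(fuse(rgb, thr).shape)       # torch.Size([1, 64, 80, 96])
```

The gating lets the network down-weight whichever modality is less informative in a given scene (for example, RGB features at night), which is the basic motivation behind attention-based visible-thermal fusion [51], [52].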

Table of Contents:
    摘要
    ABSTRACT
    ACKNOWLEDGEMENT
    TABLE OF CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    CHAPTER 1. INTRODUCTION
        1.1 Background and Motivation
        1.2 Research Challenges
        1.3 Research Contribution
    CHAPTER 2. RELATED WORKS
        2.1 Visible-based 2D/3D Object Detection
        2.2 Visible-thermal-based 2D Object Detection
        2.3 Visible-lidar-based 3D Object Detection
    CHAPTER 3. PROPOSED VL-ACFDET FRAMEWORK FOR 2D OBJECT DETECTION
        3.1 Adaptive Cross-contextual Attention Module
        3.2 Vision–language-guided Channel Attention Transfer Module
        3.3 Joint Learning of Detection and Transfer Losses
    CHAPTER 4. PROPOSED SD-AFDET FRAMEWORK FOR 3D OBJECT DETECTION
        4.1 Multi-modality Feature Extraction
        4.2 Relevant Point Collection
        4.3 Density-Aware Artificial Point Shift and Fusion Strategies
        4.4 Loss Function
    CHAPTER 5. EXPERIMENT RESULTS AND ANALYSIS
        5.1 Dataset Collection
        5.2 Experiment Settings and Evaluation Metrics
            5.2.1 2D Object Detection Task Settings
            5.2.2 3D Object Detection Task Settings
        5.3 Quantitative Evaluation
            5.3.1 Performance Comparison of VL-ACFDet with SOTA 2D Object Detection Methods
            5.3.2 Performance Comparison of SD-AFDet with SOTA 3D Object Detection Methods
        5.4 Ablation Studies
            5.4.1 Analysis of the VL-ACFDet Framework
            5.4.2 Analysis of the SD-AFDet Framework
    CHAPTER 6. CONCLUSION
    CHAPTER 7. FUTURE WORKS
    REFERENCES
    BIOGRAPHY
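
The Chapter 4 outline above (multi-modality feature extraction, relevant point collection, point fusion) extends camera-LiDAR fusion ideas such as PointPainting [26], in which LiDAR points are projected into the image plane and decorated with per-pixel semantic scores before 3D detection. The NumPy sketch below shows only that baseline projection-and-painting step under assumed conventions (a pinhole intrinsic matrix and a homogeneous LiDAR-to-camera transform); it is illustrative, not SD-AFDet itself, and all names are hypothetical.

```python
import numpy as np

def paint_points(points, seg_scores, cam_K, lidar_to_cam):
    """PointPainting-style decoration [26], minimal illustrative sketch.

    points       (N, 3) LiDAR xyz coordinates
    seg_scores   (H, W, K) per-pixel class scores from any segmentation net
    cam_K        (3, 3) pinhole camera intrinsic matrix
    lidar_to_cam (4, 4) homogeneous LiDAR-to-camera extrinsic transform
    Returns (M, 3 + K): points visible in the image, with scores appended.
    """
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    cam = (lidar_to_cam @ homo.T).T[:, :3]                     # camera-frame xyz
    uv = (cam_K @ cam.T).T
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)           # perspective divide
    h, w, _ = seg_scores.shape
    visible = ((cam[:, 2] > 0.1)                               # in front of camera
               & (uv[:, 0] >= 0) & (uv[:, 0] < w)              # inside image width
               & (uv[:, 1] >= 0) & (uv[:, 1] < h))             # inside image height
    u = uv[visible, 0].astype(int)
    v = uv[visible, 1].astype(int)
    return np.hstack([points[visible], seg_scores[v, u]])      # (M, 3 + K)
```

The dissertation's density-aware point shift and fusion strategies (Section 4.3) go beyond this baseline, but projecting points into the image to attach semantic guidance is the common starting point for camera-LiDAR fusion of this kind.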

    [1] X. Wang, K. Li and A. Chehri, “Multi-Sensor Fusion Technology for 3D Object Detection in Autonomous Driving: A Review,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 2, pp. 1148–1165, Feb. 2024.
    [2] E. Arnold, O. Y. Al-Jarrah, M. Dianati, S. Fallah, D. Oxtoby and A. Mouzakitis, “A Survey on 3D Object Detection Methods for Autonomous Driving Applications,” IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 10, pp. 3782–3795, Oct. 2019.
    [3] D. Feng et al., “Deep Multi-Modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 3, pp. 1341–1360, Mar. 2021.
    [4] E. Yurtsever, J. Lambert, A. Carballo and K. Takeda, “A Survey of Autonomous Driving: Common Practices and Emerging Technologies,” IEEE Access, vol. 8, pp. 58443–58469, Mar. 2020.
    [5] A. Singh, “Vision-RADAR fusion for Robotics BEV Detections: A Survey,” IEEE Intelligent Vehicles Symposium, 2023, pp. 1–7.
    [6] Y. -C. Chen, S. -Y. Jhong, and C. -H. Hsia, “Roadside Unit-based Unknown Object Detection in Adverse Weather Conditions for Smart Internet of Vehicles,” ACM Transactions on Management Information Systems, vol. 13, no. 4, pp. 47–67, Jan. 2023.
    [7] S. -Y. Jhong, Y. -Y. Chen, C. -H. Hsia, Y. -Q. Wang and C. -F. Lai, “Density-Aware and Semantic-Guided Fusion for 3D Object Detection using LiDAR-Camera Sensors,” IEEE Sensors Journal, vol. 23, no. 18, pp. 22051–22063, Sep. 2023.
    [8] X. Ma, W. Ouyang, A. Simonelli and E. Ricci, “3D Object Detection from Images for Autonomous Driving: A Survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 5, pp. 3537–3556, May 2024.
    [9] Á. Takács, D. A. Drexler, P. Galambos, I. J. Rudas and T. Haidegger, “Assessment and Standardization of Autonomous Vehicles,” International Conference on Intelligent Engineering Systems, 2018, pp. 185–192.
    [10] Y. -Y. Chen, S. -Y. Jhong and Y. -J. Lo, “Reinforcement-and-Alignment Multispectral Object Detection Using Visible–Thermal Vision Sensors in Intelligent Vehicles,” IEEE Sensors Journal, vol. 23, no. 21, pp. 26873–26886, Nov. 2023.
    [11] Y. Sun, B. Cao, P. Zhu and Q. Hu, “Drone-Based RGB-Infrared Cross-Modality Vehicle Detection Via Uncertainty-Aware Learning,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 10, pp. 6700–6713, Oct. 2022.
    [12] Q. Fang, D. Han and Z. Wang, “Cross-Modality Fusion Transformer for Multispectral Object Detection,” SSRN Electronic Journal, Sep. 2022.
    [13] T. Huang, Z. Liu, X. Chen, and X. Bai, “EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection,” European Conference on Computer Vision, 2020, pp. 35–52.
    [14] X. Pan, Z. Xia, S. Song, L. -E. Li, and G. Huang, “3D Object Detection with Pointformer,” IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 7463–7472.
    [15] C. Lin, D. Tian, X. Duan, J. Zhou, D. Zhao and D. Cao, “CL3D: Camera-LiDAR 3D Object Detection With Point Feature Enhancement and Point-Guided Fusion,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 10, pp. 18040–18050, Oct. 2022.
    [16] H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. E. Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan, and O. Beijbom, “nuScenes: A Multimodal Dataset for Autonomous Driving,” IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 11618–11628.
    [17] H. Zhang, E. Fromont, S. Lefevre and B. Avignon, “Guided Attentive Feature Fusion for Multispectral Pedestrian Detection,” IEEE Winter Conference on Applications of Computer Vision, 2021, pp. 72–80.
    [18] K. Zhou, L. Chen and X. Cao, “Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems,” European Conference on Computer Vision, 2020, pp. 787–803.
    [19] A. Radford et al., “Learning Transferable Visual Models from Natural Language Supervision,” International Conference on Machine Learning, 2021, pp. 8748–8763.
    [20] C. R. Qi, W. Liu, C. Wu, H. Su, and L. J. Guibas, “Frustum PointNets for 3D Object Detection from RGB-D Data,” IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 918–927.
    [21] Z. Wang and K. Jia, “Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection,” IEEE International Conference on Intelligent Robots and Systems, 2019, pp. 1742–1749.
    [22] X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, “Multi-View 3D Object Detection Network for Autonomous Driving,” IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1907–1915.
    [23] J. Ku, M. Mozifian, J. Lee, A. Harakeh, and S. L. Waslander, “Joint 3D Proposal Generation and Object Detection from View Aggregation,” IEEE International Conference on Intelligent Robots and Systems, 2018, pp. 1–8.
    [24] G. Wang, B. Tian, Y. Zhang, L. Chen, D. Cao, and J. Wu, “Multi-View Adaptive Fusion Network for 3D Object Detection,” arXiv preprint arXiv:2011.00652, 2020.
    [25] A. Som, H. Choi, K. N. Ramamurthy, M. P. Buman, and P. Turaga, “PI-Net: A Deep Learning Approach to Extract Topological Persistence Images,” IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 834–835.
    [26] S. Vora, A. H. Lang, B. Helou, and O. Beijbom, “PointPainting: Sequential Fusion for 3D Object Detection,” IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 4604–4612.
    [27] Z. Liu, T. Huang, B. Li, X. Chen, X. Wang and X. Bai, “EPNet++: Cascade Bi-Directional Fusion for Multi-Modal 3D Object Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 7, pp. 8324–8341, Jul. 2023.
    [28] J. Liu, X. Fan, Z. Huang, G. Wu, R. Liu, W. Zhong and Z. Luo, “Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection,” IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 5792–5801.
    [29] A. Geiger, P. Lenz and R. Urtasun, “Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite,” IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 3354–3361.
    [30] Z. Li, P. Xu, X. Chang, L. Yang, Y. Zhang, L. Yao, and X. Chen, “When Object Detection Meets Knowledge Distillation: A Survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 8, pp. 10555–10579, Aug. 2023.
    [31] S. Ren, K. He, R. Girshick and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, Jun. 2017.
    [32] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779–788.
    [33] P. Jiang, D. Ergu, F. Liu, Y. Cai, and B. Ma, “A Review of YOLO Algorithm Developments,” Procedia Computer Science, vol. 199, pp. 1066–1073, 2022.
    [34] X. He, C. Tang, X. Zou, and W. Zhang, “Multispectral Object Detection via Cross-Modal Conflict-Aware Learning,” ACM International Conference on Multimedia, 2023, pp. 1465–1474.
    [35] J. Shen, Y. Chen, Y. Liu, X. Zuo, H. Fan and W. Yang, “ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection,” Pattern Recognition, vol. 145, Jan. 2024.
    [36] C. Reading, A. Harakeh, J. Chae, and S. L. Waslander, “Categorical Depth Distribution Network for Monocular 3D Object Detection,” IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 8555–8564.
    [37] X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler, and R. Urtasun, “Monocular 3D Object Detection for Autonomous Driving,” IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2147–2156.
    [38] G. Brazil, G. Pons-Moll, X. Liu, and B. Schiele, “Kinematic 3D Object Detection in Monocular Video,” European Conference on Computer Vision, 2020, pp. 135–152.
    [39] P. Li, H. Zhao, P. Liu, and F. Cao, “RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving,” European Conference on Computer Vision, 2020, pp. 644–660.
    [40] T. Kim, S. Chung, D. Yeom, Y. Yu, H. -G. Kim and Y. -M. Ro, “MSCoTDet: Language-Driven Multi-Modal Fusion for Improved Multispectral Pedestrian Detection,” arXiv preprint arXiv:2403.15209, 2024.
    [41] Y. Xiao, F. Meng, Q. Wu, L. Xu, M. He and H. Li, “GM-DETR: Generalized Multispectral Detection Transformer with Efficient Fusion Encoder for Visible-Infrared Detection,” IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2024, pp. 5541–5549.
    [42] H. Zhang, E. Fromont, S. Lefevre, and B. Avignon, “Multispectral Fusion for Object Detection with Cyclic Fuse-and-Refine Blocks,” IEEE International Conference on Image Processing, Oct. 2020, pp. 276–280.
    [43] C. Li, D. Song, R. Tong, and M. Tang, “Illumination-Aware Faster R-CNN For Robust Multispectral Pedestrian Detection,” Pattern Recognition, vol. 85, pp. 161–171, Jan. 2019.
    [44] D. Guan, Y. Cao, J. Yang, Y. Cao, and M. Y. Yang, “Fusion of Multispectral Data through Illumination-Aware Deep Neural Networks for Pedestrian Detection,” Information Fusion, vol. 50, pp. 148–157, Oct. 2019.
    [45] H. Wang et al., “Cross-Modal Oriented Object Detection of UAV Aerial Images Based on Image Feature,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, 2024, Art. no. 5403021.
    [46] X. Yang, Y. Qian, H. Zhu, C. Wang and M. Yang, “BAANet: Learning Bi-directional Adaptive Attention Gates for Multispectral Pedestrian Detection,” International Conference on Robotics and Automation, 2022, pp. 2920–2926.
    [47] J. H. Yoo, Y. Kim, J. Kim, and J. W. Choi, “3D-CVF: Generating Joint Camera and Lidar Features Using Cross-View Spatial Feature Fusion For 3D Object Detection,” European Conference on Computer Vision, 2020, pp. 720–736.
    [48] X. Zhang, S. -Y. Cao, F. Wang, R. Zhang, Z. Wu, X. Zhang, X. Bai and H. -L. Shen, “Rethinking Early-Fusion Strategies for Improved Multispectral Object Detection,” arXiv preprint arXiv:2405.16038, 2024.
    [49] M. Yuan and X. Wei, “C²Former: Calibrated and Complementary Transformer for RGB-Infrared Object Detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–12, 2024.
    [50] D. Ulyanov, A. Vedaldi, and V. Lempitsky, “Instance Normalization: The Missing Ingredient for Fast Stylization,” arXiv preprint arXiv:1607.08022, 2016.
    [51] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, “CBAM: Convolutional Block Attention Module,” European Conference on Computer Vision, 2018, pp. 3–19.
    [52] J. Hu, L. Shen and G. Sun, “Squeeze-and-Excitation Networks,” IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132–7141.
    [54] A. Tripathi, M. K. Gupta, C. Srivastava, P. Dixit and S. K. Pandey, “Object Detection using YOLO: A Survey,” International Conference on Contemporary Computing and Informatics, 2022, pp. 747–752.
    [55] H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid and S. Savarese, “Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression,” IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 658–666.
    [56] L. C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation,” European Conference on Computer Vision, 2018, pp. 801–818.
    [57] S. Shi, X. Wang, and H. Li, “PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud,” IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 770–779.
    [58] W. Shi and R. R. Rajkumar, “Point-GNN: Graph Neural Network For 3D Object Detection in A Point Cloud,” IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 1711–1719.
    [59] Z. Yang, Y. Sun, S. Liu, and J. Jia, “3DSSD: Point-Based 3D Single Stage Object Detector,” IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 11040–11048.
    [60] Z. Li, F. Wang, and N. Wang, “LiDAR R-CNN: An Efficient and Universal 3D Object Detector,” IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 7546–7555.
    [61] H. Zhu, J. Deng, Y. Zhang, J. Ji, Q. Mao, H. Li, and Y. Zhang, “VPFNet: Improving 3D Object Detection with Virtual Point Based LiDAR and Stereo Data Fusion,” IEEE Transactions on Multimedia, vol. 25, pp. 5291–5304, 2023.
    [62] L. Xie, C. Xiang, Z. Yu, G. Xu, Z. Yang, D. Cai, and X. He, “PI-RCNN: An Efficient Multi-Sensor 3D Object Detector with Point-Based Attentive Cont-Conv Fusion Module,” AAAI Conference on Artificial Intelligence, 2020, pp. 12460–12467.
    [63] M. Liang, J. Hu, C. Bao, H. Feng, F. Deng and T. L. Lam, “Explicit Attention-Enhanced Fusion for RGB-Thermal Perception Tasks,” IEEE Robotics and Automation Letters, vol. 8, no. 7, pp. 4060–4067, Jul. 2023.
    [64] J. Ouyang, P. Jin and Q. Wang, “Multimodal Feature-Guided Pretraining for RGB-T Perception,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 16041–16050, Sep. 2024.
    [65] G. Neuhold, T. Ollmann, S. Rota Bulo, and P. Kontschieder, “The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes,” International Conference on Computer Vision, 2017, pp. 4990–4999.
    [66] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, “The Cityscapes Dataset for Semantic Urban Scene Understanding,” IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3213–3223.
    [67] Y. Zhang, C. Xu, W. Yang, G. He, H. Yu, L. Yu, G. -S. Xia, “Drone-Based RGBT Tiny Person Detection,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 204, pp. 61–76, Apr. 2023.
    [68] M. He, Q. Wu, K. N. Ngan, F. Jiang, F. Meng and L. Xu, “Misaligned RGB-Infrared Object Detection via Adaptive Dual-Discrepancy Calibration,” Remote Sensing, vol. 15, no. 19, Art. no. 4887, 2023.
    [69] W. Dong, H. Zhu, S. Lin, X. Luo, Y. Shen, X. Liu, J. Zhang, G. Guo and B. Zhang, “Fusion-Mamba for Cross-modality Object Detection,” arXiv preprint arXiv:2404.09146, 2024.
    [70] C. Chen, Z. Chen, J. Zhang, and D. Tao, “SASA: Semantics-Augmented Set Abstraction for Point-Based 3D Object Detection,” AAAI Conference on Artificial Intelligence, 2022, pp. 221–229.
    [71] A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang and O. Beijbom, “PointPillars: Fast Encoders for Object Detection from Point Clouds,” IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 12689–12697.
    [72] Q. Tang, X. Bai, J. Guo, B. Pan, and W. Jiang, “DFAF3D: A Dual-Feature-Aware Anchor-Free Single-Stage 3D Detector for Point Clouds,” Image and Vision Computing, vol. 129, 2023, Art. no. 104594.
    [73] S. Shi, Z. Wang, J. Shi, X. Wang and H. Li, “From Points to Parts: 3D Object Detection from Point Cloud with Part-Aware and Part-Aggregation Network,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 8, pp. 2647–2664, Aug. 2021.
    [74] Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan, and Y. Guo, “Not All Points Are Equal: Learning Highly Efficient Point-Based Detectors For 3D Lidar Point Clouds,” IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 18953–18962.
    [75] S. Chen, H. Zhang, and N. Zheng, “Leveraging Anchor-based LiDAR 3D Object Detection via Point Assisted Sample Selection,” arXiv preprint arXiv:2403.01978, 2024.
    [76] I. Lang, A. Manor and S. Avidan, “SampleNet: Differentiable Point Cloud Sampling,” IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 7575–7585.
    [77] Y. Yan, Y. Mao, and B. Li, “SECOND: Sparsely Embedded Convolutional Detection,” Sensors, vol. 18, no. 10, p. 3337, 2018.
    [78] J. Deng, S. Shi, P. Li, W. Zhou, Y. Zhang, and H. Li, “Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection,” AAAI Conference on Artificial Intelligence, 2021, pp. 1201–1209.
    [79] S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang, and H. Li, “PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection,” IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 10529–10538.
    [80] Z. Yang, L. Jiang, Y. Sun, B. Schiele, and J. Jia, “A Unified Query-based Paradigm for Point Cloud Understanding,” IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 8531–8541.
    [81] Y. Lv, Z. Liu and G. Li, “Context-Aware Interaction Network for RGB-T Semantic Segmentation,” IEEE Transactions on Multimedia, vol. 26, pp. 6348–6360, Jan. 2024.
    [82] J. Li, H. Dai, H. Han and Y. Ding, “MSeg3D: Multi-Modal 3D Semantic Segmentation for Autonomous Driving,” IEEE Conference on Computer Vision and Pattern Recognition, 2023, pp. 21694–21704.
    [83] Zhang, D. Yuan, X. Shu, Z. Li, Q. Liu, X. Chang, Z. He, and G. Shi, “A Comprehensive Review of RGBT Tracking,” IEEE Transactions on Instrumentation and Measurement, vol. 73, pp. 1–23, Jul. 2024.
    [84] S. Feng, X. Li, Z. Yan, C. Xia, S. Li, X. Wang, and Y. Zhou, “Tightly Coupled Integration of LiDAR and Vision for 3D Multiobject Tracking,” IEEE Transactions on Intelligent Vehicles, pp. 1–14, Jun. 2024.
    [85] L. Chen, P. Wu, K. Chitta, B. Jaeger, A. Geiger, and H. Li, “End-to-end Autonomous Driving: Challenges and Frontiers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–20, Jul. 2024.
    [86] S. S. Ahmed, A. Schiessl, F. Gumbmann, M. Tiebout, S. Methfessel, and L.-P. Schmidt, “Advanced Microwave Imaging,” IEEE Microwave Magazine, vol. 13, no. 6, pp. 26–43, 2012.
    [87] G. L. Charvat, L. C. Kempel, E. J. Rothwell, C. M. Coleman and E. L. Mokole, “A Through-Dielectric Ultrawideband Switched-Antenna-Array Radar Imaging System,” IEEE Transactions on Antennas and Propagation, vol. 60, no. 11, pp. 5495–5500, Nov. 2012.
    [88] N. Poredi, Y. Chen, X. Li and E. Blasch, “Enhance Public Safety Surveillance in Smart Cities by Fusing Optical and Thermal Cameras,” International Conference on Information Fusion, 2023, pp. 1–7.
    [89] X. Yi, H. Xu, H. Zhang, L. Tang and J. Ma, “Text-IF: Leveraging Semantic Text Guidance for Degradation-Aware and Interactive Image Fusion,” IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 27016–27025.
    [90] X. Li, X. Li, T. Ye, X. Cheng, W. Liu and H. Tan, “Bridging the Gap between Multi-focus and Multi-modal: A Focused Integration Framework for Multi-modal Image Fusion,” IEEE Winter Conference on Applications of Computer Vision, 2024, pp. 1617–1626.
    [91] S. A. Deevi, C. Lee, L. Gan, S. Nagesh, G. Pandey and S. -J. Chung, “RGB-X Object Detection via Scene-Specific Fusion Modules,” IEEE Winter Conference on Applications of Computer Vision, 2024, pp. 7351–7360.
    [92] P. Svenningsson, F. Fioranelli, and A. Yarovoy, “Radar-PointGNN: Graph Based Object Recognition for Unstructured Radar Point-cloud Data,” IEEE Radar Conference, 2021, pp. 1–6.
    [93] J. Liu, Q. Zhao, W. Xiong, T. Huang, Q. -L. Han and B. Zhu, “SMURF: Spatial Multi-Representation Fusion for 3D Object Detection With 4D Imaging Radar,” IEEE Transactions on Intelligent Vehicles, vol. 9, no. 1, pp. 799–812, Jan. 2024.
    [94] L. Fan, J. Wang, Y. Chang, Y. Li, Y. Wang and D. Cao, “4D mmWave Radar for Autonomous Driving Perception: A Comprehensive Survey,” IEEE Transactions on Intelligent Vehicles, vol. 9, no. 4, pp. 4606–4620, Apr. 2024.
    [95] A. Palffy, E. Pool, S. Baratam, J. F. P. Kooij and D. M. Gavrila, “Multi-Class Road User Detection With 3+1D Radar in the View-of-Delft Dataset,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 4961–4968, Apr. 2022.
    [96] B. Tan, Z. Ma, X. Zhu, S. Li, L. Zheng, S. Chen, L. Huang, and J. Bai, “3-D Object Detection for Multiframe 4-D Automotive Millimeter-Wave Radar Point Cloud,” IEEE Sensors Journal, vol. 23, no. 11, pp. 11125–11138, Jun. 2023.
    [97] W. Xiong, J. Liu, T. Huang, Q. -L. Han, Y. Xia and B. Zhu, “LXL: LiDAR Excluded Lean 3D Object Detection With 4D Imaging Radar and Camera Fusion,” IEEE Transactions on Intelligent Vehicles, vol. 9, no. 1, pp. 79–92, Jan. 2024.

    Full-text availability: On campus: to be released 2029-11-19; Off campus: to be released 2029-11-19.
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.