
Author: 江仲馨 (Chiang, Chung-Hsin)
Thesis Title: 應用密集殘差融合網路於駕駛輔助系統之點雲分類 (Point Cloud Classification for Driving Assistance System Based on Dense Residual Fusion Network)
Advisor: 郭致宏 (Kuo, Chih-Hung)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Graduation Academic Year: 108
Language: Chinese
Number of Pages: 72
Chinese Keywords (translated): 3D point cloud; convolutional neural network; dense residual fusion network; depth image; bearing angle image
English Keywords: object classification; 3D point cloud; convolutional neural network
Access Counts: 171 views; 2 downloads
Abstract (translated from Chinese): With the introduction of LiDAR into driving assistance systems, point cloud recognition architectures continue to evolve. Existing architectures often feed unpreprocessed 3D point clouds directly into 2D convolutional neural networks, which tends to limit the network's ability to express point cloud features. Moreover, as the safety requirements for autonomous vehicles rise, relying only on information captured by cameras makes the system vulnerable to lighting conditions and prone to misclassification. Our method adds point cloud information acquired from LiDAR and converts the point cloud into 2D bearing angle images and depth images; the segmented point cloud is further projected onto the RGB image to crop the corresponding region of interest (ROI), increasing the diversity of the input information. For these reasons, we propose a high-accuracy classification network that combines 2D bearing angle (BA) images, depth images, and RGB images. However, merely adding more types of input is not sufficient to improve the network's classification ability. In this thesis, we use a Dense Residual Fusion Network (DRF-Net) to integrate the different inputs; the network is composed of Dense Residual Blocks (DRBs), which strengthen feature extraction and encourage feature reuse. In addition, the proposed hierarchical auxiliary learning mechanism helps the intermediate layers of the network learn better features. After applying 3D point cloud preprocessing to the KITTI raw dataset, the proposed DRF-Net achieves an accuracy of 98.33%.
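The conversion from an organized LiDAR scan to depth and bearing angle images can be illustrated compactly. Below is a minimal NumPy sketch, assuming the scan has already been organized into an H x W range image R with a fixed horizontal angular step dphi; the function name and the normalizations are illustrative and do not reproduce the thesis's exact preprocessing.

```python
import numpy as np

def depth_and_bearing_angle(R, dphi):
    """Compute a depth image and a horizontal bearing-angle image.

    R: (H, W) array of ranges from an organized LiDAR scan.
    dphi: horizontal angular resolution in radians.
    """
    depth = R / R.max()        # normalized depth image in [0, 1]
    r_prev = R[:, :-1]         # range of the previous beam in each scan line
    r_curr = R[:, 1:]          # range of the current beam
    # Bearing angle at the current point: angle between the laser beam
    # and the segment to the previous return (atan2 form of the
    # law-of-cosines triangle spanned by the two beams).
    ba = np.arctan2(r_prev * np.sin(dphi),
                    r_curr - r_prev * np.cos(dphi))
    ba_img = np.zeros_like(R)
    ba_img[:, 1:] = ba / np.pi  # normalize the angle for display as an image
    return depth, ba_img
```

Because the bearing angle measures local surface orientation relative to the beam, planar surfaces map to smooth gray regions and object boundaries to sharp transitions, which is what makes the BA image a useful input channel alongside depth and RGB.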

Abstract (English): Compared with state-of-the-art architectures, using a 3D point cloud as the input of a 2D convolutional neural network without preprocessing restricts the feature expression of the network. To address this issue, we propose a high-precision classification network using bearing angle (BA) images, depth images, and RGB images. With the development of unmanned vehicles, determining how to recognize objects from the information collected by sensors has become important. Our approach takes data from a LiDAR and a camera and projects the 3D point cloud into 2D BA images and depth images. The RGB image captured by the camera is used to select the region of interest (ROI) corresponding to the point cloud.
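The ROI selection step amounts to projecting the segmented LiDAR points into the camera image and cropping their bounding box. A minimal sketch for KITTI-style data follows, assuming the usual Tr_velo_to_cam and R0_rect calibration transforms (extended to 4x4 homogeneous form) and the 3x4 projection matrix P2; the helper name and clamping details are illustrative, not the thesis's implementation.

```python
import numpy as np

def crop_object_roi(points_velo, Tr_velo_to_cam, R0_rect, P2, image):
    """Project one segmented object's LiDAR points into the image and crop its ROI.

    points_velo: (N, 3) object points in LiDAR coordinates.
    Tr_velo_to_cam, R0_rect: 4x4 homogeneous transforms from the calibration file.
    P2: 3x4 camera projection matrix.
    """
    pts = np.hstack([points_velo, np.ones((len(points_velo), 1))]).T  # 4xN homogeneous
    cam = R0_rect @ Tr_velo_to_cam @ pts       # into the rectified camera frame
    cam = cam[:, cam[2] > 0]                   # discard points behind the camera
    if cam.shape[1] == 0:
        return None                            # object not visible in this view
    uvw = P2 @ cam                             # pinhole projection
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]    # pixel coordinates
    u0, u1 = int(u.min()), int(np.ceil(u.max()))
    v0, v1 = int(v.min()), int(np.ceil(v.max()))
    h, w = image.shape[:2]
    u0, u1 = max(u0, 0), min(u1, w)            # clamp the box to the image bounds
    v0, v1 = max(v0, 0), min(v1, h)
    return image[v0:v1, u0:u1]                 # RGB ROI for this object
```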
However, adding input modalities alone is not enough to improve the classification ability of a general convolutional neural network. In our approach, we use a Dense Residual Fusion Network (DRF-Net), which consists of three Dense Residual Blocks (DRBs). Each DRB adopts the structure of a residual block with dense connections. With these three inputs, DRF-Net achieves 98.33% accuracy on the KITTI raw dataset.
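The abstracts name the building block, a residual skip wrapped around a densely connected convolution stack, but give no layer configuration. The following is a minimal PyTorch sketch of one plausible Dense Residual Block; all channel counts, the growth rate, and the 1x1 fusion layer are illustrative assumptions, not the thesis's actual settings.

```python
import torch
import torch.nn as nn

class DenseResidualBlock(nn.Module):
    """Residual skip over a densely connected conv stack (illustrative sizes)."""

    def __init__(self, channels: int, growth: int = 32, num_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList()
        in_ch = channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, kernel_size=3, padding=1),
                nn.BatchNorm2d(growth),
                nn.ReLU(inplace=True),
            ))
            in_ch += growth  # dense connectivity: each layer sees all earlier feature maps
        self.fuse = nn.Conv2d(in_ch, channels, kernel_size=1)  # compress back to input width

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return x + self.fuse(torch.cat(feats, dim=1))  # residual skip enables feature reuse
```

In a fusion network of this kind, the BA, depth, and RGB branches would each pass through such blocks before their feature maps are merged; the hierarchical auxiliary learning mentioned in the Chinese abstract would attach small classifier heads to intermediate blocks so that auxiliary losses supervise the middle layers directly.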

Table of Contents:
Chinese Abstract II
Table of Contents XVI
List of Tables XVIII
List of Figures XIX
Chinese-English Glossary XXIII
Chapter 1 Introduction 1
 1-1 Preface 1
 1-2 Research Motivation 2
 1-3 Research Contributions 3
 1-4 Thesis Organization 4
Chapter 2 Background 5
 2-1 Depth Image 5
 2-2 Bearing Angle Image 7
 2-3 Deep Learning 9
  2-3-1 Artificial Neural Network 9
  2-3-2 Deep Neural Network 10
  2-3-3 Activation Function 12
  2-3-4 Back Propagation 13
  2-3-5 Convolutional Neural Network 14
  2-3-6 Batch Normalization 16
Chapter 3 Related Work 17
 3-1 Neural-Network-Based Image Recognition Algorithms 17
 3-2 Human Detection Based on Histograms of Oriented Gradients 22
 3-3 Neural-Network-Based Object Detection Algorithms 23
 3-4 Neural-Network-Based Point Cloud Recognition Architectures 27
 3-5 Neural-Network-Based Point Cloud Recognition Architectures for Driving Applications 31
Chapter 4 Point Cloud Classification for Driving Assistance Systems Based on the Dense Residual Fusion Network 35
 4-1 Adaptive Ground Point Detection Algorithm 36
 4-2 Point Cloud Segmentation Based on Watershed Clustering 38
 4-3 Projecting Point Clouds of Interest onto 2D Images 39
 4-4 Feature Fusion Network Architecture Based on Dense Residual Blocks 42
  4-4-1 Naive Bayes Selective Fusion Architecture 46
  4-4-2 Feature Fusion Architecture with Hierarchical Auxiliary Learning 49
  4-4-3 Selective Fusion Architecture with Hierarchical Stage-Wise Learning 51
 4-5 Loss Function 53
Chapter 5 Experimental Environment and Data Analysis 55
 5-1 Experimental Datasets 55
 5-2 Ablation Study 57
 5-3 Comparison of Selective Fusion Methods 60
 5-4 Impact of Hierarchical Auxiliary Learning and Discussion 62
 5-5 Impact of Selective Fusion with Hierarchical Stage-Wise Training and Discussion 63
 5-6 Comparison with Other Point Cloud Recognition Architectures 64
 5-7 Error Analysis 65
Chapter 6 Conclusion and Future Work 67
 6-1 Conclusion 67
 6-2 Future Work 67
References 68


Full Text Availability: on campus 2022-08-29; off campus 2022-08-29