
Author: Wang, Hsiang-Chao (王祥兆)
Title: Point Cloud Object Classification with Iterative Probability Updating (基於迭代機率更新方法之點雲物件分類)
Advisor: Kuo, Chih-Hung (郭致宏)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2022
Academic year of graduation: 110 (2021–2022)
Language: Chinese
Pages: 54
Keywords: 3-D point cloud, deep learning, iterative probability updating, attention mechanism

    Abstract: As lidar technology finds increasingly wide application, point cloud classification systems continue to evolve. However, the data structure of point clouds is unordered and sparse, which makes learning meaningful features from them challenging. In this thesis, we propose the Iterative Probability Updating Network (IPU-Net) for the point cloud classification task. Unlike previous methods, which obtain classification probabilities only at the final stage of the network, IPU-Net predicts classification probabilities stage by stage through the proposed Probability Updating Module (PUM): the distribution predicted at the previous stage serves as the prior for the current stage, yielding more accurate predictions. At each encoding stage, we design the Dual-Scale Attention Fusion Module (DSAFM), an attention-based module that extracts rich representations at both local and global scales. Compared with state-of-the-art methods, IPU-Net improves accuracy by 0.9% on the ScanObjectNN classification benchmark.
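    To make the updating scheme concrete, the following is a minimal PyTorch sketch of stage-wise probability updating, written from the abstract alone; it is not the thesis implementation. The names (StagePUM, IPUSketch), the layer sizes, the uniform initial prior, and the concatenation of the prior with pooled features are all illustrative assumptions, and the DSAFM encoders are replaced by a plain shared MLP for brevity.

```python
# Hedged sketch of stage-wise probability updating for point cloud
# classification. NOT the thesis code: layer sizes, the prior-fusion
# rule, and all names here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StagePUM(nn.Module):
    """One probability-updating stage: refine the class distribution
    using pooled point features plus the previous stage's prediction
    as a prior."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.encoder = nn.Sequential(            # shared per-point MLP
            nn.Conv1d(feat_dim, feat_dim, 1),    # (stand-in for DSAFM)
            nn.BatchNorm1d(feat_dim),
            nn.ReLU(),
        )
        # The classifier sees pooled features concatenated with the prior.
        self.classifier = nn.Linear(feat_dim + num_classes, num_classes)

    def forward(self, x, prior):
        # x: (B, C, N) point features; prior: (B, K) class probabilities
        x = self.encoder(x)
        pooled = x.max(dim=2).values             # global max-pool over points
        logits = self.classifier(torch.cat([pooled, prior], dim=1))
        probs = F.softmax(logits, dim=1)         # becomes the next stage's prior
        return x, probs

class IPUSketch(nn.Module):
    def __init__(self, feat_dim=64, num_classes=15, num_stages=3):
        super().__init__()                       # 15 classes as in ScanObjectNN
        self.stem = nn.Conv1d(3, feat_dim, 1)    # lift xyz into feature space
        self.stages = nn.ModuleList(
            StagePUM(feat_dim, num_classes) for _ in range(num_stages))
        self.num_classes = num_classes

    def forward(self, xyz):                      # xyz: (B, 3, N)
        x = self.stem(xyz)
        # Assumption: start from a uniform prior over classes.
        probs = torch.full((xyz.size(0), self.num_classes),
                           1.0 / self.num_classes, device=xyz.device)
        all_probs = []
        for stage in self.stages:
            x, probs = stage(x, probs)
            all_probs.append(probs)              # every stage emits a prediction
        return all_probs

# Usage: the last stage's distribution serves as the final prediction.
model = IPUSketch()
points = torch.randn(2, 3, 1024)
print(model(points)[-1].shape)                   # torch.Size([2, 15])
```

    One plausible training scheme, consistent with the stage-wise design, is to apply a classification loss to every stage's output and take the last stage's distribution at inference; the abstract does not specify the exact loss formulation.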

    Table of Contents:
    Abstract (Chinese)
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1-1  Preface
      1-2  Research Motivation
      1-3  Research Contributions
      1-4  Thesis Organization
    Chapter 2  Background
      2-1  Deep Learning
        2-1-1  Artificial Neural Networks
        2-1-2  Deep Neural Networks
        2-1-3  Activation Functions
        2-1-4  Back Propagation
        2-1-5  Convolutional Neural Networks
        2-1-6  Batch Normalization
      2-2  Attention Mechanisms
        2-2-1  Self-Attention
        2-2-2  Channel Attention
    Chapter 3  Related Work
      3-1  Projection-Based Point Cloud Object Classification
        3-1-1  Multi-View Projection-Based Methods
        3-1-2  Voxel-Based Methods
      3-2  Point-Based Point Cloud Object Classification
        3-2-1  The PointNet Family
        3-2-2  Dynamic Graph Convolution-Based Methods
        3-2-3  Transformer-Based Methods
        3-2-4  Dual-Scale Architectures
      3-3  Comparison of Related Methods
    Chapter 4  Point Cloud Classification with Iterative Probability Updating
      4-1  IPU-Net Architecture
      4-2  Probability Updating Module (PUM)
      4-3  Dual-Scale Attention Fusion Module (DSAFM)
        4-3-1  Local-Scale Encoder (LSE)
        4-3-2  Global-Scale Encoder (GSE)
        4-3-3  Channel Attention Module (CAM)
      4-4  Loss Function
    Chapter 5  Experiments and Analysis
      5-1  Datasets
        5-1-1  ModelNet40
        5-1-2  ScanObjectNN
      5-2  Implementation Details
      5-3  Architecture Analysis
        5-3-1  Effectiveness of Iterative Probability Updating
        5-3-2  Effect of the Number of Iterations on Recognition Performance
        5-3-3  Effectiveness of the Dual-Scale Attention Module
      5-4  Comparison with Other Point Cloud Classification Architectures
    Chapter 6  Conclusion and Future Work
      6-1  Conclusion
      6-2  Future Work
    References


    Full text available on campus: 2024-08-31
    Full text available off campus: 2024-08-31