| Graduate student: | 邱建智 Qiu, Jian-Zhi |
|---|---|
| Thesis title: | 應用模擬退火於點雲語義分割之自訓練域適應演算法研究 Self-Training Based Domain Adaptation Algorithm Using Simulated Annealing for Point Cloud Semantic Segmentation |
| Advisor: | 謝明得 Shieh, Ming-Der |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2023 |
| Academic year of graduation: | 111 |
| Language: | English |
| Number of pages: | 52 |
| Keywords (Chinese): | 點雲語義分割、域偏移、域適應、自訓練、模擬退火 |
| Keywords (English): | point cloud semantic segmentation, domain shift, domain adaptation, self-training, simulated annealing |
Point cloud semantic segmentation remains an actively studied topic and is an important task in computer vision, widely used in vision-related applications such as autonomous driving, drones, and machine navigation. However, annotated point cloud data is difficult to obtain, and different point cloud scenes follow different data distributions, which leads to domain shift in real-world deployment. Domain adaptation aims to solve this problem, but it is still at an early stage of development for 3D data. This thesis adopts self-training, a domain adaptation method from the 2D domain, and applies it to point cloud semantic segmentation.
Self-training is a form of semi-supervised learning. Inspired by simulated annealing, this thesis adjusts the mechanism by which self-training selects candidate pseudo-labels, replacing the previously unexplainable parts with rules that are more consistent with natural laws. Whereas traditional self-training methods have limited effect in point cloud semantic segmentation, the proposed method allows the model to generate more stable pseudo-labels during self-training, thereby improving the model’s cross-domain performance.
The experiments and performance evaluation use indoor-scene point cloud datasets, with S3DIS as the source domain and ScanNet as the target domain. The results show that the proposed method achieves better performance in domain transfer, with a 6.9% accuracy improvement in the S3DIS-to-ScanNet domain adaptation experiment.
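The abstract does not spell out how simulated annealing changes pseudo-label selection, so the following is only an illustrative sketch and not the thesis's actual algorithm. It assumes a Metropolis-style acceptance rule: predictions at or above a confidence threshold are always kept, while lower-confidence predictions are kept with probability exp(-(threshold - confidence) / T), where the temperature T is lowered each self-training round. The function names, the geometric cooling schedule, and all parameter values are hypothetical.

```python
import math
import random


def cooled_temperature(t0, alpha, round_index):
    """Geometric cooling, T_k = t0 * alpha**k (an assumed schedule)."""
    return t0 * (alpha ** round_index)


def select_pseudo_labels(probs, threshold, temperature, rng):
    """Pick pseudo-labels from per-point class probabilities.

    probs: list of per-point probability lists (one entry per class).
    Returns one entry per point: the predicted class index if the point
    is accepted as a pseudo-label, or None if it is ignored this round.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    labels = []
    for p in probs:
        # Predicted class and its confidence.
        label = max(range(len(p)), key=p.__getitem__)
        confidence = p[label]
        if confidence >= threshold:
            # Confident predictions are always accepted.
            labels.append(label)
            continue
        # Metropolis-style rule (assumption): the further below the
        # threshold and the lower the temperature, the less likely
        # the prediction is to be accepted.
        accept_prob = math.exp(-(threshold - confidence) / temperature)
        labels.append(label if rng.random() < accept_prob else None)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [[0.95, 0.05], [0.6, 0.4], [0.2, 0.8]]
    for k in range(3):
        t = cooled_temperature(t0=1.0, alpha=0.5, round_index=k)
        print(k, t, select_pseudo_labels(probs, 0.9, t, rng))
```

Under this assumed rule, early rounds with a high temperature admit many uncertain pseudo-labels, and later rounds approach plain confidence thresholding as the temperature falls.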