National Cheng Kung University Theses and Dissertations System

Graduate student:	邱建智 Qiu, Jian-Zhi
Thesis title:	應用模擬退火於點雲語義分割之自訓練域適應演算法研究 Self-Training Based Domain Adaptation Algorithm Using Simulated Annealing for Point Cloud Semantic Segmentation
Advisor:	謝明得 Shieh, Ming-Der
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2023
Academic year of graduation:	111
Language:	English
Number of pages:	52
Keywords (Chinese):	點雲語義分割、域偏移、域適應、自訓練、模擬退火
Keywords (English):	point cloud semantic segmentation, domain shift, domain adaptation, self-training, simulated annealing

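Point cloud semantic segmentation remains an actively studied topic. It is one of the important tasks in computer vision and is widely applied to vision-related tasks such as autonomous driving, drones, and robot navigation. However, annotated point cloud data is hard to obtain, and different point cloud scenes have different data distributions, so real-world deployment runs into the problem of domain shift. Domain adaptation aims to solve this problem, but it is still at an early stage of development for 3D data. This thesis takes a 2D domain adaptation method, self-training, and applies it to the point cloud semantic segmentation task.
Self-training is a form of semi-supervised learning. Building on the self-training technique and inspired by simulated annealing, this thesis adjusts the original mechanism for selecting candidate pseudo-labels, replacing the parts that previously lacked an explanation with rules that are more consistent with natural laws. Whereas traditional self-training methods have limited effect on point cloud semantic segmentation, the proposed method lets the model generate more stable pseudo-labels during self-training, which improves its cross-domain performance.
The experiments and performance evaluation in this thesis use indoor-scene point cloud datasets, with S3DIS as the source domain dataset and ScanNet as the target domain dataset. The experimental results show that the proposed approach achieves better performance in domain transfer, with a 6.9% accuracy improvement in the S3DIS-to-ScanNet domain adaptation experiment.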

Point cloud semantic segmentation is still being actively studied and is one of the important tasks in computer vision. It is widely used in vision-related applications such as autonomous driving, drones, and machine navigation. However, annotated point cloud data is difficult to obtain, and different point cloud scenes have different data distributions, which leads to the problem of domain shift in real-world implementations. Domain adaptation aims to solve this problem, but domain adaptation for 3D data is still in the early stages of development. This thesis adopts a 2D domain adaptation method, self-training, and applies it to the point cloud semantic segmentation task.
Self-training is a type of semi-supervised learning. Inspired by simulated annealing, this thesis adjusts the mechanism by which the original self-training method selects candidate pseudo-labels, replacing the previously inexplicable parts with rules more in line with natural laws. Whereas traditional self-training methods have limited effect in point cloud semantic segmentation, the proposed method allows the model to generate more stable pseudo-labels during the self-training process, thus improving the model's cross-domain performance.
The experiments and performance evaluation in this thesis are conducted on indoor-scene point cloud datasets, with S3DIS as the source domain dataset and ScanNet as the target domain dataset. The results show that the proposed method achieves better performance in domain transfer, with a 6.9% accuracy improvement in the S3DIS-to-ScanNet experiment.
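The abstract does not give the selection rule itself, so the following is only a minimal sketch of what a simulated-annealing-style pseudo-label selection could look like, not the thesis's algorithm. It borrows two ideas named in the table of contents (a softmax with temperature, and sampling with a probability given by a reversed Fermi-Dirac function). The function names, the logistic form of the acceptance probability, the threshold, and all numeric values are assumptions made for illustration.

```python
import numpy as np


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over the last axis; a higher temperature gives a flatter output."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def acceptance_probability(confidence, threshold, temperature):
    """Assumed 'reversed' Fermi-Dirac form: 1 / (1 + exp(-(c - threshold) / T)).

    It increases with confidence and approaches a hard threshold as T -> 0.
    """
    x = (np.asarray(confidence, dtype=float) - threshold) / temperature
    x = np.clip(x, -60.0, 60.0)  # keep exp() from overflowing at very small T
    return 1.0 / (1.0 + np.exp(-x))


def sample_pseudo_labels(probs, threshold, temperature, rng):
    """Return the argmax label per point and a random accept mask."""
    labels = probs.argmax(axis=-1)
    confidence = probs.max(axis=-1)
    p_accept = acceptance_probability(confidence, threshold, temperature)
    accept = rng.random(confidence.shape) < p_accept
    return labels, accept


# Toy data: 1000 points, 8 classes (both numbers are arbitrary placeholders).
rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 8))
probs = softmax_with_temperature(logits, temperature=2.0)
labels, accept = sample_pseudo_labels(probs, threshold=0.3, temperature=0.05, rng=rng)
print(f"accepted {accept.sum()} of {accept.size} pseudo-labels")
```

In a self-training loop, the selection temperature would typically be lowered from round to round, so that early rounds occasionally admit low-confidence pseudo-labels and later rounds behave almost like a hard confidence threshold. That schedule is likewise an assumption here.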

Abstract (in Chinese)
Abstract
Acknowledgements
Contents
List of Tables
List of Figures
Chapter 1	Introduction
1	Motivation
2	Thesis Overview
3	Thesis Organization
Chapter 2	Background
1	Point Cloud Semantic Segmentation
1.1	Deep Learning-Based Point Cloud Semantic Segmentation Method
1.2	Representative Work Review
2	Domain Adaptation
2.1	Domain Adaptation Method
2.2	Domain Adaptation on Point Cloud Semantic Segmentation
3	Self-Training
3.1	Traditional Self-Training
3.2	Class-Balanced Self-Training
3.3	Confidence Regularized Self-Training
4	Concept of Simulated Annealing
4.1	Simulated Annealing
Chapter 3	Proposed Algorithm based on CRST-LR and Simulated Annealing
1	Flow Overview
1.1	System Flow
1.2	Loss Function
2	Temperature of Soft Pseudo-Label
2.1	Softmax with Temperature
3	Criteria of Pseudo-Label Selection
3.1	Growth Criteria of k_c
3.2	Sampling with Probability by Reverse Fermi-Dirac Function
3.3	Treated as Weighted Loss
Chapter 4	Experimental Results and Evaluation
1	Experimental Setup
1.1	S3DIS Dataset & ScanNet Dataset
1.2	Experiment Label Mapping
1.3	Other Experiment Settings
2	Performance of Domain Adaptation
2.1	Hyperparameter Selection
2.2	Domain Adaptation Results
2.3	Ablation Study
3	Implementation on 2D Dataset
Chapter 5	Conclusion and Future Work
1	Conclusion
2	Future Work
References

Public release date: 2026-02-06