| Author: | Chen, Pin-Tong (陳品彤) |
|---|---|
| Title: | Enhancing Training Efficiency of Query-Based Lane Detection with Polynomial Modeling via Mixed Supervision (結合多重監督以提升基於查詢的多項式建模車道線偵測網路訓練效率) |
| Advisor: | Yang, Jar-Ferr (楊家輝) |
| Degree: | Master |
| Department: | Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science |
| Year of publication: | 2025 |
| Academic year: | 113 (ROC calendar, i.e., 2024–2025) |
| Language: | English |
| Pages: | 51 |
| Keywords: | Deep Learning, Autonomous Driving, Lane Detection, Query-based Detection Network, Transformer |
Lane detection is a fundamental task in autonomous driving, essential for vehicle control, path planning, and safety. Recent query-based frameworks such as LSTR have advanced the field by representing lane lines as polynomial curves and adopting transformer-based architectures. However, these methods rely on one-to-one matching via the Hungarian algorithm, which yields sparse supervision and slow training convergence. This thesis introduces an efficient mixed supervision strategy for query-based lane detection. Instead of introducing additional queries, we reorder the self-attention and cross-attention layers in the transformer decoder, extract the resulting intermediate features, and supervise them with a one-to-many matching loss. This design provides stronger training signals and faster convergence while leaving the model structure used at inference unchanged.
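The contrast between Hungarian one-to-one matching and a one-to-many scheme can be sketched numerically. This is an illustrative sketch only, not the thesis's implementation: the random cost matrix, the query count, and the choice of `k` lowest-cost queries per lane are all assumptions standing in for the actual classification and curve-regression cost terms.

```python
# Sketch: one-to-one (Hungarian) vs. one-to-many matching over a
# query-to-lane cost matrix. All numbers here are illustrative.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
cost = rng.random((6, 2))  # 6 queries, 2 ground-truth lanes

# One-to-one (Hungarian): each ground-truth lane supervises exactly one
# query, so only 2 of the 6 queries receive a positive training signal.
rows, cols = linear_sum_assignment(cost)
one_to_one = list(zip(rows.tolist(), cols.tolist()))

# One-to-many: each ground-truth lane supervises its k lowest-cost
# queries, giving the auxiliary branch a denser supervision signal.
k = 2
one_to_many = [(int(q), g) for g in range(cost.shape[1])
               for q in np.argsort(cost[:, g])[:k]]

print(len(one_to_one), len(one_to_many))  # 2 positive pairs vs. 4
```

The point of the sketch is only the count of supervised queries: denser positive assignments in an auxiliary branch are what speed up convergence, while inference can still use the plain one-to-one head.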
Experiments on the TuSimple and CULane datasets show that the proposed method achieves competitive accuracy with significantly improved training efficiency. Our results demonstrate that mixed supervision can enhance the performance of query-based lane detectors without increasing computational cost, offering a practical solution for robust and efficient autonomous driving systems.
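Polynomial modeling of lanes, as used by LSTR-style detectors, can be illustrated with a minimal sketch: a lane is a curve x = f(y) compressed into a handful of polynomial coefficients that a query can regress directly. The normalized sample points, the degree, and the coefficients below are made-up values for illustration, not the thesis's parameterization.

```python
# Sketch of polynomial lane modeling: a lane's image points collapse
# into a few coefficients instead of a dense per-pixel mask.
import numpy as np

y = np.linspace(0.0, 1.0, 20)         # normalized row positions in the image
true_coeffs = [0.3, -0.8, 0.5, 0.4]   # an assumed cubic lane shape
x = np.polyval(true_coeffs, y)        # column positions along the lane

coeffs = np.polyfit(y, x, deg=3)      # 4 numbers fully describe this lane
x_hat = np.polyval(coeffs, y)         # reconstruct the curve from them

print(np.allclose(x_hat, x))          # prints True: exact cubic is recovered
```

Regressing a fixed-length coefficient vector per query is what lets a transformer decoder predict whole lanes end to end, without post-hoc curve fitting on segmentation output.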