
Graduate Student: 賴冠綸 (Lai, Guan-Lun)
Thesis Title: Study of Object Tracking and Spatial Positioning Technology (物件追蹤與空間定位技術研究)
Advisor: 蔡宗祐 (Tsai, Tzong-Yow)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Institute of Microelectronics
Year of Publication: 2023
Academic Year of Graduation: 111 (2022-2023)
Language: Chinese
Number of Pages: 69
Chinese Keywords: YOLOv7、物件辨識、振鏡器、全光纖式光達
Foreign Keywords: YOLOv7, object detection, scanning mirror device, all-fiber LiDAR
    The goal of this thesis is to implement tracking and ranging of specific objects in an arbitrary space; the part completed so far is ranging at a fixed distance. The experimental setup is fairly large and is divided into two main parts. The first part is YOLOv7, released by Academia Sinica, which rapidly detects objects and draws bounding boxes around them. The second part is a scanning mirror device combined with an all-fiber pulsed laser system, referred to as an all-fiber LiDAR. The center coordinates of an object's predicted bounding box are sent to the scanning mirror device, which steers toward the object to achieve tracking; the pulsed laser then strikes the object's surface, and the distance is calculated by analyzing the time difference between the emitted and reflected pulses. Running only the object detection system, the overall speed reaches an average of 18.27 fps; running all of the equipment drops the overall speed by about 2 fps to 16.23 fps. Ranging accuracy is also something we pay close attention to: with objects fixed at distances of 1.5 m and 3 m, the measured difference between the two was 150.48 cm, an error of about 0.32%, which is quite accurate.
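    The steering step described above reduces each detection to a single aim point for the mirror. A minimal Python sketch of that conversion follows, assuming pixel-space bounding boxes and a mirror interface that accepts coordinates normalized to [-1, 1]; the function name, frame size, and steering range are illustrative assumptions rather than the thesis code.

    # Hypothetical helper: map a YOLOv7 bounding box (pixel corners) to a
    # normalized center coordinate for the scanning mirror. The [-1, 1]
    # steering range and the frame size are assumptions, not the thesis values.
    def bbox_center_normalized(x1, y1, x2, y2, frame_w, frame_h):
        cx = (x1 + x2) / 2.0           # box center, x (pixels)
        cy = (y1 + y2) / 2.0           # box center, y (pixels)
        nx = 2.0 * cx / frame_w - 1.0  # -1 = left edge, +1 = right edge
        ny = 2.0 * cy / frame_h - 1.0  # -1 = top edge,  +1 = bottom edge
        return nx, ny

    # Example: a detection in a 640 x 480 frame.
    print(bbox_center_normalized(300, 180, 420, 320, 640, 480))  # (0.125, ~0.042)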

    The aim of this research is to develop a system capable of tracking and measuring the distance to specific objects in any space. The current implementation covers distance measurement at fixed distances. The experimental setup consists of two main components: the first part uses YOLOv7, developed at Academia Sinica, Taiwan, which enables fast object detection and bounding box localization. The second part consists of a scanning mirror device and an all-fiber pulsed laser system, referred to as an all-fiber LiDAR. The scanning mirror device steers toward the center coordinates of the predicted bounding boxes to track the object, while the pulsed laser emits light toward the surface of the tracked object. By analyzing the delay between the emitted and reflected pulses, the distance between the device and the object is calculated. Running only the object detection system achieves an average speed of 18.27 fps; when all devices are operating, the overall speed decreases by about 2 fps to 16.23 fps. The accuracy of the distance measurement is also a significant aspect of this research. With objects placed at fixed distances of 1.5 m and 3 m, the measured separation between them was 150.48 cm, an error of approximately 0.32%. This represents a remarkably high level of precision.
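    The ranging arithmetic referred to above is the standard pulse time-of-flight relation, distance = c * delay / 2, and the 0.32% figure is the relative error of the measured 150.48 cm against the expected 150 cm separation. A minimal Python sketch of that arithmetic, assuming the emit-to-echo delay has already been measured (names are illustrative, not the thesis code):

    C = 299_792_458.0  # speed of light (m/s)

    def distance_from_delay(delay_s):
        """Target distance from the emit-to-echo delay (one round trip)."""
        return C * delay_s / 2.0

    # A 10 ns round-trip delay corresponds to roughly 1.5 m.
    print(distance_from_delay(10e-9))          # ~1.499 m

    # Relative error of the 1.5 m vs 3 m separation reported above.
    measured_cm, expected_cm = 150.48, 150.0
    print(abs(measured_cm - expected_cm) / expected_cm * 100)  # 0.32 (%)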

    Table of Contents:
    Abstract i
    Acknowledgements xi
    Table of Contents xii
    List of Tables xv
    List of Figures xvi
    Chapter 1 Introduction 1
    1.1 Preface 1
    1.2 Research Motivation 4
    Chapter 2 Experimental Principles 7
    2.1 Deep Learning 7
    2.1.1 Dataset Preprocessing 8
    2.1.2 Loss Function 8
    2.1.3 Optimization 9
    2.1.4 Learning Rate 10
    2.1.5 Epochs and Batches 11
    2.1.6 Underfitting and Overfitting 11
    2.2 YOLO 13
    2.2.1 YOLO Network Architecture 14
    2.2.2 Convolutional Layers 15
    2.2.3 Padding 16
    2.2.4 Pooling Layers 17
    2.2.5 Flattening 18
    2.2.6 Fully Connected Layers 18
    2.3 YOLOv7 19
    2.3.1 Extended Efficient Layer Aggregation Networks 19
    2.3.2 Model Scaling 21
    2.3.3 Model Re-parameterization 22
    2.3.4 Dynamic Label Assignment 24
    Chapter 3 Object Detection Implementation 25
    3.1 Simulation Workflow 25
    3.2 Hardware 26
    3.3 Preparation for Custom Training 26
    3.3.1 Software Introduction 26
    3.3.2 Environment Setup 27
    3.4 Data Processing 29
    3.4.1 Video Collection and Splitting 29
    3.4.2 Image Labeling 30
    3.4.3 Dataset Splitting 31
    3.5 Training 32
    3.5.1 Setting Training Parameters 32
    3.5.2 Training Process and Generated Results 33
    3.6 Model Evaluation Methods 34
    3.6.1 IoU 34
    3.6.2 Confusion Matrix 35
    3.7 Analysis of Training Results 38
    3.7.1 Comparison of Training Curves 39
    3.8 Model Testing 44
    Chapter 4 All-Fiber LiDAR Implementation 48
    4.1 Scanning Mirror Device 48
    4.2 Calling Python from MATLAB 50
    4.3 Integrating Software and Hardware 51
    4.3.1 Reading the Camera with MATLAB 52
    4.3.2 Data Type Conversion and Coordinate Normalization 53
    4.4 Measurement Results 54
    4.4.1 Result Analysis 54
    4.4.2 Discussion of Issues 56
    4.5 All-Fiber LiDAR 59
    4.5.1 All-Fiber LiDAR Architecture 59
    4.5.2 Pulse Measurement 61
    Chapter 5 Conclusions and Future Work 65
    5.1 Summary 65
    5.2 Future Work 66
    References 67

