
Graduate Student: Chen, Yen-Ting (陳彥廷)
Thesis Title: An AI-based Smart Badminton Game Analysis System (基於人工智慧之智慧型羽球競賽分析系統)
Advisor: Yang, Jar-Ferr (楊家輝)
Degree: Master
Department: Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2022
Academic Year of Graduation: 110 (2021-22)
Language: English
Pages: 47
Keywords: intelligent sports, badminton contests, object detection, object tracking, convolutional neural networks, artificial intelligence, channel spatial attention modules, cross-level channel fusion module
Abstract: As the number of viewers of sports competitions continues to grow, more and more sporting events are introducing smart sports systems into their contests. Such systems not only make matches more engaging to watch but also improve their fairness. In this thesis, we propose a smart sports system for badminton game analysis, which can be used in TV broadcasts and in everyday badminton video analysis. Our system consists of three main modules: a detection module, a tracking module, and an exhibition module. The detection module detects the location and category of objects in each frame. Within it, we propose a channel spatial attention module (CSAM) and a cross-level channel fusion module (CCFM), as well as an implicit decoupled head, to improve the detection of small objects; we also design an auxiliary detector to further improve detection results during inference. The tracking module tracks the four players and obtains their positions on the court. The exhibition module computes each player's moving distance and draws the players' moving trajectories on a trajectory map. Experimental results show that our proposed detection module outperforms other YOLO-series methods in detecting small objects.
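
    This record does not reproduce the CSAM architecture itself, so as a rough illustration, the following is a minimal sketch of a channel-then-spatial attention block in the style of CBAM. The reduction ratio, kernel size, and ordering below are assumptions for illustration, not the thesis's exact design.

        # Minimal channel-then-spatial attention sketch (CBAM-style).
        # NOTE: an assumed illustration, not the thesis's exact CSAM.
        import torch
        import torch.nn as nn

        class ChannelSpatialAttention(nn.Module):
            def __init__(self, channels: int, reduction: int = 16):
                super().__init__()
                # Channel attention: squeeze spatial dims, excite per channel.
                self.channel_mlp = nn.Sequential(
                    nn.Linear(channels, channels // reduction),
                    nn.ReLU(inplace=True),
                    nn.Linear(channels // reduction, channels),
                )
                # Spatial attention: 7x7 conv over pooled channel statistics.
                self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                b, c, _, _ = x.shape
                # Channel attention from average- and max-pooled descriptors.
                avg = self.channel_mlp(x.mean(dim=(2, 3)))
                mx = self.channel_mlp(x.amax(dim=(2, 3)))
                x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
                # Spatial attention from per-pixel channel statistics.
                stats = torch.cat([x.mean(dim=1, keepdim=True),
                                   x.amax(dim=1, keepdim=True)], dim=1)
                return x * torch.sigmoid(self.spatial_conv(stats))

    Blocks like this are typically inserted between backbone stages so that small, low-contrast objects (such as a shuttlecock) receive larger feature responses before the neck fuses scales.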
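    Downstream, the tracking-to-exhibition handoff amounts to projecting each tracked player's foot point into court coordinates and accumulating per-player distance. Below is a minimal sketch with OpenCV; the corner coordinates, helper names, and the fixed-camera homography assumption are illustrative and not taken from the thesis.

        # Sketch: image-plane foot points -> court coordinates -> distances.
        # NOTE: illustrative only; names and corner values are hypothetical.
        import numpy as np
        import cv2

        # Homography from image pixels to court meters, estimated once from
        # the four court corners (a doubles court is 6.1 m x 13.4 m).
        img_corners = np.float32([[420, 180], [1500, 180], [1760, 980], [160, 980]])
        court_corners = np.float32([[0, 0], [6.1, 0], [6.1, 13.4], [0, 13.4]])
        H, _ = cv2.findHomography(img_corners, court_corners)

        def to_court(foot_xy: np.ndarray) -> np.ndarray:
            """Map an (x, y) image point to court coordinates in meters."""
            pt = cv2.perspectiveTransform(
                foot_xy.reshape(1, 1, 2).astype(np.float32), H)
            return pt.reshape(2)

        # Per-player trajectory and cumulative distance, fed by the tracker.
        trajectories = {pid: [] for pid in range(4)}
        distances = {pid: 0.0 for pid in range(4)}

        def update(pid: int, foot_xy: np.ndarray) -> None:
            pos = to_court(foot_xy)
            if trajectories[pid]:
                distances[pid] += float(np.linalg.norm(pos - trajectories[pid][-1]))
            trajectories[pid].append(pos)

    The accumulated trajectories can then be rasterized onto a top-down court image to produce the trajectory map described in the abstract.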

    Table of Contents
    Abstract (in Chinese)
    Abstract
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Research Background
        1.2 Motivations
        1.3 Literature Reviews
        1.4 Thesis Organization
    Chapter 2 Related Work
        2.1 Cross Stage Partial Network
        2.2 YOLOv4
        2.3 Spatial Pyramid Pooling Module
        2.4 Path Aggregation Network
        2.5 Efficient Channel Attention Network
        2.6 Cross-level Gating Module
    Chapter 3 The Proposed Smart Badminton Game Analysis System
        3.1 Overview of the Proposed SBGA System
        3.2 Network Architecture of Detection Module
            3.2.1 Backbone Module
            3.2.2 Channel Spatial Attention Module
            3.2.3 Neck Module
            3.2.4 Cross-level Channel Fusion Module
            3.2.5 Head Module
        3.3 Data Postprocessor of Detection Module
        3.4 Loss Functions of Detection Module
            3.4.1 Bounding Box Regression Loss
            3.4.2 Object Loss
            3.4.3 Classification Loss
        3.5 Auxiliary Detector
        3.6 Tracking Module
        3.7 Exhibition Module
    Chapter 4 Experimental Results
        4.1 Environmental Settings and Datasets
        4.2 Comparisons with Other Methods
        4.3 Ablation Studies
            4.3.1 Proposed Detection Module
            4.3.2 Auxiliary Detector
        4.4 Demonstration Results
    Chapter 5 Conclusions
    Chapter 6 Future Work
    References

