
Author: Ye, Shi-Qi (葉詩棋)
Title: Deep Learning-Based Referee Gesture Recognition for Volleyball Event Detection and Match Segmentation with Automatic Scoring
Advisor: Hsu, Yi-Yu (徐禕佑)
Degree: Master
Department: Miin Wu School of Computing (敏求智慧運算學院) – MS Degree Program on Intelligent Technology Systems
Year of Publication: 2025
Graduation Academic Year: 113 (2024–2025)
Language: Chinese
Pages: 102
Chinese Keywords: 排球裁判手勢辨識, 手勢序列分析, 賽事影片自動切割
Keywords: Volleyball Referee Gesture Recognition, Gesture Sequence Analysis, Automated Match Video Segmentation
Table of Contents:
    Chinese Abstract
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1  Introduction
      1-1 Preface
      1-2 Research Motivation
      1-3 Research Contributions
      1-4 Thesis Organization
    Chapter 2  Background and Related Work
      2-1 Referee Gesture Recognition and Match Analysis
        2-1.1 Referee Signal Recognition in Ball Sports
        2-1.2 Volleyball Match Analysis Methods
      2-2 Development of 3D Convolutional Neural Networks
        2-2.1 Convolutional 3D (C3D)
        2-2.2 Two-Stream Inflated 3D ConvNet (I3D)
        2-2.3 Separable 3D Convolution (S3D)
        2-2.4 The R(2+1)D Spatiotemporal Convolution Block
        2-2.5 SlowFast Networks
        2-2.6 Expand 3D (X3D)
      2-3 Transformers in Computer Vision
        2-3.1 Transformer Architecture and the Self-Attention Mechanism
        2-3.2 Evolution of Vision Transformer Models
        2-3.3 Transformer Techniques for Video Recognition
        2-3.4 Transformers for Object Detection
    Chapter 3  Methodology
      3-1 Referee Gesture Dataset Construction
        3-1.1 Data Collection
        3-1.2 Action and Gesture Class Definitions
        3-1.3 Data Preprocessing
        3-1.4 Dataset Structure and Statistics
      3-2 Related Techniques and Baseline Methods
        3-2.1 Traditional Machine Learning Methods
        3-2.2 Keypoint-Based Deep Learning
        3-2.3 Image-Based Deep Learning
      3-3 Referee Gesture Recognition System
        3-3.1 Main Model Architecture
        3-3.2 Feature Extraction and Optimization Methods
      3-4 Automated System Implementation
        3-4.1 Gesture Detection and Segmentation Algorithm
        3-4.2 Automatic Scoring Mechanism
        3-4.3 System Integration and Interface Design
    Chapter 4  Experimental Setup and Results
      4-1 Experimental Setup
      4-2 Evaluation Metrics
      4-3 Model and System Performance Evaluation
        4-3.1 Recognition Accuracy Analysis
        4-3.2 Latency and Real-Time Performance Analysis
        4-3.3 Evaluation of Automatic Video Segmentation
        4-3.4 Scoring System Accuracy Analysis
      4-4 Error Analysis
        4-4.1 Error Analysis on Official YouTube Match Videos
        4-4.2 Error Analysis on Enterprise Volleyball League (中正企業排球聯賽) Test Results
    Chapter 5  Discussion
      5-1 System Strengths and Limitations
      5-2 Challenges in Real-World Deployment
    Chapter 6  Conclusion and Future Work
      6-1 Conclusion
      6-2 Future Work
    References


Full-Text Availability: campus access opens 2030-07-17; off-campus access opens 2030-07-17.
The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.