
Author: Chen, Shou-Chih (陳守志)
Thesis title: Automatic Pitch Type Classification Based on Baseball Trajectory Features Using Machine Learning (基於棒球軌跡特徵之機器學習的投球種類自動分類)
Advisor: Hou, Ting-Wei (侯廷偉)
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of publication: 2026
Graduation academic year: 114 (ROC calendar, i.e. 2025-2026)
Language: Chinese
Number of pages: 58
Keywords: Baseball Pitching Analysis, Pitch Type Classification, Computer Vision, Pitcher Pose Estimation, Deep Learning
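  • This study designs and implements an automatic baseball pitch type classification system that takes footage from a single game broadcast as input. It aims to lower the cost barrier of high-end pitch analysis equipment while maintaining classification performance of practical value. The system integrates computer vision and machine learning techniques to automatically extract pitcher pose information and ball flight trajectories from real MLB game video, and uses them for pitch type recognition and real-time analysis.
    In the system architecture, YOLO-family models first perform baseball object detection and pitcher pose estimation, and a Kalman filter reconstructs the ball's flight trajectory after release. Statistical and temporal features are then extracted from the reconstructed trajectory to build three classification models, Random Forest (RF), Artificial Neural Network (ANN), and Long Short-Term Memory (LSTM), whose classification performance is compared under different feature configurations.
    Experimental results show that, with pitch velocity included as a feature, the ANN model achieves the highest classification accuracy, 92.2%, on the eight-class pitch type task. This is a clear improvement over existing studies that use only broadcast video (about 89.0%) and substantially narrows the gap to professional radar-based systems (about 96.0%). In addition, the LSTM model attains the highest recall across the overall tests, 91.7%, indicating an advantage in capturing the temporal dynamics of pitch trajectories.
    To address the class imbalance among pitch types that is common in baseball data, the experiments further compare the effect of different loss functions on model performance. Models trained with Focal Loss achieve about 1.6 percentage points higher overall accuracy than those trained with conventional Cross-Entropy Loss (90.6% vs. 89.0%), effectively improve recognition of minority pitch types, and strike a better balance between precision and recall.
    Taken together, these results confirm that image information from a single broadcast camera is sufficient to build a pitch analysis system with high classification performance and near-real-time computation, providing a feasible solution for low-cost baseball data analytics and intelligent broadcast applications.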

    This study presents an automatic baseball pitch type classification system using a single broadcast video, aiming to reduce the cost of pitch analysis while maintaining practical classification performance. The proposed system integrates computer vision and machine learning techniques to automatically extract pitcher pose information and reconstruct ball flight trajectories from real MLB broadcast footage.
    YOLO-based models are employed for baseball detection and pitcher pose estimation, while a Kalman filter is applied to reconstruct the ball trajectory after release. Based on the reconstructed trajectories, statistical and temporal features are extracted and used to train three classification models: Random Forest (RF), Artificial Neural Network (ANN), and Long Short-Term Memory (LSTM).
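    The trajectory reconstruction step can be illustrated with a minimal sketch. The abstract does not specify the motion model or noise settings, so everything here is an assumption for illustration: a constant-velocity model in image coordinates, one independent filter per axis, and the hypothetical names `AxisKalman` and `reconstruct_track`. Frames where the detector misses the ball are passed as `None` and are filled by the prediction step alone.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis (assumed model)."""
    pos: float
    vel: float = 0.0
    dt: float = 1.0    # time step in frames (assumed)
    q: float = 1e-2    # process-noise scale (assumed)
    r: float = 1.0     # measurement-noise variance in px^2 (assumed)
    p00: float = 1.0   # covariance entries; velocity starts very uncertain
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1e4

    def predict(self) -> None:
        dt = self.dt
        self.pos += dt * self.vel
        # P <- F P F^T with F = [[1, dt], [0, 1]]
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        # ... plus Q for white-noise acceleration
        self.p00 = p00 + self.q * dt ** 4 / 4
        self.p01 = p01 + self.q * dt ** 3 / 2
        self.p10 = p10 + self.q * dt ** 3 / 2
        self.p11 = self.p11 + self.q * dt ** 2

    def update(self, z: float) -> None:
        innovation = z - self.pos
        s = self.p00 + self.r                   # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s     # Kalman gain for H = [1, 0]
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        # P <- (I - K H) P, using the pre-update first row
        p00, p01 = self.p00, self.p01
        self.p00 = (1 - k0) * p00
        self.p01 = (1 - k0) * p01
        self.p10 -= k1 * p00
        self.p11 -= k1 * p01


Detection = Optional[Tuple[float, float]]


def reconstruct_track(detections: List[Detection]) -> List[Tuple[float, float]]:
    """Filter per-frame (x, y) ball detections; None marks a missed frame."""
    if not detections or detections[0] is None:
        raise ValueError("track must start with a detection")
    x0, y0 = detections[0]
    fx, fy = AxisKalman(pos=x0), AxisKalman(pos=y0)
    track = [(x0, y0)]
    for det in detections[1:]:
        fx.predict()
        fy.predict()
        if det is not None:  # missed frames keep the predicted position
            fx.update(det[0])
            fy.update(det[1])
        track.append((fx.pos, fy.pos))
    return track
```

    A real pitch curves under gravity and spin, so a constant-velocity model is only a rough approximation over short gaps. A fuller model would add acceleration terms to the state.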
    Experimental results show that, with the inclusion of pitch velocity information, the ANN model achieves the highest classification accuracy of 92.2% in an eight-class pitch classification task. Compared with previous broadcast-video-based studies (approximately 89.0% accuracy), this is an improvement of about 3.2 percentage points, and the gap to professional radar-based systems (approximately 96.0%) narrows to about 3.8 percentage points. In addition, the LSTM model achieves the highest overall recall of 91.7%, indicating its effectiveness in modeling temporal trajectory patterns.
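    Accuracy and recall can rank models differently when pitch types are unevenly represented, which is one reason to report both. The toy example below illustrates this. It assumes recall is macro-averaged over classes, which the abstract does not state, and the labels and counts are invented for illustration only.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class recall (each class counts equally)."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / n for c, n in support.items()) / len(support)


# Invented data: 8 fastballs, 2 curveballs, and a model that always says "fastball".
y_true = ["fastball"] * 8 + ["curveball"] * 2
y_pred = ["fastball"] * 10

print(accuracy(y_true, y_pred))      # 0.8
print(macro_recall(y_true, y_pred))  # 0.5
```

    The always-fastball model looks acceptable on accuracy but recovers none of the minority class, which macro-averaged recall exposes.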
    Furthermore, to address class imbalance in pitch datasets, this study evaluates different loss functions and shows that Focal Loss improves overall accuracy by approximately 1.6 percentage points compared with Cross-Entropy Loss (90.6% vs. 89.0%). These results confirm that reliable pitch classification can be achieved using only a single broadcast camera, providing a cost-effective solution for baseball analytics applications.
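    How Focal Loss differs from Cross-Entropy Loss can be shown with a small per-sample sketch. The definition follows the focal loss paper by Lin et al., FL(p_t) = -α_t (1 - p_t)^γ log(p_t). The abstract does not report which γ or α the thesis used, so γ = 2 (that paper's default) and the absence of class weights are assumptions here, and the function names are illustrative.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one sample's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Standard cross-entropy for a single sample."""
    return -math.log(softmax(logits)[target])


def focal_loss(
    logits: Sequence[float],
    target: int,
    gamma: float = 2.0,                       # focusing parameter (assumed value)
    alpha: Optional[Sequence[float]] = None,  # optional per-class weights (assumed unused)
) -> float:
    """Focal loss for a single sample; gamma = 0 reduces to cross-entropy."""
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


easy = [4.0, 0.0, 0.0]  # confident and correct for class 0
hard = [0.0, 4.0, 0.0]  # confident and wrong for class 0

# The easy sample's loss is scaled down far more than the hard sample's.
print(focal_loss(easy, 0) / cross_entropy(easy, 0))
print(focal_loss(hard, 0) / cross_entropy(hard, 0))
```

    Because well-classified samples contribute little, training gradients are dominated by hard samples, which tend to include the rarer pitch types.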

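    Abstract (Chinese)
    Extended Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Research Background and Motivation
        1.2 Research Objectives
        1.3 Expected Contributions
        1.4 Thesis Organization
    Chapter 2 Literature Review
        2.1 Baseball Pitching and Pitch Type Classification
        2.2 Evolution of Commercial Pitch Tracking Technology
        2.3 Core Detection and Tracking Techniques
        2.4 Related Work on Pitch Trajectory Feature Extraction and Pitch Type Classification
        2.5 Chapter Summary
    Chapter 3 System Design and Methodology
        3.1 System Architecture
        3.2 Data Acquisition and Preprocessing
        3.3 Pitcher Pose Estimation and Baseball Object Detection
        3.4 Kalman Filter Trajectory Reconstruction
        3.5 Pitch Flight Trajectory Feature Extraction
        3.6 Classification Model Selection and Evaluation Metrics
        3.7 Visualization and System Output
        3.8 Chapter Summary
    Chapter 4 Experimental Results and Analysis
        4.1 Experimental Design and Environment Setup
        4.2 Performance Evaluation of Baseball Object Detection
        4.3 Performance Analysis of Each Pitch Type Classification Model
        4.4 System Integration Analysis and Real-Time Visualization Results
        4.5 Independent Validation Experiment on Test Pitchers
        4.6 Validation Experiment on Pitchers with Similar Pitching Styles
        4.7 Comparison with Existing Pitch Type Classification Studies
    Chapter 5 Conclusions and Future Work
        5.1 Conclusions
        5.2 Limitations and Future Work
    References
    Appendices
        Appendix 1 Pitch Type Statistics by Pitcher
        Appendix 2 Performance Evaluation of Classification Models on Different Pitcher Datasets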


    Full text (off-campus access): publicly available immediately