| Author: | 陳守志 Chen, Shou-Chih |
|---|---|
| Thesis Title: | 基於棒球軌跡特徵之機器學習的投球種類自動分類 (Automatic Pitch Type Classification Based on Baseball Trajectory Features Using Machine Learning) |
| Advisor: | 侯廷偉 Hou, Ting-Wei |
| Degree: | Master |
| Department: | Department of Engineering Science, College of Engineering |
| Year of Publication: | 2026 |
| Graduation Academic Year: | 114 |
| Language: | Chinese |
| Number of Pages: | 58 |
| Chinese Keywords: | 棒球投球分析、球種自動分類、電腦視覺、投手姿態估測、深度學習 |
| English Keywords: | Baseball Pitching Analysis, Pitch Type Classification, Computer Vision, Pitcher Pose Estimation, Deep Learning |
This study designs and implements an automatic baseball pitch type classification system that takes a single game broadcast video as input, aiming to lower the cost barrier of high-end pitch analysis equipment while maintaining classification performance of practical value. The system integrates computer vision and machine learning techniques to automatically extract pitcher pose information and ball flight trajectories from real MLB game footage, and uses them for pitch type recognition and real-time analysis.
In the system architecture, YOLO-family models are first employed for baseball object detection and pitcher pose estimation, and a Kalman filter is used to reconstruct the ball's flight trajectory after release. Statistical and temporal features are then extracted from the reconstructed trajectory to build three classification models, namely Random Forest (RF), Artificial Neural Network (ANN), and Long Short-Term Memory (LSTM), whose classification performance is compared under different feature configurations.
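The abstract does not specify the filter parameters, so the following is only a minimal sketch of how per-frame ball detections could be smoothed into a post-release trajectory with a constant-acceleration Kalman filter in NumPy. The frame rate, noise covariances, state layout, and the `smooth_trajectory` helper are illustrative assumptions, not the thesis's actual implementation.

```python
# Minimal 2D constant-acceleration Kalman filter for smoothing detected
# ball centers after release. All parameters below are illustrative
# assumptions rather than the thesis's settings.
import numpy as np

dt = 1 / 30.0                      # assumed broadcast frame interval (30 fps)
# State vector: [x, y, vx, vy, ax, ay] in image coordinates.
F = np.eye(6)
F[0, 2] = F[1, 3] = dt
F[0, 4] = F[1, 5] = 0.5 * dt ** 2
F[2, 4] = F[3, 5] = dt
H = np.zeros((2, 6))
H[0, 0] = H[1, 1] = 1.0            # only (x, y) positions are observed
Q = np.eye(6) * 1e-2               # process noise (assumed)
R = np.eye(2) * 4.0                # measurement noise in pixels^2 (assumed)

def smooth_trajectory(detections):
    """detections: list of (x, y) ball centers from the detector, one per
    frame after the release point. Returns the filtered state sequence."""
    x = np.zeros(6)
    x[:2] = detections[0]          # initialize at the first detection
    P = np.eye(6) * 100.0          # large initial uncertainty
    states = []
    for z in detections:
        # Predict step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step with the new (x, y) measurement
        z = np.asarray(z, dtype=float)
        y_res = z - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y_res
        P = (np.eye(6) - K @ H) @ P
        states.append(x.copy())
    return np.array(states)
```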
Experimental results show that, when pitch velocity is included as a feature, the ANN model achieves the highest accuracy of 92.2% on the eight-class pitch type classification task, a clear improvement over prior work using broadcast video only (about 89.0%) and a markedly narrowed gap relative to professional radar-based systems (about 96.0%). In addition, the LSTM model attains the highest overall recall of 91.7%, indicating its advantage in capturing the temporal dynamics of pitch trajectories.
To address the class imbalance among pitch types that is common in baseball data, the experiments further compare the effect of different loss functions on model performance. The results show that models trained with Focal Loss outperform the conventional cross-entropy loss in overall accuracy by about 1.6% (90.6% vs. 89.0%), effectively improve the recognition of minority pitch types, and strike a better balance between precision and recall.
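For reference, a minimal multi-class focal loss in the form proposed by Lin et al. [21] is sketched below in PyTorch; the focusing parameter `gamma` and the optional per-class `alpha` weights are the commonly used defaults and are not necessarily the values chosen in this thesis.

```python
# Minimal multi-class focal loss (after Lin et al. [21]); gamma and the
# optional per-class alpha weights are illustrative defaults.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """logits: (N, C) raw class scores; targets: (N,) integer class indices."""
    log_p = F.log_softmax(logits, dim=-1)                       # (N, C)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)   # log prob of true class
    pt = log_pt.exp()
    loss = -((1.0 - pt) ** gamma) * log_pt   # down-weights well-classified examples
    if alpha is not None:                    # optional re-weighting for rare pitch types
        loss = alpha[targets] * loss
    return loss.mean()
```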
Taken together, these results confirm that video from a single broadcast camera alone is sufficient to build a pitch analysis system with high classification performance and near-real-time computation, offering a feasible solution for low-cost baseball analytics and intelligent broadcasting applications.
This study presents an automatic baseball pitch type classification system using a single broadcast video, aiming to reduce the cost of pitch analysis while maintaining practical classification performance. The proposed system integrates computer vision and machine learning techniques to automatically extract pitcher pose information and reconstruct ball flight trajectories from real MLB broadcast footage.
YOLO-based models are employed for baseball detection and pitcher pose estimation, while a Kalman filter is applied to reconstruct the ball trajectory after release. Based on the reconstructed trajectories, statistical and temporal features are extracted and used to train three classification models: Random Forest (RF), Artificial Neural Network (ANN), and Long Short-Term Memory (LSTM).
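The abstract does not enumerate the exact feature set, so the sketch below only illustrates, under that caveat, how a few plausible statistical trajectory features (net horizontal and vertical break, mean velocity and acceleration, path length, optional pitch speed) could be computed from the filtered states and passed to a scikit-learn random forest. The `trajectory_features` helper and every feature choice here are assumptions for illustration.

```python
# Sketch of statistical trajectory features plus a Random Forest classifier
# (scikit-learn). The feature definitions are illustrative assumptions, not
# the thesis's documented feature set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def trajectory_features(states, velocity_mph=None):
    """states: (T, 6) filtered [x, y, vx, vy, ax, ay] per frame."""
    xy = states[:, :2]
    horiz_break = xy[-1, 0] - xy[0, 0]            # net horizontal displacement
    vert_break = xy[-1, 1] - xy[0, 1]             # net vertical displacement
    feats = [
        horiz_break, vert_break,
        states[:, 2].mean(), states[:, 3].mean(),  # mean velocities
        states[:, 4].mean(), states[:, 5].mean(),  # mean accelerations
        np.linalg.norm(np.diff(xy, axis=0), axis=1).sum(),  # path length
    ]
    if velocity_mph is not None:                   # optional broadcast pitch speed
        feats.append(velocity_mph)
    return np.array(feats)

# Usage sketch, assuming X is (n_pitches, n_features) and y holds pitch labels:
# clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
```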
Experimental results show that, with the inclusion of pitch velocity information, the ANN model achieves the highest classification accuracy of 92.2% in an eight-class pitch classification task. Compared with previous broadcast-video-based studies (approximately 89.0% accuracy), the proposed method demonstrates a clear improvement, and the performance gap with professional radar-based systems (approximately 96.0%) is significantly reduced. In addition, the LSTM model achieves the highest overall recall of 91.7%, indicating its effectiveness in modeling temporal trajectory patterns.
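As an illustration of the temporal modeling route, the following is a minimal PyTorch LSTM classifier over per-frame trajectory features. The input size, hidden size, and single-layer design are assumptions, and the `PitchLSTM` name is hypothetical; only the eight-class output corresponds to the eight pitch types reported in the experiments.

```python
# Minimal LSTM classifier over per-frame trajectory features (PyTorch).
# Layer sizes are illustrative assumptions; the 8-way head matches the
# eight pitch types in the reported experiments.
import torch
import torch.nn as nn

class PitchLSTM(nn.Module):
    def __init__(self, input_size=6, hidden_size=64, num_classes=8):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):              # x: (batch, frames, input_size)
        _, (h_n, _) = self.lstm(x)     # h_n: (1, batch, hidden_size)
        return self.head(h_n[-1])      # logits over the 8 pitch types

# Example: a batch of 4 pitches, 30 frames each, 6 features per frame.
logits = PitchLSTM()(torch.randn(4, 30, 6))
```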
Furthermore, to address class imbalance in pitch datasets, this study evaluates different loss functions and shows that Focal Loss improves overall accuracy by approximately 1.6% compared with Cross-Entropy Loss. These results confirm that reliable pitch classification can be achieved using only a single broadcast camera, providing a cost-effective solution for baseball analytics applications.
[1] Trackman, “Trackman baseball.” [Online]. Available: https://www.trackman.com/baseball/. [Accessed: Jan. 2, 2026].
[2] Hawk-Eye Innovations. [Online]. Available: https://www.hawkeyeinnovations.com/. [Accessed: Jan. 2, 2026].
[3] M. Takahashi, M. Fujii, and N. Yagi, “Automatic pitch type recognition from baseball broadcast videos,” in Proc. 10th IEEE Int. Symp. on Multimedia (ISM), 2008, pp. 15–22, doi: 10.1109/ISM.2008.47.
[4] G. S. Fleisig, S. W. Barrentine, N. Zheng, R. F. Escamilla, and J. R. Andrews, “Kinematic and kinetic comparison of baseball pitching among various levels of development,” J. Biomech., vol. 32, no. 12, pp. 1371–1375, Dec. 1999, doi: 10.1016/s0021-9290(99)00127-x.
[5] MLB Advanced Media, L.P., “Statcast,” MLB.com Glossary. [Online]. Available: https://www.mlb.com/glossary/statcast. [Accessed: Jan. 3, 2026].
[6] MLB Advanced Media, L.P., “Baseball savant.” [Online]. Available: https://baseballsavant.mlb.com/. [Accessed: Jan. 2, 2026].
[7] P. Casella, “Introducing Statcast 2020: Hawk-Eye and Google Cloud,” Medium, Jul. 20, 2020. [Online]. Available: https://technology.mlblogs.com/introducing-statcast-2020-hawk-eye-and-google-cloud-a5f5c20321b8. [Accessed: Jan. 2, 2026].
[8] C. T. Lai, “Pitch-by-pitch extraction for broadcast baseball videos,” M.S. thesis, Dept. Comput. Sci. Inf. Eng., Natl. Taiwan Univ. Sci. Technol., Taipei, Taiwan, 2012.
[9] K. Y. Lin, “Pitching type analysis from baseball broadcast videos,” M.S. thesis, Dept. Comput. Sci. Inf. Eng., Natl. Dong Hwa Univ., Hualien, Taiwan, 2011.
[10] O. Patel and I. Mehta, “Predicting MLB pitch outcomes from video data,” Dept. Comput. Sci., Stanford Univ., Stanford, CA, USA, Tech. Rep., 2025.
[11] Ultralytics, “Object detection,” Ultralytics Docs. [Online]. Available: https://docs.ultralytics.com/tasks/detect/. [Accessed: Jan. 2, 2026].
[12] Z. Cao, G. Hidalgo, T. Simon, S. E. Wei, and Y. Sheikh, “OpenPose: Realtime multi-person 2D pose estimation using Part Affinity Fields,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 1, pp. 172–186, Jan. 2021, doi: 10.1109/TPAMI.2019.2929257.
[13] C. Lugaresi et al., “MediaPipe: A framework for building perception pipelines,” arXiv preprint arXiv:1906.08172, Jun. 2019. [Online]. Available: https://doi.org/10.48550/arXiv.1906.08172
[14] Ultralytics, “Pose estimation,” Ultralytics Docs. [Online]. Available: https://docs.ultralytics.com/tasks/pose/. [Accessed: Jan. 2, 2026].
[15] Google, “MediaPipe pose landmarker,” MediaPipe documentation on GitHub. [Online]. Available: https://github.com/google-ai-edge/mediapipe/blob/master/docs/solutions/pose.md. [Accessed: Jan. 30, 2026].
[16] R. E. Kálmán, “A new approach to linear filtering and prediction problems,” J. Basic Eng., vol. 82, no. 1, pp. 35–45, Mar. 1960, doi: 10.1115/1.3662552.
[17] H. S. Chen, H. T. Chen, W. J. Tsai, S. Y. Lee, and J. Y. Yu, “Pitch-by-pitch extraction from single view baseball video sequences,” in Proc. IEEE Int. Conf. Multimedia and Expo (ICME), Jul. 2007, pp. 1423–1426.
[18] L. Breiman, “Random forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32, Oct. 2001, doi: 10.1023/A:1010933404324.
[19] J. Schuh and L. Kong, “Classifying pitch types in baseball using machine learning algorithms,” in Proc. IEEE Asia-Pacific Conf. Comput. Sci. Data Eng. (CSDE), Nadi, Fiji, 2023, pp. 1–6, doi: 10.1109/CSDE59766.2023.10487702.
[20] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997, doi: 10.1162/neco.1997.9.8.1735.
[21] T. Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 318–327, Feb. 2020, doi: 10.1109/TPAMI.2018.2858826.
[22] MLB Advanced Media, L.P., “Terms of use,” MLB.com, Mar. 11, 2025. [Online]. Available: https://www.mlb.com/official-information/terms-of-use. [Accessed: Jan. 30, 2026].
[23] CVAT.ai, “Computer Vision Annotation Tool (CVAT).” [Online]. Available: https://www.cvat.ai/. [Accessed: Jan. 2, 2026].
[24] C. Schwartz and S. Sharpe, “MLB pitch classification,” Medium, Feb. 3, 2020. [Online]. Available: https://technology.mlblogs.com/mlb-pitch-classification-64a1e32ee079. [Accessed: Jan. 8, 2026].