Author: Chen, Yen-Sung (陳彥菘)
Thesis title: A Unified Sport Video Highlights Extraction Framework Based on Artificial Neural Network (ANN) System (以類神經網路為架構的運動影片精采片段擷取之全面性方法)
Advisor: Kuo, Chih-Hung (郭致宏)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2008
Graduation academic year: 96
Language: English
Number of pages: 54
Keywords (English): video retrieval, video indexing, highlight extraction, sport analysis, artificial neural network
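  • In this thesis, we propose a classifier built on an artificial neural network to classify highlights in sports videos. The same set of features is used to train on different types of sports videos in order to extract their highlights. The proposed method does not need to detect objects that commonly appear in highlights, does not need a predefined order of highlight events, and does not need predefined rules for when highlights occur. The proposed system architecture consists of two modules: a training module and an analysis module. The training module is first used to produce the neural network weights for the highlights of each sport type. Once training is complete, the analysis module loads these trained weights to analyze the highlights of each sport and, at the same time, generates a curve showing the distribution of highlight intensity and tempo. The proposed method can be applied to different types of sports videos. It helps users quickly browse the highlights of an entire game without spending excessive time watching the game in full. According to the experimental results, the proposed method agrees with users' perception.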

    This thesis proposes a new highlight extraction method for various types of sports video. The proposed method uses an Artificial Neural Network (ANN) to train a highlight model and uses a unified set of features without predefining any highlight rules for events. The framework is composed of a training mode and an analysis mode. In the training mode, we use the ANN to learn the weights among the different features. In the analysis mode, we use the trained ANN to generate the highlight tempo curve. The proposed method can be applied to different sport types and helps viewers browse highlights directly instead of spending a long time watching full-length recorded games. The experimental results show that the proposed method obtains highlights that match viewers' perception of sports programs.
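    The two modes described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the three per-shot features (shot length, motion, sound energy, all scaled to 0..1), the toy data, the single hidden layer, the learning rate, and the moving-average smoothing used for the tempo curve are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_in, n_hidden, seed=0):
    """Small random weights; the last entry of every row is a bias weight."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w1, w2


def forward(weights, x):
    """Forward pass: returns hidden activations and the highlight score in (0, 1)."""
    w1, w2 = weights
    xb = list(x) + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1]
    out = sigmoid(sum(w * v for w, v in zip(w2, hidden + [1.0])))
    return hidden, out


def train(features, labels, n_hidden=4, epochs=2000, lr=0.5, seed=0):
    """Training mode (assumed form): per-sample backpropagation on squared error."""
    weights = init_weights(len(features[0]), n_hidden, seed)
    w1, w2 = weights
    for _ in range(epochs):
        for x, y in zip(features, labels):
            hidden, out = forward(weights, x)
            xb = list(x) + [1.0]
            hb = hidden + [1.0]
            delta_out = (out - y) * out * (1.0 - out)
            # Hidden deltas are computed before w2 is updated.
            delta_hidden = [delta_out * w2[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                w2[j] -= lr * delta_out * hb[j]
            for j in range(n_hidden):
                for i in range(len(xb)):
                    w1[j][i] -= lr * delta_hidden[j] * xb[i]
    return weights


def analyze(weights, features, window=3):
    """Analysis mode (assumed form): score each shot, then smooth into a curve."""
    scores = [forward(weights, x)[1] for x in features]
    half = window // 2
    curve = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        curve.append(sum(scores[lo:hi]) / (hi - lo))
    return scores, curve


# Hypothetical labelled shots: [shot_length, motion, sound_energy]; 1 = highlight.
train_x = [[0.8, 0.1, 0.2], [0.7, 0.2, 0.1], [0.9, 0.15, 0.25],
           [0.3, 0.9, 0.8], [0.2, 0.8, 0.9], [0.4, 0.85, 0.7]]
train_y = [0, 0, 0, 1, 1, 1]
weights = train(train_x, train_y)

# Hypothetical shots from an unseen game, analyzed with the trained weights.
game_x = [[0.85, 0.1, 0.15], [0.3, 0.9, 0.85], [0.25, 0.8, 0.8], [0.8, 0.2, 0.2]]
scores, curve = analyze(weights, game_x)
print([round(s, 2) for s in scores])
print([round(c, 2) for c in curve])
```

    In this sketch the trained weights are the only thing passed from training to analysis, which mirrors the abstract's description of training once per sport type and then reusing the weights to score new games.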

    Contents

    Chinese Abstract
    Abstract
    Acknowledgement
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Background
      1.2 An Overview of Sports Highlight
      1.3 Motivation
      1.4 Contribution
      1.5 Thesis Structure
    Chapter 2 Related Research
      2.1 An Overview of Related Research
        2.1.1 Object Detection
        2.1.2 Hidden Markov Model
        2.1.3 Dynamic Bayesian Network & Multi-level Semantic Network
      2.2 Transform RGB into YUV
      2.3 Edge Detection Technique
      2.4 Discussion
    Chapter 3 The Proposed Framework
      3.1 An Overview of Proposed Framework
      3.2 Shot Detection
      3.3 Visual and Audio Feature
        3.3.1 Shot Length
        3.3.2 MPEG-7 Color Structure
        3.3.3 Shot Frame Difference
        3.3.4 Shot Motion
        3.3.5 Keyframe Difference and Motion
        3.3.6 Y-Histogram Difference
        3.3.7 Sound Energy
        3.3.8 Sound Zero-crossing Rate
      3.4 ANN Training and Analysis System
        3.4.1 ANN Training Mode
        3.4.2 ANN Analysis Mode
      3.5 Highlights Tempo Generator
    Chapter 4 The Proposed Highlight Shot Classification Method
      4.1 ANN Input Data Structure
      4.2 ANN Training Mode Framework
        4.2.1 Initialize the Training System
        4.2.2 Update the Training System
      4.3 ANN Analysis Mode Framework
    Chapter 5 Experimental Results and Discussion
      5.1 Highlight Extraction in Baseball
      5.2 Highlight Extraction in Basketball
      5.3 Highlight Extraction in Soccer
      5.4 Discussion
    Chapter 6 Conclusion and Future Work
      6.1 Conclusion
      6.2 Future Work
    References

    List of Tables

    Table 5-1 Details of Experimented Sports
    Table 5-2 Highlight Extraction Simulation Result of Baseball Game
    Table 5-3 Highlight Extraction Simulation Result of Basketball Game
    Table 5-4 Highlight Extraction Simulation Result of Soccer Game
    Table 5-5 Highlight Extraction Simulation Result
    Table 5-6 The Verify Environment
    Table 5-7 Performance Comparison with Other Methods

    List of Figures

    Figure 2-1 The System Architecture in [1]
    Figure 2-2 The Example of HMM Model
    Figure 2-3 Dynamic Bayesian Network of Goal-Event Detection
    Figure 2-4 Dynamic Bayesian Network of Corner Kick Event Detection
    Figure 2-5 The Transformation Result of RGB to Y
    Figure 2-6 The Edge Detection Result
    Figure 2-7 Block Diagram of the Video Abstraction System
    Figure 3-1 The Proposed Framework of Highlight Shots Extraction System
    Figure 3-2 A Typical Shot Detection Algorithm
    Figure 3-3 The Event of Mixing Frame During Scene Transition
    Figure 3-4 The MPEG-7 Color Structure Simulation Result
    Figure 3-5 The MPEG-7 Color Structure Descriptor Curve in a Baseball Game
    Figure 3-6 Example of Motion Vector
    Figure 3-7 Method of Motion Estimation
    Figure 4-1 A Home Run Event in a Baseball Game
    Figure 4-2 The Network Architecture of ANN and Input Data Structure
    Figure 4-3 The Forward Pass Describes of ANN Algorithm
    Figure 4-4 The Initialization of Training Data
    Figure 4-5 The Flowchart of ANN Training Mode
    Figure 4-6 The Flowchart of ANN Analysis Mode
    Figure 4-7 A Example of Highlights Tempo Generator
    Figure 5-1 The GUI of Initial System
    Figure 5-2 The GUI of Eigenvalue Extraction
    Figure 5-3 The GUI of Eigenvalue Analysis
    Figure 5-4 The GUI of Shot Player
    Figure 5-5 The GUI of Input Training Data
    Figure 5-6 The GUI of ANN Training System
    Figure 5-7 The GUI of ANN Weights Analysis
    Figure 5-8 The GUI of Highlight Tempo Generator
    Figure 5-9 Highlight Curve of Baseball
    Figure 5-10 Highlight Curve of Basketball
    Figure 5-11 Highlight Curve of Soccer

    References

    [1] Z. Xiong, R. Radhakrishnan, A. Divakaran, and T. S. Huang, “Highlights extraction from sports video based on an audio-visual marker detection framework,” in Proc. IEEE ICME, July 2005, pp. 29–32.
    [2] X. Tong, L. Duan, H. Lu, C. Xu, Q. Tian and J. S. Jin, “A mid-level visual concept generation framework for sports analysis,” in Proc. IEEE ICME, July 2005, pp. 646–649.
    [3] A. Hanjalic, “Multimodal approach to measuring excitement in video,” in Proc. IEEE ICME, July 2003, pp. 289–292.
    [4] A. Hanjalic, “Generic approach to highlights extraction from a sport video,” in Proc. IEEE ICIP, Sept. 2003, pp. I - 1–4.
    [5] L. Y. Duan, M. Xu, T. S. Chua, Q. Tian, and C. S. Xu, “A mid-level representation framework for semantic sports video analysis,” in Proc. ACM Multimedia, Nov. 2003, pp. 33–44.
    [6] Y. L. Chang, W. Zeng, I. Kamel, and R. Alonso, “Integrated image and speech analysis for content-based video indexing,” in Proc. IEEE ICMCS, May 1996, pp. 306–313.
    [7] K. Wan and C. Xu, “Efficient multimodal features for automatic soccer highlight generation,” in Proc. IEEE ICPR, Aug. 2004, pp. 973–976.
    [8] Q. Huang, J. Hu, W. Hu, T. Wang, H. Bai and Y. Zhang, “A reliable logo and replay detector for sports video,” in Proc. IEEE ICME, July 2007, pp. 1695–1698.
    [9] J. Assfalg, M. Bertini, A. Del Bimbo, W. Nunziati and P. Pala, “Soccer highlights detection and recognition using HMMs,” in Proc. IEEE ICME, Aug. 2002, pp. 825–828.
    [10] G. Xu, Y. F. Ma, H. J. Zhang and S. Yang, “A HMM based semantic analysis framework for sports game event detection,” in Proc. IEEE ICIP, Sept. 2003, pp. I - 25–8.
    [11] J. Wang, C. Xu, E. Chng and Q. Tian, “Sports highlight detection from keyword sequences using HMM,” in Proc. IEEE ICME, June 2004, pp. 599–602.
    [12] P. Chang, M. Han and Y. Gong, “Extract highlights from baseball game video with hidden Markov models,” in Proc. IEEE ICIP, Sept. 2002, pp. 609–612.
    [13] N. H. Bach, K. Shinoda and S. Furui, “Robust highlight extraction using multi-stream hidden Markov models for baseball video,” in Proc. IEEE ICIP, Sept. 2005, pp. III - 173–6.
    [14] Z. Xiong, R. Radhakrishnan, A. Divakaran and T. S. Huang, “Audio events detection based highlights extraction from baseball, golf and soccer games in a unified framework,” in Proc. IEEE ICME, July 2003, pp. III - 401–4.
    [15] B. Zhang, W. Chen, W. Dou, Y. J. Zhang and L. Chen, “Content-based table tennis games highlight detection utilizing audiovisual clues,” in Proc. IEEE ICIG, Aug. 2007, pp. 833–838.
    [16] C. C. Cheng and C. T. Hsu, “Fusion of audio and motion information on HMM-based highlight extraction for baseball games,” IEEE Trans. Multimedia, pp. 585–599, June 2006.
    [17] C. Y. Chao, H. C. Shih and C. L. Huang, “Semantics-based highlight extraction of soccer program using DBN,” in Proc. IEEE ICASSP, March 2005, pp. ii/1057–ii/1060.
    [18] F. Wang, Y. F. Ma, H. J. Zhang and J. T. Li, “Dynamic Bayesian network based event detection for soccer highlight extraction,” in Proc. IEEE ICIP, Oct. 2004, pp. 633–636.
    [19] H. C. Shih and C. L. Huang, “Detection of the highlights in baseball video program,” in Proc. IEEE ICME, June 2004, pp. 595–598.
    [20] H. C. Shih and C. L. Huang, “MSN: statistical understanding of broadcasted baseball video using multi-level semantic network,” IEEE Trans. Broadcasting, pp. 449–459, Dec. 2005.
    [21] L. C. Chang, Y. S. Chen, R. W. Liou, C. H. Kuo, C. H. Yeh and B. D. Liu, “A real time and low cost hardware architecture for video abstraction system,” in Proc. IEEE ISCAS, May 2007, pp. 773–776.
    [22] MPEG-7 Requirements Document V.18, ISO/IEC JTC1/SC29/WG11/N6881, January 2005.
    [23] MPEG-7 Overview (version 10), ISO/IEC JTC1/SC29/WG11, October 2004.
    [24] C. H. Kuo, M. Shen and C.-C. Jay Kuo, “Fast motion search with efficient inter-prediction mode decision for H.264,” Journal of Visual Communication and Image Representation, pp. 217–242, 2006.
    [25] Iain E. G. Richardson, H.264 and MPEG-4 Video Compression, Wiley, 2003.
    [26] X. Jing and L. P. Chau, “An efficient three-step search algorithm for block motion estimation,” IEEE Transactions on Multimedia, vol. 6, pp. 435–438, June 2004.
    [27] R. Li, B. Zeng and M. L. Liou, “A new three-step search algorithm for block motion estimation,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, pp. 438–442, Aug. 1994.
    [28] J. B. Li and Y. K. Chung, “A novel back-propagation neural network training algorithm designed by an ant colony optimization,” in Proc. IEEE/PES TDC, 2005, pp. 1–5.
    [29] S. J. Li, Y. Li, Y. Liu, Z. G. Liu and J. Tang, “Hybrid method of BPN and genetic algorithm for completion time prediction,” in Proc. IEEE ICMLC, Aug. 2005, pp. 4625–4630.
    [30] S.-C. Chen, S.-W. Lin, T.-Y. Tseng and H.-C. Lin, “Optimization of back-propagation network using simulated annealing approach,” in Proc. IEEE ICSMC, Oct. 2006, pp. 2819–2824.
    [31] D. Jinhui, L. Duan, T. Xiaofeng, C. Xu, Q. Tian, L. Hanqing and J. S. Jin, “Replay scene classification in soccer video using web broadcast text,” in Proc. IEEE ICME, July 2005, pp. 1098–1101.

    Full-text availability: on campus, public from 2010-08-19
    off campus, public from 2010-08-19