National Cheng Kung University Electronic Theses and Dissertations System


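Graduate student:	劉榮文 Liou, Rung-Wen
Thesis title:	應用在多媒體影像精采片段擷取之全面性方法 A Comprehensive Approach for Extracting Highlights from Video Media
Advisors:	葉家宏 Yeh, Chia-Hung; 郭致宏 Kuo, Chih-Hung
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2006
Academic year of graduation:	94 (ROC calendar)
Language:	English
Number of pages:	68
Keywords (Chinese):	精采片段 (highlights), 索引 (index), 擷取 (extraction)
Keywords (English):	extraction, index, video content analysis, highlights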

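In this thesis, we propose a comprehensive method for extracting highlights from video media. Existing highlight extraction methods must first observe a specific kind of segment in detail, use those observations to train a model for that kind of segment, and then use the model to extract the corresponding highlights. Our method instead directly combines visual and audio features to analyze the input video sequence and, through the decision mechanism in our system, automatically generates visual and audio tempo curves. These tempo curves are produced by a time-domain unit in the system, and they connect various low-level semantics to the high-level semantics of the story intensity of the input video sequence. According to the experimental results, the extracted highlights agree with human perception. The proposed algorithm is well suited to indexing, searching, and browsing multimedia data.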

In this thesis, we propose a comprehensive method for extracting highlights from video media. Unlike most current approaches, which require domain-specific knowledge or models to extract specific events, our proposed method directly combines visual and audio features to analyze input video programs and automatically generates a video tempo curve and an audio tempo curve through decision rules. The tempo curves are generated by a time-domain mechanism. These tempo curves bridge the semantic gap to capture the high-level semantics called story intensity. The proposed algorithm provides a natural way, via tempo, to segment a video into manageable parts. Furthermore, our proposed method can cover different kinds of sports and movies. According to the experimental results, the detected highlight clips match human perception, and the proposed algorithm is very useful for indexing, searching, and browsing multimedia data.
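The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the record does not give the decision rules, so the min-max normalization, the weighted-sum fusion, the `video_weight` parameter, the threshold, and every function name below are assumptions made for illustration only. Short-time energy and zero-crossing rate are the two audio features named in the table of contents, computed here with their standard textbook definitions.

```python
from typing import List, Sequence, Tuple


def short_time_energy(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def zero_crossing_rate(samples: Sequence[float], frame_len: int) -> List[float]:
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    if frame_len < 2:
        raise ValueError("frame_len must be at least 2")
    rates = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        rates.append(crossings / (frame_len - 1))
    return rates


def normalize(curve: Sequence[float]) -> List[float]:
    """Rescale a curve to [0, 1]; a flat curve maps to all zeros."""
    lo, hi = min(curve), max(curve)
    if hi == lo:
        return [0.0] * len(curve)
    return [(v - lo) / (hi - lo) for v in curve]


def fuse_tempo(video_tempo: Sequence[float], audio_tempo: Sequence[float],
               video_weight: float = 0.5) -> List[float]:
    """Assumed fusion rule: weighted sum of the two normalized tempo curves."""
    if len(video_tempo) != len(audio_tempo):
        raise ValueError("tempo curves must have the same length")
    v, a = normalize(video_tempo), normalize(audio_tempo)
    return [video_weight * x + (1.0 - video_weight) * y for x, y in zip(v, a)]


def highlight_segments(intensity: Sequence[float],
                       threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, where intensity >= threshold."""
    segments, start = [], None
    for i, value in enumerate(intensity):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(intensity)))
    return segments
```

In this sketch a per-shot video tempo (for example, derived from motion magnitude) and a per-shot audio tempo (for example, derived from the energy curve) would be passed to `fuse_tempo`, and `highlight_segments` would then return the runs of shots whose fused intensity exceeds the chosen threshold. How the thesis actually maps features to tempo and sets its thresholds is not stated in this record.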

Chinese Abstract
Abstract
Acknowledgement
List of Tables
List of Figures
Chapter 1 Introduction
1 Motivation
2 An Overview of Film Grammar
3 Story Intensity Representation
4 Highlights of the Movies
5 Highlights of the Sports
6 Thesis Structure
Chapter 2 Related Research
1 An Overview of Related Research
1.1 Simultaneous or Sequential Fusion
1.2 Statistical or Knowledge-based Fusion
2 Visual Feature
2.1 Motion Estimation
2.2 Shot Change Detection
2.3 Transform RGB into HSV
2.4 Transform RGB into YUV
2.5 Edge Detection
3 Audio Feature
3.1 Energy
3.2 Zero Crossing Rate
Chapter 3 The Proposed Approach
1 An Overview of Proposed Algorithm
2 Proposed Algorithm
2.1 Preprocessing
2.1.1 Shot Detection
2.1.2 Keyframe Selection
2.1.3 Conclusion
2.2 Low-Level Feature Extraction
2.2.1 Histogram Calculator
2.2.2 Motion Vector Calculator
2.2.3 Zero-Crossing Rate Calculator
2.2.4 Energy Calculator
2.2.5 Video threshold determining unit
2.2.6 Audio threshold determining unit
2.2.7 Conclusion
2.3 Audiovisual tempo analysis
2.3.1 Video tempo generator
2.3.2 Audio tempo generator
2.4 Multimodal Data Fusion
2.4.1 Story tempo generator
2.4.2 Conclusion
Chapter 4 Experimental Results and Discussion
1 Movies
1.1 Banlieue 13
1.2 Kung Fu Hustle
1.3 Discussion
2 Sports
2.1 Basketball
2.2 Football
2.3 Soccer
2.4 Discussion
Chapter 5 Conclusions and Future Work
1 Conclusions
2 Future Work
References

                                    


On campus: publicly available from 2007-08-18
Off campus: publicly available from 2008-08-18
