| Graduate student: | 李建興 Lee, Chien-Hsing |
|---|---|
| Thesis title: | 採用MPEG-7動作活動性描述子之視訊索引與擷取 Video Indexing and Retrieval Using MPEG-7 Motion Activity Descriptors |
| Advisor: | 何裕琨 Ho, Yu-Kun |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2005 |
| Academic year of graduation: | 93 (ROC calendar) |
| Language: | Chinese |
| Pages: | 63 |
| Keywords (Chinese): | 動作向量, 視訊索引, 動作描述子 |
| Keywords (English): | Motion Vector, Video Indexing, Motion Descriptor |
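In recent years, multimedia video content has grown rapidly, and whether this content can be managed effectively is a question worth examining. The MPEG-7 standard provides standard tools and an effective means of managing it. For a video retrieval system, shot segmentation and key frame extraction are indispensable. Traditionally, most video retrieval techniques operate on the uncompressed video, computing pixel differences or color histograms. Because most multimedia video today is transmitted in compressed form, however, performing shot segmentation and key frame extraction directly on compressed-domain data is a topic worth studying in content-based video retrieval.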
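For shot segmentation, this thesis uses the distribution of macroblock counts in B frames, as defined in the video compression standard, as the main basis for segmentation. The key observation is that at a shot transition, the number of bidirectionally predicted macroblocks differs greatly between the corresponding consecutive B frames. The difference in bidirectionally predicted macroblock counts is therefore compared against a preset threshold to detect shot boundaries and so segment the video into shots.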
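For key frame extraction, this thesis uses the temporal and spatial MPEG-7 motion activity descriptors as the basis. Together, these descriptors express the intensity of motion in a frame over time and the spatial distribution of motion activity. The temporal activity descriptor is obtained by computing the standard deviation of the motion vectors of all macroblocks in a P frame and quantizing it into five intensity levels. The spatial activity descriptor is obtained from the quantized motion vectors of the P frame by averaging over the central region of the frame. The main motivation is to extract, as the key frame, a frame with high motion intensity and the strongest motion tendency at its center.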
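Experimental results show that shot boundary detection and key frame extraction are fast because they require little computation, and that boundary detection accuracy is high for abrupt cuts. For key frame extraction, the goal is to select the frame with the greatest activity and the strongest motion of the object at the frame center, so the selected key frames are more representative than those obtained by taking the first or the middle frame of a shot.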
Recently, the rapid growth of multimedia video content has made managing it an important problem. MPEG-7 offers standard tools and an efficient way to do so. Traditionally, most video indexing techniques compute pixel differences or color histograms on the uncompressed video. However, most video today is transmitted in compressed form. Using compressed-domain data for shot segmentation and key frame retrieval is therefore a worthwhile topic in content-based video indexing.
For shot segmentation, this thesis uses the distribution of macroblock types in B frames, as defined by the video compression standard, as the basis for segmentation. The main finding is that at a shot change, the number of bidirectionally predicted macroblocks differs greatly between consecutive B frames. Shot boundaries are therefore found by comparing the difference in the number of bidirectionally predicted macroblocks against a preset threshold.
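The thresholding rule described above can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: it assumes the per-B-frame counts of bidirectionally predicted macroblocks have already been parsed from the bitstream, and the function name `detect_shot_boundaries`, the sample counts, and the threshold value are all hypothetical.

```python
from typing import List, Sequence


def detect_shot_boundaries(bidir_mb_counts: Sequence[int], threshold: int) -> List[int]:
    """Flag B frames whose bidirectional macroblock count jumps.

    bidir_mb_counts[i] is the number of bidirectionally predicted
    macroblocks in the i-th B frame (assumed to be extracted elsewhere).
    Index i is reported when the count differs from that of the previous
    B frame by more than `threshold`.
    """
    boundaries = []
    for i in range(1, len(bidir_mb_counts)):
        difference = abs(bidir_mb_counts[i] - bidir_mb_counts[i - 1])
        if difference > threshold:
            boundaries.append(i)
    return boundaries


# Hypothetical counts: the bidirectional count collapses at B frame 3
# and recovers at B frame 5.
counts = [300, 310, 295, 20, 15, 305, 300]
print(detect_shot_boundaries(counts, threshold=150))  # [3, 5]
```

Note that this simple rule reports both the drop and the recovery around a single cut. A practical detector would presumably merge such neighboring candidates, and the threshold would have to be tuned to the frame size and encoder settings.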
For key frame retrieval, this research relies on the temporal and spatial MPEG-7 motion activity descriptors. Combined, the descriptors capture the temporal intensity of motion in a frame and the spatial distribution of activity. The temporal activity descriptor is the standard deviation of the motion vectors of all macroblocks in a P frame, quantized into five intensity levels. The spatial descriptor is obtained from the P frame's motion vectors by averaging over the central area. The main motivation is to choose as the key frame a frame with high motion intensity and strong motion in the central area.
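As a rough sketch of the two descriptors and the selection step, the code below computes an intensity level from the standard deviation of motion vector magnitudes and a center-region average for each P frame, then picks the frame that ranks highest on both. Several parts are assumptions made only for illustration: the quantization thresholds in `THRESHOLDS` are placeholders, not values from the thesis; the "center" is taken to be the middle half of the macroblock grid; raw magnitudes are averaged rather than quantized vectors; and ranking by intensity level first and center activity second is one plausible way to combine the descriptors.

```python
import math
from statistics import pstdev
from typing import List, Sequence, Tuple

# One (dx, dy) motion vector per macroblock, arranged as rows x columns.
MotionField = List[List[Tuple[float, float]]]

# Hypothetical boundaries between the five intensity levels.
THRESHOLDS = (4.0, 11.0, 17.0, 32.0)


def magnitudes(field: MotionField) -> List[List[float]]:
    """Motion vector magnitude for every macroblock."""
    return [[math.hypot(dx, dy) for dx, dy in row] for row in field]


def intensity_level(field: MotionField, thresholds=THRESHOLDS) -> int:
    """Quantize the standard deviation of magnitudes into levels 1..5."""
    flat = [m for row in magnitudes(field) for m in row]
    sigma = pstdev(flat)
    return 1 + sum(sigma >= t for t in thresholds)


def center_activity(field: MotionField) -> float:
    """Mean magnitude over the middle half of the macroblock grid."""
    mags = magnitudes(field)
    rows, cols = len(mags), len(mags[0])
    r0, r1 = rows // 4, rows - rows // 4
    c0, c1 = cols // 4, cols - cols // 4
    center = [mags[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(center) / len(center)


def pick_key_frame(fields: Sequence[MotionField]) -> int:
    """Index of the P frame with the highest (level, center activity)."""
    return max(
        range(len(fields)),
        key=lambda i: (intensity_level(fields[i]), center_activity(fields[i])),
    )


# Hypothetical 4x4 motion fields for three P frames of one shot.
still = [[(0.0, 0.0)] * 4 for _ in range(4)]
border = [[(30.0, 0.0)] * 4 for _ in range(4)]
for r in (1, 2):
    for c in (1, 2):
        border[r][c] = (0.0, 0.0)  # motion only on the outer ring
center = [[(0.0, 0.0)] * 4 for _ in range(4)]
for r in (1, 2):
    for c in (1, 2):
        center[r][c] = (30.0, 0.0)  # motion only in the middle

print(pick_key_frame([still, border, center]))  # 2
```

In this example the second and third frames have the same intensity level, so the center-region average breaks the tie in favor of the frame whose motion is concentrated in the middle.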
The results show that shot boundary detection and key frame retrieval are fast because they require little computation, and that the accuracy of boundary detection is high for hard cuts. For key frame retrieval, the main goal is to select the frame with maximum activity and maximum motion of the central object. As a result, the selected key frames are more representative than those obtained by taking the first or middle frame of a shot.