| Graduate student: | 麥世昕 Mai, Shi-xin |
|---|---|
| Thesis title: | 影片檢索中應用文字與畫面推論作概念物件切割之研究 Concept-based Object Segmentation for Video Retrieval Using Text-Vision Inference |
| Advisor: | 吳宗憲 Wu, Chung-Hsien |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2009 |
| Academic year of graduation: | 97 |
| Language: | Chinese |
| Number of pages: | 52 |
| Chinese keywords: | 影片檢索, 基因演算法, 概念偵測器, 概念性物件切割, 影片概念 |
| English keywords: | Concept detector, Concept-based object segmentation, Genetic Algorithm, Video retrieval, Video concept |
In video retrieval, the text terms and image features of a video are important cues for understanding its content. In recent years, researchers have sought higher-level video concepts to understand video content and to evaluate the similarity between videos. However, most current work on video concept detection applies concept detectors directly and does not remove image background noise, which makes concept detection less accurate. This study therefore uses news text terms and news images to infer the concepts in a news image, and finds concept-based objects through image segmentation, with the aim of reducing the effect of background noise and improving video retrieval accuracy.
The thesis has four main research objectives: 1.) perform concept-based object segmentation using text terms and image inference, which includes providing clues for image segmentation through text extension, finding concept-based objects through image segmentation, and adding clues for image segmentation through concept extension; 2.) in text extension, collect neighboring terms in the news to provide more relevant clues for image segmentation; 3.) in text extension, transform the clues into concepts by ontology mapping; 4.) in image segmentation, use a Genetic Algorithm to search for the best segmentation of the frame.
In the experiments, we evaluate image background noise, concept detector accuracy, image segmentation results, text and concept extension, and video retrieval. The experimental results show that the proposed method reduces the interference of image background noise in concept detection and increases video retrieval accuracy.
On-campus access: available from 2029-07-31