| Graduate student: | Lin, Yu-Ming (林育民) |
|---|---|
| Thesis title: | Web Image Annotation Using Visual Feature and Textual Information (結合影像特徵與本文資訊之網頁影像註解) |
| Advisor: | Tseng, Vincent S.M. (曾新穆) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2006 |
| Academic year of graduation: | 94 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 60 |
| Keywords (Chinese): | 索引、搜尋、全球資訊網、影像擷取、影像註解、網頁影像 |
| Keywords (English): | indexing, Web image, search, World Wide Web, image retrieval, image annotation |
In recent years, digital devices such as digital cameras, camcorders, and camera-equipped mobile phones have become ubiquitous, and together with the spread of the Internet they have caused the volume of images on the World Wide Web to grow explosively. For users to find target images quickly and accurately among them through search engines and similar platforms, a system needs an effective method for annotating and indexing images. Many earlier studies analyzed the visual features or the textual information of web images, but approaches that combine the two still face several problems: building the model may require a large amount of annotated training data, the annotation vocabulary must be defined in advance, or prediction quality is poor. Web image annotation is therefore a challenging research topic. In this thesis, we propose a new method for web image annotation that combines visual features with textual information. The method, named FMD (Fusion of MMG and Decision-tree), has three main phases: 1) construct the visual feature model, ModelMMG; 2) construct the textual information model, ModelC4.5; and 3) fuse the two models to establish automatic annotation rules. Experimental results show that the approach is feasible and achieves good accuracy in terms of precision, recall, and F measure.
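The abstract reports results in terms of precision, recall, and F measure. As a minimal sketch of how these are commonly computed for one image's predicted keywords, assuming the balanced F1 form of the F measure (the abstract does not say which variant the thesis uses), and with a hypothetical function name and made-up example keywords:

```python
def annotation_scores(predicted, truth):
    """Precision, recall, and F measure for one image's keyword sets.

    Illustrative only: the thesis may aggregate over images or keywords
    differently, and may weight the F measure differently.
    """
    predicted, truth = set(predicted), set(truth)
    hits = len(predicted & truth)  # keywords that are both predicted and correct

    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    # Balanced F measure (harmonic mean); defined as 0 when nothing matches.
    f_measure = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_measure


# Hypothetical example: 3 predicted keywords, 4 ground-truth keywords, 2 in common.
p, r, f = annotation_scores(["beach", "sea", "car"],
                            ["beach", "sea", "sky", "sand"])
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Precision penalizes wrong keywords, recall penalizes missed ones, and the F measure balances the two into a single number for comparing annotation methods.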