| Field | Value |
|---|---|
| Graduate student | 葉力維 (Yeh, Li-Wei) |
| Thesis title | 擬人化之影像概念描述研究 (On Personified Image Conceptualization: A Preliminary Research) |
| Advisor | 陳裕民 (Chen, Yu-Min) |
| Degree | Master |
| Department | Institute of Manufacturing Information and Systems, College of Electrical Engineering and Computer Science |
| Year of publication | 2013 |
| Graduation academic year | 101 |
| Language | Chinese |
| Pages | 57 |
| Chinese keywords | 影像註解 (image annotation), 擬人化 (anthropomorphism), 影像概念描述 (image concept description) |
| Foreign keywords | image annotation, anthropomorphic, image concept description |
Like text, an image is created, photographed, or edited by an author, and thus necessarily embeds a message the author intends to convey. Most prior work has relied on image annotation to capture the author's intended meaning, but shortcomings remain: the object labels that image annotation produces cannot fully describe an image's meaning. This study therefore proposes a "personified image conceptualization" method that simulates the human reasoning process for interpreting an image: objects are labeled first, the scene is judged from the relations among the objects, the event in the image is then inferred from the objects and the scene, and finally the concept the image conveys is described in a simple sentence. Because everyday images are predominantly life photos, this study tested the method on life-photo images. The experimental results show that, as judged by human subjects, the descriptive sentences reached an accuracy of 87.3%, demonstrating that most of the generated descriptions are acceptable to people.
Images, like writings, are created, photographed, or edited by authors and may therefore carry the authors' intent. Previous research has used image annotation to indicate the concepts of an image; however, the object labels that image annotation produces cannot describe an image's meaning completely. This study proposes a method for image conceptualization that mirrors the way humans interpret images. First, objects are annotated and a scene is classified according to the relations between objects. An event is then inferred from the objects and the scene identified in the previous step. Finally, the concept of the image is described with a simple sentence. Because most everyday images are life photos, the image-conceptualization experiments in this study focused on life photos. In the experiments, human evaluators judged the generated image-concept sentences to be accurate 87.3% of the time, which supports the applicability of the proposed image conceptualization method.
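The four-stage pipeline the abstract describes (object annotation → scene classification → event inference → sentence generation) can be sketched in miniature as below. This is a hypothetical illustration only: the labels, the rule tables (`SCENE_RULES`, `EVENT_RULES`), and all function names are assumptions for demonstration, not the thesis's actual implementation, which builds on trained classifiers rather than hand-written rules.

```python
# Hypothetical sketch of the described pipeline; all names and rules are
# illustrative assumptions, not the thesis's real components.

# Stage 1: object annotation. A lookup table of precomputed labels stands in
# for a real object detector/annotator.
def annotate_objects(image_id):
    labels = {"photo_001": ["person", "ball", "grass"]}
    return labels.get(image_id, [])

# Stage 2: scene classification from co-occurring objects (relations between
# objects reduced here to simple co-occurrence rules).
SCENE_RULES = {
    frozenset(["ball", "grass"]): "playground",
    frozenset(["cake", "candle"]): "birthday party",
}

def classify_scene(objects):
    for required, scene in SCENE_RULES.items():
        if required <= set(objects):
            return scene
    return "unknown scene"

# Stage 3: event inference from the (object, scene) combination.
EVENT_RULES = {
    ("person", "playground"): "playing ball",
    ("person", "birthday party"): "celebrating a birthday",
}

def infer_event(objects, scene):
    for obj in objects:
        if (obj, scene) in EVENT_RULES:
            return EVENT_RULES[(obj, scene)]
    return "being present"

# Stage 4: describe the inferred concept with a simple template sentence.
def describe(image_id):
    objects = annotate_objects(image_id)
    if not objects:
        return "No objects were recognized."
    scene = classify_scene(objects)
    event = infer_event(objects, scene)
    return f"A {objects[0]} is {event} in the {scene}."

print(describe("photo_001"))  # -> A person is playing ball in the playground.
```

The sketch shows only the data flow between the four stages; in the thesis each stage is a learned model (e.g. SVM-based annotation and scene classification) rather than a fixed rule table.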