| Graduate student: | 吳敏慈 Wu, Min-Tzu |
|---|---|
| Thesis title: | 基於深度類神經網路之照片構圖分析 Photo Composition Analysis Based on Deep Neural Networks |
| Advisor: | 胡敏君 Hu, Min-Chun |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2018 |
| Academic year of graduation: | 105 |
| Language: | English |
| Number of pages: | 26 |
| Keywords (Chinese): | 照片構圖分析、卷積神經網絡、視覺藝術 |
| Keywords (English): | Photo Composition Analysis, Convolutional Neural Network, Visual Art |
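To take better photos, beginners in photography usually learn a few basic composition rules. However, to the best of our knowledge, no photography learning tool has been designed specifically for composition. In this thesis, we therefore develop a method that recognizes whether a photo exhibits any of the 12 basic composition rules we selected. Notably, some of these 12 rules have not been examined in previous studies. Our method uses deep neural networks to extract high-level semantic features from a photo, which in turn help determine its composition. To train our networks effectively, we built a composition dataset of photos collected from three well-known photography websites (DPChallenge, Flickr, and Unsplash); the photos were then labelled with the composition rules they contain through the crowdsourcing platform Amazon Mechanical Turk (AMT). The experimental results demonstrate the feasibility of our method and show its potential for helping photography beginners learn composition.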
Learning basic photo composition rules is a fundamental step for photography beginners who want to take better photos. However, no tools have been developed to help beginners analyze the composition rules in a given photo. In this study, we therefore developed a method that identifies 12 common composition rules in a photo. Notably, some of these 12 rules have not been considered in previous studies, which is what makes this study significant. Specifically, we used deep neural networks (DNNs) to extract high-level semantic features that facilitate the analysis of photo composition rules. To train the DNN models, we constructed a dataset of photos collected from well-known photo websites such as DPChallenge, Flickr, and Unsplash. All collected photos were then labelled with the 12 composition rules by a wide range of raters recruited from Amazon Mechanical Turk (AMT). Three DNN architectures (AlexNet, GoogLeNet, and ResNet) were employed to predict the composition of the photos in the collected dataset. The representative features of each composition rule were further visualized in our system. The results show the feasibility of the proposed method and its potential to help users improve their photographic skills.
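The abstract says each photo was labelled with the composition rules it contains by several AMT raters, but it does not say how the individual ratings were combined. The following is a minimal sketch of one common option, a majority vote that produces a 12-dimensional multi-hot training target. The rule names (`rule_1` to `rule_12`), the function `aggregate_votes`, and the `min_agreement` threshold are all hypothetical and are not taken from the thesis.

```python
from collections import Counter

# Placeholder names: the abstract mentions 12 rules but does not list them.
RULES = [f"rule_{i}" for i in range(1, 13)]


def aggregate_votes(rater_labels, min_agreement=0.5):
    """Combine per-rater label sets into one multi-hot vector over RULES.

    rater_labels: list of sets, one per rater, each holding the rule names
    that the rater marked as present in the photo.
    A rule is kept when strictly more than `min_agreement` of the raters
    selected it (an assumed aggregation policy, not the thesis's).
    """
    if not rater_labels:
        raise ValueError("at least one rater is required")
    # Count how many raters selected each rule.
    counts = Counter(rule for labels in rater_labels for rule in set(labels))
    n_raters = len(rater_labels)
    return [1 if counts[rule] / n_raters > min_agreement else 0 for rule in RULES]


# Three hypothetical raters labelling the same photo.
votes = [{"rule_1", "rule_3"}, {"rule_1"}, {"rule_1", "rule_3", "rule_7"}]
target = aggregate_votes(votes)
```

Here `rule_1` (3 of 3 raters) and `rule_3` (2 of 3) pass the threshold, while `rule_7` (1 of 3) does not. A vector of this form could serve as the target for a network with 12 outputs, one per rule, if recognition is treated as a multi-label problem.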