
Graduate Student: Kuo, Chun-Yi (郭俊毅)
Thesis Title: A Cross-Media Playing System Using Emotional Associations (運用情緒關聯之跨媒體播放系統)
Advisor: Tseng, Shin-Mu (曾新穆)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2013
Academic Year of Graduation: 101 (ROC calendar)
Language: English
Number of Pages: 57
Chinese Keywords: audiovisual playing system (影音展示系統), cross-media browsing system (跨媒體瀏覽系統), emotional content analysis (情緒內容分析), emotion-aware alignment (情緒感知校對)
English Keywords: Audiovisual playing system, cross-media browser, affective content analysis, emotion-aware alignment
Chinese Abstract (translated):

    With the prevalence of cameras and smartphones, recording everyday life with photos has become effortless. At the same time, a growing number of web services, such as Flickr, offer photo uploading and sharing, and as hardware prices fall, browsing photos on devices such as the iPad or digital photo frames has become a trend. This raises an interesting question: how can we provide users with a better photo-browsing atmosphere? Indeed, this will be a hot topic for the next generation of photo browsing systems.
    Most photo browsing software offers users only a very direct way to browse photos along the visual dimension alone, while another important sensory channel, hearing, is often neglected. Although many audiovisual editing tools are available, manually producing a photo slideshow still takes considerable time. This motivates us to propose a novel cross-media playing system: given a photo collection to be played, the system automatically finds songs that emotionally match the photos and plays them together, enriching the viewing experience. The main techniques proposed in this thesis are: (1) a photo emotion-aware technique based on the entropy of visual patterns; (2) a music emotion-aware technique based on sequential audio patterns; and (3) an efficient, intelligent photo-music alignment technique. Experimental results from both subjective user studies and objective effectiveness measurements show that, compared with existing systems, our cross-media playing system provides a better photo-browsing atmosphere and a better user experience.
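The pattern-entropy idea behind technique (1) can be pictured with a minimal sketch. The function name and the generic pattern labels below are illustrative assumptions, not the thesis's exact feature set; the point is only that a photo whose quantized visual patterns are evenly mixed yields high entropy, while a photo dominated by one pattern yields low entropy:

```python
import math
from collections import Counter

def pattern_entropy(patterns):
    """Shannon entropy (in bits) of a bag of discrete visual patterns.

    `patterns` is any iterable of hashable labels, e.g. quantized
    color/texture codewords extracted from a photo (a hypothetical
    stand-in for the thesis's visual patterns).
    """
    counts = Counter(patterns)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Evenly mixed patterns give maximal entropy; one dominant pattern gives 0.
assert pattern_entropy(["a", "a", "b", "b"]) == 1.0
assert pattern_entropy(["a", "a", "a", "a"]) == 0.0
```

Such a scalar could then feed a learned mapping from visual statistics to emotion classes, which is the role the image emotion learning model plays in the thesis.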

English Abstract:

    Most existing photo playing systems provide only a straightforward interface that lets users browse their photos along a single dimension, visualization, while another important sense, audition, is usually neglected. Unfortunately, this neglect severely diminishes the browsing atmosphere. Although many tools such as video editors are available, creating an audiovisual slideshow still takes considerable time. This motivates us to propose a novel audiovisual playing system that enriches photo navigation by emotionally associating harmonic music with a given photo collection. The major techniques proposed in this work are: (1) a photo emotion-aware technique based on pattern entropy, (2) a music emotion-aware technique based on sequential patterns, and (3) an effective alignment of photos and music based on associations between visual and musical emotions. By integrating these techniques, an affective, interesting, and intelligent cross-media playing system is achieved. Experimental results from subjective user studies and objective evaluations on a real dataset reveal that our proposed system provides users with a better browsing atmosphere than existing ones.
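Once both photos and songs carry emotion estimates, technique (3)'s photo-music alignment can be sketched in a few lines. Here every item is reduced to a hypothetical 2-D valence-arousal point and each photo is greedily paired with the emotionally nearest song; the names, the 2-D representation, and the greedy nearest-neighbor rule are illustrative assumptions rather than the thesis's actual alignment algorithm:

```python
def align_photos_to_music(photos, songs):
    """Pair each photo with the song whose (valence, arousal) point
    lies nearest to the photo's, by squared Euclidean distance.

    `photos` and `songs` map names to hypothetical emotion coordinates.
    """
    def sq_dist(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    return {photo: min(songs, key=lambda s: sq_dist(point, songs[s]))
            for photo, point in photos.items()}

# Hypothetical emotion estimates as (valence, arousal) pairs.
photos = {"sunset.jpg": (0.8, 0.3), "storm.jpg": (-0.6, 0.7)}
songs = {"calm_piano": (0.7, 0.2), "heavy_rock": (-0.5, 0.9)}
assert align_photos_to_music(photos, songs) == {
    "sunset.jpg": "calm_piano", "storm.jpg": "heavy_rock"}
```

In the real system the per-item emotion estimates would come from the learned music and image emotion models described in Chapter 3; greedy nearest-neighbor pairing is only the simplest conceivable alignment rule.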

Table of Contents:
    Chinese Abstract I
    Abstract II
    Acknowledgements III
    List of Figures V
    List of Tables VII
    Chapter 1 Introduction 1
        1.1 Background 1
        1.2 Motivation 3
        1.3 Overview of Our Proposed Method 4
        1.4 Contributions 6
        1.5 Thesis Organization 7
    Chapter 2 Related Work 8
        2.1 Music Visualization 8
        2.2 Photo Gallery Story 9
        2.3 Emotion Analysis of Different Multimedia Data 13
    Chapter 3 Proposed Cross-Media Playing System 15
        3.1 Overview of the Proposed Method 15
        3.2 Offline Music Emotion Learning Model 17
        3.3 Offline Image Emotion Learning Model 32
    Chapter 4 Experimental Evaluations 42
        4.1 Experimental Environment and Datasets 42
        4.2 Evaluations for Effectiveness 45
        4.3 Experimental Discussion 51
    Chapter 5 Conclusions and Future Work 52
        5.1 Conclusion 52
        5.2 Future Work 53
    References 54
    VITA 57


    On campus: available to the public from 2018-08-29
    Off campus: not available
    The electronic thesis has not yet been authorized for public access; for the print copy, please consult the library catalog.