
Graduate student: 陳伯煒 (Chen, Bo-Wei)
Thesis title: 應用多型態語意分析及探勘技術實現結構化影片摘要之研究
Structural Video Summarization Based on the Multimodal Semantic Analysis and Mining
Advisor: 王駿發 (Wang, Jhing-Fa)
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2009
Academic year of graduation: 98 (ROC calendar)
Language: English
Number of pages: 88
Keywords (Chinese): 結構化影片摘要, 影片語意分析, 語意擴充樹, 圖分析, 圖探勘, 圖熵, 社會網路分析, 顯著性移動熵, 移動向量分析
Keywords (English): structuralized video summarization, video semantic analysis, concept expansion tree, graph analysis, graph mining, graph entropy, social network analysis, salient motion entropy, motion vector analysis
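  • In this study, we propose a new video summarization technique. Unlike conventional methods, our system converts an input video into a summary that carries structural information. This structured summary contains clues about who, what, when, where, and which objects, and related clues are linked to one another. To extract the important clues, we also apply a graph mining method to analyze the importance of the structural clues obtained, so that the segments of interest to users can be retrieved. In addition, the system adopts a novel sound feature extraction technique to classify the audio segments of the video. Users can thus grasp the gist of the video from multimodal clues.
    Because the knowledge and content in a single video are usually limited, linking it effectively to other related resources can make the original content more complete. For content augmentation, we therefore use the structured information converted from the video as a reference point and apply social network analysis to extract knowledge from multimedia information on the Web. This locates information related to the video being augmented and supplements what the original video lacks, giving users richer information.
    To improve dynamic video summarization, we propose salient motion entropy to detect the important frames in a video. To compensate for playback smoothness, the similarity between frames is computed with a new mutual information equation.
    Finally, we combine the proposed summarization technique with a practical commercial application and develop an improved video-on-demand system. Structured video information, together with support for spatially scalable video coding, lets users select related video information easily and lets the server store digital files efficiently.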

    In this dissertation, we propose a novel video summarization technique that converts a video into a structural summary. The proposed structuralized video abstracts are composed of four types of entities: who, what, where, and when. Each entity type lists its corresponding shots and their annotations, and correlated shots are connected by relational edges. To discover the important clues in a video, we employ a graph mining algorithm to extract the key components from these entities, so that users can comprehend the story quickly. In addition, novel sound feature extraction and classification are adopted to identify the audio segments in the video.
    A single video usually provides only limited information. If resources relevant to the video can be found, more complete knowledge can be generated for users. Accordingly, to augment the existing contents of a video, we combine the structuralized clues with social network analysis to retrieve media from online resources. The related media extracted in this way serve as complementary materials that enrich the original video contents.
    We also present a novel technique for generating dynamic video skims. Salient motion entropies and an improved mutual information equation are proposed to highlight the events in a video efficiently.
    Lastly, we combine the proposed technique with a commercial application and develop an enhanced video-on-demand system. With the structural information and the support of scalable video coding, our system not only facilitates video selection but also lets the server store digital files more efficiently.
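The abstract names salient motion entropy and a mutual information measure but does not define them. As a rough illustration only, the sketch below uses the textbook Shannon entropy over quantized motion-vector directions, restricted to blocks marked as salient, and the standard identity I(X;Y) = H(X) + H(Y) - H(X,Y). The function names, the 8 direction bins, and the 0.5 saliency threshold are all assumptions made for this example; the dissertation's own formulas may differ.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy in bits of a sequence of discrete labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    # p * log2(1/p) is non-negative for every observed label.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def direction_bin(dx, dy, n_bins=8):
    """Quantize a motion vector's direction into n_bins bins centred on 0, 45, 90... degrees."""
    angle = math.atan2(dy, dx) % (2 * math.pi)
    return round(angle / (2 * math.pi) * n_bins) % n_bins


def salient_motion_entropy(vectors, saliency, threshold=0.5, n_bins=8):
    """Entropy of motion directions over blocks whose saliency is at least `threshold`.

    `vectors` holds one (dx, dy) motion vector per block and `saliency` one
    score per block; both layouts are hypothetical choices for this sketch.
    """
    bins = [
        direction_bin(dx, dy, n_bins)
        for (dx, dy), s in zip(vectors, saliency)
        if s >= threshold and (dx, dy) != (0, 0)  # skip non-salient and static blocks
    ]
    return shannon_entropy(bins) if bins else 0.0


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two paired discrete sequences."""
    return shannon_entropy(xs) + shannon_entropy(ys) - shannon_entropy(list(zip(xs, ys)))
```

Under these assumptions, a frame whose salient blocks move in many different directions scores a high entropy, while two frames whose quantized features co-vary strongly share high mutual information, which is one plausible way to judge similarity between frames.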

    ABSTRACT (CHINESE)
    ABSTRACT (ENGLISH)
    ACKNOWLEDGMENT
    CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    CHAPTER 1 INTRODUCTION
      1.1 Motivation
      1.2 Background and Literature Review
        1.2.1 Video Summarization Techniques
        1.2.2 Video Content Augmentation Techniques
        1.2.3 Browsing Interfaces and Applications
      1.3 Contributions of the Dissertation
      1.4 Outline of the Dissertation
    CHAPTER 2 STRUCTURALIZED VIDEO SUMMARIZATION BASED ON SEMANTIC CONCEPT ENTITIES
      2.1 Introduction
      2.2 Mapping Visual Contents to Text
        2.2.1 Visual and Textual Content Pre-Analysis
        2.2.2 Maximum Entropy Criterion-Based Annotator
      2.3 Concept Expansion
        2.3.1 Background
        2.3.2 Constructing Trees
        2.3.3 Dependency Degree Function
      2.4 Structuralizing Video Contents Using Multimodal Information
        2.4.1 Annotation Classification Using WordNet
        2.4.2 Building Vertices in the Relational Graph
        2.4.3 Building Relations in the Relational Graph
        2.4.4 Mining Important Vertices and Edges
        2.4.5 Audio Feature Classification
      2.5 Experimental Results
        2.5.1 Concept Expansion and Word Relations
        2.5.2 Informativeness and Interrelation
        2.5.3 Contextual Information Among Shots
        2.5.4 Adaptation of the Proposed System to the Traditional Storyboard
        2.5.5 Discussions About the Browsing and Indexing Interface
      2.6 Summary
    CHAPTER 3 VIDEO KNOWLEDGE AUGMENTATION USING STRUCTURALIZED SEMANTIC CONTENTS
      3.1 Introduction
      3.2 Graph-Organizing via Spatial-Temporal Modeling
      3.3 Content Augmentation Based on the Social Network Analysis
        3.3.1 Constructing the Fundamental Social Network
        3.3.2 Converting to the Line Graph
        3.3.3 Mapping Back to the Vertex Graph
        3.3.4 Calculating Relevant Clusters
      3.4 Experimental Results
        3.4.1 Associations Between the Augmented and Structuralized Contents
        3.4.2 Evaluation of the Contextual Features in the Social Network Analysis
        3.4.3 Discussions of Time Complexity
      3.5 Summary
    CHAPTER 4 STRUCTURALIZED CONTEXT-AWARE VIDEO CONTENT FOR VOD SERVICES
      4.1 Introduction
      4.2 Processing Context-Aware Contents
      4.3 Scalable Resolution Support
      4.4 Experimental Results
      4.5 Summary
    CHAPTER 5 SPORTS VIDEO SUMMARIZATION BASED ON THE SALIENT MOTION AND INFORMATION ANALYSIS
      5.1 Introduction
      5.2 Saliency Map Extraction
      5.3 Salient Motion Entropy
      5.4 Mutual Information Based on Salient Motions
      5.5 Experimental Results
      5.6 Summary
    CHAPTER 6 CONCLUSION
      6.1 Summary
      6.2 Future Work
    REFERENCES
    PUBLICATION LIST


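    Full-text availability: on campus, open from 2019-12-31; off campus, not open.
    The electronic thesis has not been authorized for public release; for the print copy, please consult the library catalog.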