
Graduate student: 陳震宇 (Chen, Chen-Yu)
Thesis title: 新聞、運動與監控視訊摘要方法之研究
A Study on Video Summarization Approaches for News, Sports, and Surveillance
Advisor: 王駿發 (Wang, Jhing-Fa)
Degree: Doctor
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: English
Number of pages: 90
Keywords: Video Summarization, Distributed News Video Servers, Entropy-based Motion Feature Extraction, Heteroscedastic Error Model, State Transition Support Vector Machine, Human Behavior Identification
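  • With the arrival of the multimedia era, video summarization has become an important topic: by browsing a summary video, users can quickly grasp what the original video is meant to convey. This dissertation proposes methods for three video types that are currently in wide demand: a news video summarization method, a sports video summarization method, and an abnormal behavior identification method for surveillance video summarization.
    For news video summarization, this dissertation proposes a method built on a distributed architecture comprising news video processing servers and a video querying/browsing server. Through cooperation between the two kinds of servers, the approach reduces the storage required for news video, the computation needed to generate news story abstracts, and the network bandwidth needed to transmit video over the Internet.
    In sports video, the movement of objects (people) is often an extremely important cue for summarization. However, judging whether a segment contains an event from object motion alone is questionable, because camera movement also produces a large amount of motion and often causes the system to misjudge. To address this problem, this dissertation proposes an entropy-based motion feature that reduces the motion variation caused by camera movement. The feature considers not only the magnitude of motion but also the entropy of the motion vectors in the video: if the motion vectors of a segment have high entropy, the segment is more likely to contain an important event, and less likely otherwise.
    Surveillance systems are widely used in public places and in home security systems. Such systems store recorded video on storage devices and therefore need large storage capacity; surveillance video summarization arose to reduce this demand. Within a surveillance video summarization system, abnormal behavior detection is particularly important: when a special event occurs, the system can raise a warning or store only the video around the event, which reduces storage requirements. This dissertation therefore develops a human behavior identification system for abnormal behavior detection. The system consists of several consecutive states, each of which recognizes a segment of a specific behavior, and the relationships between states are defined by a Markov random field. By combining several states, the system can identify surveillance video whose content changes over time.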

    It is easy to watch video in its most common representation, namely sequential playback. However, browsing, manipulating, and editing video are tedious. In this thesis, we investigate a distributed news video server architecture, an entropy-based motion feature, and a state transition SVM for news video summarization, sports video summarization, and surveillance video summarization, respectively.
    The distributed news video summarization architecture includes news video processing (NVP) servers and a video querying/browsing (VQB) server. To reduce the storage size for news video, the computation cost of story abstract generation, and the required Internet bandwidth, this work proposes an efficient news video querying/browsing system based on distributed news video servers. The system works well because the distributed NVP servers cooperate with the VQB server: each news story abstract is generated by the corresponding NVP server and sent to the VQB server for user querying/browsing.
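    The division of work between the two kinds of servers can be pictured with a minimal sketch. Everything named here (`StoryAbstract`, `NVPServer`, `VQBServer`, and the selection and matching rules) is hypothetical and chosen for illustration only; in particular, the placeholder rules in `summarize` and `query` are not the key sentence extraction, key frame extraction, or LSA-based similarity measure of the dissertation.

```python
from dataclasses import dataclass


@dataclass
class StoryAbstract:
    """Compact abstract sent to the VQB server instead of the full video."""
    story_id: str
    source_server: str
    key_sentences: list
    key_frames: list  # frame indices only, not pixel data


class NVPServer:
    """Hypothetical news video processing server: summarizes locally."""

    def __init__(self, name):
        self.name = name

    def summarize(self, story_id, sentences, frame_count):
        # Placeholder rules (assumption): first sentence, middle frame.
        return StoryAbstract(story_id, self.name,
                             sentences[:1], [frame_count // 2])


class VQBServer:
    """Hypothetical querying/browsing server: stores abstracts only."""

    def __init__(self):
        self.index = {}

    def receive(self, abstract):
        self.index[abstract.story_id] = abstract

    def query(self, word):
        # Plain substring match as a stand-in for a real similarity measure.
        word = word.lower()
        return [a.story_id for a in self.index.values()
                if any(word in s.lower() for s in a.key_sentences)]


vqb = VQBServer()
vqb.receive(NVPServer("nvp-a").summarize(
    "s1", ["Typhoon approaches the coast.", "Schools close."], 300))
vqb.receive(NVPServer("nvp-b").summarize(
    "s2", ["Election results announced."], 120))
print(vqb.query("typhoon"))  # ['s1']
```

    Only the small abstract objects cross the network in this sketch, which is the property the paragraph above attributes to the architecture: the heavy video data and the summarization work stay on the NVP servers.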
    An entropy-based motion feature is extracted from sports video that accounts for both motion magnitude and motion directivity. The proposed feature reduces the effect of camera motion and is effective for summarizing a variety of sports video, such as soccer, tennis, and basketball. Once a sports video is represented as a sequence of entropy motion values, meaningful events can be segmented using the heteroscedastic error model.
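    The intuition behind an entropy-based motion feature can be illustrated with a short sketch. This record does not give the exact feature definition used in the dissertation, so the version below is an assumption: it computes the Shannon entropy of a direction histogram of block motion vectors, weighting each vector by its magnitude. The function name `motion_entropy` and the choice of 8 direction bins are illustrative.

```python
import math


def motion_entropy(vectors, bins=8):
    """Shannon entropy (bits) of a magnitude-weighted direction histogram.

    vectors: iterable of (dx, dy) motion vectors for one frame.
    """
    hist = [0.0] * bins
    for dx, dy in vectors:
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue  # static blocks have no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[idx] += mag
    total = sum(hist)
    if total == 0.0:
        return 0.0
    # p * log2(1/p), summed over the non-empty bins
    return sum((h / total) * math.log2(total / h) for h in hist if h > 0)


# Camera pan: every block moves the same way, so directions are uniform.
pan = [(4.0, 0.0)] * 16
# Scattered motion: one unit vector in each of the 8 direction bins.
scattered = [(math.cos(k * math.pi / 4 + 0.1), math.sin(k * math.pi / 4 + 0.1))
             for k in range(8)]

print(motion_entropy(pan))        # 0.0
print(motion_entropy(scattered))  # close to 3.0 (log2 of 8 bins)
```

    Under this assumed definition, a pan produces a large motion magnitude but zero entropy, while motion spread over many directions scores high, which matches the claim that the feature is less sensitive to camera motion than magnitude alone.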
    To improve the performance of intelligent surveillance systems, we developed a human behavior identification module that increases efficiency by integrating visual and contour information. The state transition support vector machine (STSVM) is used to identify human behavior, which unfolds continuously over time. The STSVM assumes that a human activity is composed of several successive states. Each state is modeled by an individual 2-class SVM, and the transition probabilities between consecutive states are calculated using Markov random field (MRF) theory.
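    How per-state scores and transition probabilities can be combined into a state sequence is sketched below. The record does not say how the dissertation performs this step; the sketch assumes Viterbi decoding (the bibliography lists the Viterbi algorithm as [84]), takes the per-frame state probabilities as given numbers standing in for calibrated outputs of the per-state 2-class SVMs, and uses made-up transition values. The name `best_state_path` and all numbers are hypothetical.

```python
import math


def _log(p):
    """Log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def best_state_path(state_probs, trans, start):
    """Most likely state sequence by Viterbi decoding in the log domain.

    state_probs[t][s]: probability that frame t belongs to state s
                       (stand-in for a per-state SVM output).
    trans[i][j]:       probability of moving from state i to state j.
    start[s]:          probability of starting in state s.
    """
    n = len(start)
    score = [_log(start[s]) + _log(state_probs[0][s]) for s in range(n)]
    back = []
    for t in range(1, len(state_probs)):
        new_score, ptr = [], []
        for j in range(n):
            i_best = max(range(n), key=lambda i: score[i] + _log(trans[i][j]))
            new_score.append(score[i_best] + _log(trans[i_best][j])
                             + _log(state_probs[t][j]))
            ptr.append(i_best)
        score = new_score
        back.append(ptr)
    # Trace the best final state back to the first frame.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Three successive states of one activity, left-to-right transitions only.
trans = [[0.6, 0.4, 0.0],
         [0.0, 0.6, 0.4],
         [0.0, 0.0, 1.0]]
start = [1.0, 0.0, 0.0]
state_probs = [[0.9, 0.1, 0.1],
               [0.8, 0.3, 0.1],
               [0.2, 0.7, 0.2],
               [0.1, 0.8, 0.3],
               [0.1, 0.3, 0.9],
               [0.1, 0.2, 0.9]]

print(best_state_path(state_probs, trans, start))  # [0, 0, 1, 1, 2, 2]
```

    The decoded path passes through the states in order, which is the sense in which an activity is treated as a chain of successive states rather than as a single classification per frame.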
    Each proposed approach and its experimental results are described in detail in the dissertation.

    CONTENTS
    Chapter 1 Introduction
      1.1 Video Summarization
        1.1.1 Review of News Video Summarization
        1.1.2 Review of Sport Video Summarization
        1.1.3 Review of Surveillance Video Summarization
      1.2 Motivation
      1.3 Purposes and Research Approaches
      1.4 Organization of this Dissertation
    Chapter 2 Efficient News Video Querying and Browsing Based on Distributed News Video Servers
      2.1 Introduction to Distributed News Video Servers
      2.2 Visualized Querying/Browsing Server
        2.2.1 Story Vector Construction
        2.2.2 LSA-based Story Vector
        2.2.3 Dynamic Hierarchical Tree Construction
        2.2.4 LSA-based Similarity Measure
      2.3 Distributed News Video Preprocessing Server
        2.3.1 Key Sentence Extraction
        2.3.2 Key frame Extraction
      2.4 Experimental Results
        2.4.1 Data Description and Experimental Setup
        2.4.3 Key Frame Extraction Experiment
        2.4.4 News Story Clustering Experiment
      2.5 Summary
    Chapter 3 High Motion Event-based Segmentation of Sports Video Using Motion Entropy and Heteroscedastic Error Model
      3.1 Introduction to High Motion Event-based Segmentation
      3.2 Entropy-based Motion Analysis
      3.3 High Motion Event-based Segmentation
      3.4 Motion Pattern Recognition
      3.5 Experimental Results
      3.6 Summary
    Chapter 4 Human Behavior Identification for Intelligent Visual Surveillance Based on State Transition Support Vector Machine
      4.1 Introduction to Human Behavior Identification
      4.2 Preprocessing
      4.3 Our Methodology
        4.3.1 Conventional Single Support Vector Machine
        4.3.2 State Transition Support Vector Machine
          4.3.2.1 Generation of the State Probabilities in Identification Phase
          4.3.2.2 Generation of the State Transition Probabilities in Training Phase
      4.4 Experimental Results
        4.4.1 The Implementation System
        4.4.2 Experiments on Human Behavior Identification
      4.5 Summary
    Chapter 5 Conclusions
      5.1 Principal Contributions
      5.2 Future Directions
        5.2.1 Video Summarization and Scale Video Coding
        5.2.2 Video Summarization and SoC Design
    Bibliography
    Publication List
    Vita

    LIST OF TABLES
      Table 2.1 Statistics for the tested news video
      Table 2.2 The evaluation of key sentence extraction I
      Table 2.3 The evaluation of key sentence extraction II
      Table 2.4 The evaluation of five key frame extraction methods
      Table 2.5 The relationship between MOS values and computation cost in various window sizes
      Table 2.6 The experimental result of the news story clustering
      Table 2.7 Query performance of the proposed system under LSA space
      Table 3.1 Characteristics of the Data Set
      Table 3.2 Parameter Settings of Kernel Functions in Our Experiments
      Table 3.3 Evaluation on Event Detection
      Table 3.4 Results between Original [53] and Modified Algorithm for Change Point Detection in Soccer
      Table 4.1 Evaluation on Human Behavior Identification Using Single SVM, Multi-Stage SVM, and State Transition SVMs
      Table 4.2 Confusion Matrix of Five Kinds of Human Behaviors

    LIST OF FIGURES
      Figure 1.1 A basic architecture of news video summarization
      Figure 1.2 A framework for sport video summarization
      Figure 1.3 A basic architecture of surveillance video summarization
      Figure 2.1 Concept illustration of the proposed architecture
      Figure 2.2 The visualized querying/browsing interface of the proposed system
      Figure 2.3 Block diagram of the VQB server
      Figure 2.4 Concept illustration the dynamic hierarchical tree
      Figure 2.5 Block diagram of the NVP server
      Figure 2.6 Concept illustration of the key sentence extraction algorithm
      Figure 2.7 Illustration of the key frame extraction based on non-overlapping windows
      Figure 2.8 Examples of the extracted key frames
      Figure 2.9 Comparison with other existing methods for key frame extraction
      Figure 3.1 Block diagram and system flowchart of the proposed architecture
      Figure 3.2 Motion vector examples
      Figure 3.3 An example of the EMV and PME curves generated by the same sports video sequence
      Figure 3.4 The relationship between the parameter settings and their corresponding convergent mean square errors
      Figure 3.5 Example of detected change points using the heteroscedastic error model
      Figure 3.6 An example of the motion pattern
      Figure 3.7 The receiver operating characteristics (ROC) curve of EMV and PME motion feature
      Figure 3.8 Extracted key frames uses entropy-based motion analysis
      Figure 4.1 Block diagram of the proposed framework for human behavior identification
      Figure 4.2 An example for morphological opening processing
      Figure 4.3 An example of moving area extraction
      Figure 4.4 An example of a person’s silhouette sequence for the raising hand behavior
      Figure 4.5 Illustration of the STSVM
      Figure 4.6 Illustration of two localized contour sequences generated by the LCS approach
      Figure 4.7 The invariant characteristic of similar contours generated by the LCS approach
      Figure 4.8 Our implementation system for human behavior identification based on the proposed STSVM
      Figure 5.1 Basic architecture of video summarization with scalable video coding

    [1] B. Li and M. I. Sezan, “Event detection and summarization in sports video,” in Proc. IEEE Workshop on Content-based Access of Image and Video Libraries, pp. 132-138, 2001.

    [2] S. M. Iacob, R. L. Lagendijk, and M. E. Iacob, “Video abstraction based on asymmetric similarity values,” in Proc. SPIE Multimedia Storage and Archiving Systems IV, vol. 3846, pp. 181-191, 1999.

    [3] M. Cooper and J. Foote, “Summarizing video using non-negative similarity matrix factorization,” in Proc. IEEE Workshop on Multimedia Signal Processing, pp. 25-28, 2002.

    [4] L. Chaisorn, T. S. Chua, and C. H. Lee, “The segmentation of news video into story units,” in Proc. IEEE Int. Conf. Multimedia and Expo, pp. 73-76, 2002.

    [5] N. E. O'Connor, S. Marlow, N. Murphy, A. F. Smeaton, P. Browne, S. Deasy, H. Lee, and K. McDonald, “Fischlar: an on-line system for indexing and browsing broadcast television content,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, pp. 1633-1636, May 2001.

    [6] H. T. Chen, D. Y. Chen, and S. Y. Lee, “Object based video similarity retrieval and its application to detecting anchorperson shots in news video,” in Proc. Fifth International Symposium on Multimedia Software Engineering, pp. 172-179, 2003.

    [7] X. Gao and X. Tang, “Unsupervised video-shot segmentation and model-free anchorperson detection for news video story parsing,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 9, pp. 765-776, Sep. 2002.

    [8] A. Hanjalic, R. L. Lagendijk, and J. Biemond, “Template-based detection of anchorperson shots in news programs,” in Proc. 1998 Int. Conf. Image Processing, pp. 148-152, Oct. 1998.

    [9] W. Qi, L. Gu, H. Jiang, X. R. Chen, and H. J. Zhang, “Integrating visual, audio and text analysis for news video,” in Proc. Int. Conf. Image Processing, pp. 520-523, Sept. 2000.

    [10] S. Raaijmakers, J. den Hartog, and J. Baan, “Multimodal topic segmentation and classification of news video,” in Proc. IEEE Int. Conf. Multimedia and Expo, pp. 33-36, Aug. 2002.

    [11] R. S. Jasinschi, N. Dimitrova, T. McGee, L. Agnihotri, J. Zimmerman, and D. Li, “Integrated multimedia processing for topic segmentation and classification,” in Proc. Int. Conf. Image Processing, pp. 366-369, Oct. 2001.

    [12] W. Zhu, C. Toklu, and S. P. Liou, “Automatic news video segmentation and categorization based on closed-captioned text,” in Proc. IEEE Int. Conf. Multimedia and Expo, pp. 829-832, Aug. 2001.

    [13] Y. Gong, L. T. Sin, C. H. Chuan, H. Zhang, and M. Sakauchi, “Automatic parsing of TV soccer programs,” in Proc. Int. Conf. Multimedia Computing and Systems, pp. 167-174, May 1995.

    [14] D. Yow, B-L. Yeo, M. Yeung and B. Liu, “Analysis and presentation of soccer highlights from digital video,” in Proc. Asian Conf. on Comp. Vision (ACCV), 1995.

    [15] D. Zhong and S. F. Chang, “Spatio-temporal video search using the object-based video representation,” in Proc. IEEE Int. Conf. Image Processing, vol. 1, pp. 21–24, Oct. 1997.

    [16] H. Denman, N. Rea and A. Kokaram, “Content based analysis for video from snooker broadcasts,” Journal of Computer Vision and Image Understanding (CVIU), pp. 141-306, 2002.

    [17] N. Rea, R. Dahyot and A. Kokaram, “Semantic event detection in sports through motion understanding,” Int. Conf. Image and Video Retrieval (CIVR), July 2004.

    [18] H. C. Shih and C. L. Huang, “Content-based multi-functional video retrieval system,” Int. Conf. Computers in Education, pp. 383-384, Jan. 2005.

    [19] T. Y. Liu, W. Y. Ma and H. J. Zhang, “Effective feature extraction for play detection in American football video,” in Proc. Conf. Multimedia Modelling, pp. 164-171, 2005.

    [20] M. J. Black and P. Anandan, “The robust estimation of multiple motions: parametric and piecewise-smooth flow fields,” Computer Vision and Image Understanding, vol. 63, no. 1, pp. 75-104, 1996.

    [21] J. M. Odobez and P. Bouthemy, “Robust multiresolution estimation of parametric motion models,” Journal of Visual Communication and Image Representation, vol. 6, no. 4, pp. 348-365, 1995.

    [22] P. Bouthemy et al., “A unified approach to shot change detection and camera motion characterization,” IEEE Trans. Circuits Syst. Video Technol., vol. 9, pp. 1030-1044, 1999.

    [23] T. Liu, H. J. Zhang, and F. Qi, “A novel video key-frame-extraction algorithm based on perceived motion energy model,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 10, pp. 1006-1013, Oct. 2003.

    [24] N. Rota and M. Thonnat, “Video sequence interpretation for visual surveillance,” in Proc. the 3rd IEEE International Workshop on Visual Surveillance, pp. 59-68, 2000.

    [25] R. Cutler and M. Turk, “View-based interpretation of real-time optical flow for gesture recognition,” University of Maryland, College Park Report, 1997.

    [26] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, “Pfinder: Real-time tracking of the human body,” IEEE Trans. Pattern Anal. Machine Intell., vol. 19, no. 7, pp. 780-785, July 1997.

    [27] C. Schuldt, I. Laptev, and B. Caputo, “Recognizing human actions: a local SVM approach,” in Proc. IEEE Conf. Pattern Recognition, pp. 32-36, Aug. 2004.

    [28] J. Ben-Arie, Z. Wang, P. Pandit, and S. Rajaram, “Human activity recognition using multidimensional indexing,” IEEE Trans. Pattern Anal. Machine Intell., vol. 24, no. 8, pp. 1091-1104, Aug. 2002.

    [29] D. Heesch, M. J. Pickering, S. Ruger, and A. Yavlinsky, “Video retrieval using search and browsing with key frames,” in Proc. TREC Video Retrieval Evaluation (TRECVID), 2003.

    [30] C. G. M. Snoek, M. Worring, J. M. Geusebroek, D. C. Koelma, and F. J. Seinstra, “The MediaMill TRECVID 2004 semantic video search engine,” in Proc. TREC Video Retrieval Evaluation (TRECVID), 2004.

    [31] D. Heesch, P. Howarth, J. Magalhaes, A. May, M. Pickering, A. Yavlinsky, and S. Ruger, “Video retrieval using search and browsing,” in Proc. TREC Video Retrieval Evaluation (TRECVID), 2004.

    [32] N. Serpanos and A. Bouloutas, “Centralized versus distributed multimedia servers,” IEEE Trans. Circuits Syst. Video Technol., vol. 10, no. 8, pp. 1438-1449, Dec. 2000.

    [33] S. A. Barnett and G. J. Anido, “A Cost Comparison of Distributed and Centralized Approaches to Video-on-Demand,” IEEE Journal on Selected Areas in Communications, vol. 14, no. 6, pp. 1173-1183, Aug 1996.

    [34] K. Tanaka, H. Sakamoto, H. Suzuki, and K. Nishimura, “Distributed Architecture for Large-scale Video Servers,” in Proc. Int. Conf. Information, Communications and Signal Processing, pp. 578-583, Sep. 1997.

    [35] K. Kondo, Y. Arai, and F. Kozato, “A method of summarizing used paraphrasing a part of text,” Technical Report of IEICE, NLC95-62, pp. 25-30, 1995.

    [36] F. Ren, S. Li, and K. Kita, “Automatic abstracting important sentences of Web articles,” in Proc. IEEE Int. Conf. System, Man, and Cybernetics, pp. 1705-1710, 2001.

    [37] F. Ren and Y. Sadanaga, “An automatic extraction of important sentences using statistical information and structural feature,” Natural Language, no. 125, pp. 71-78, May 1998.

    [38] K. Otsuji, Y. Tonomura, and Y. Ohba, “Video browsing using brightness data,” in Proc. SPIE Visual Communications and Image Processing, vol. 1606, pp. 980-985, Nov. 1991.

    [39] Y. Tonomura, A. Akutsu, Y. Taniguchi, and G. Suzuki, “Structured video computing,” IEEE Multimedia, vol. 1, no. 3, pp. 34-43, 1994.

    [40] H. J. Zhang, A. Kankanhalli, and S. W. Smoliar, “Automatic partitioning of full-motion video,” Multimedia Systems, vol. 1, no. 1, pp. 10-28, June 1993.

    [41] Y. Zhuang, Y. Rui, T. S. Huang, and S. Mehrotra, “Adaptive key frame extraction using unsupervised clustering,” in Proc. Int. Conf. Image Processing, pp. 866-870, Oct. 1998.

    [42] B. Gunsel and A. M. Tekalp, “Content-based video abstraction,” in Proc. Int. Conf. Image Processing, pp. 128-132, Oct. 1998.

    [43] M. M. Yeung and B. Liu, “Efficient matching and clustering of video shots,” in Proc. Int. Conf. Image Processing, pp. 338-341, Oct. 1995.

    [44] X. Sun, M. S. Kankanhalli, Y. Zhu, and J. Wu, “Content-based key frame extraction for digital video,” in Proc. IEEE Int. Conf. Multimedia Computing and Systems, pp. 190-193, 1998.

    [45] W. Wolf, “Key frame selection by motion analysis,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, pp. 1228-1231, 1996.

    [46] T. Liu, H. J. Zhang, and F. Qi, “A novel video key-frame-extraction algorithm based on perceived motion energy model,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 10, pp. 1006-1013, Oct. 2003.

    [47] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing and Management, vol. 24, no. 5, pp. 513-523, 1988.

    [48] T. K. Landauer, P. W. Foltz, and D. Laham, “Introduction to latent semantic analysis,” Discourse Processes, vol. 25, pp. 259-284, 1998.

    [49] P. W. Foltz, “Latent semantic analysis for text-based research,” Behavior Research Methods, Instruments and Computers, vol. 28, no. 2, pp. 197-202, 1996.

    [50] M. W. Peter, “How latent is latent semantic analysis?,” in Proc. Int. Joint Conf. Artificial Intelligence, pp. 932-941, 1999.

    [51] Y. Zhao and G. Karypis, “Evaluation of hierarchical clustering algorithms for document datasets,” in Proc. the eleventh Int. Conf. Information and knowledge management, pp. 515-524, 2002.

    [52] F. Coldefy and P. Bouthemy, “Unsupervised soccer video abstraction based on pitch, dominant color and camera motion analysis,” in Proc. ACM Multimedia, pp. 268-271, Oct. 2004.

    [53] V. Guralnik and J. Srivastava, “Event detection from time series data,” in Proc. ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, pp. 33-42, 1999.

    [54] S. Suthaharan, S. W. Kim, H. K. Lee, and K. R. Rao, “Perceptually Tuned Video Watermarking Scheme using Motion Entropy Masking,” in Proc. IEEE Region 10 Conf. (TENCON), pp. 182-185, Sep. 1999.

    [55] Y. Ma and H. J. Zhang, “A new perceived motion based shot content representation,” in Proc. IEEE Int. Conf. Image Processing, pp. 426-429, Oct. 2001.

    [56] J. Zan, M. O. Ahmad, and M. N. S. Swamy, “A multiresolution motion estimation technique with indexing,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 2, pp. 157-165, Feb. 2006.

    [57] D. M. Hawkins and D. F. Merriam, “Optimization of digitized sequential data,” Mathematical Geology, pp. 389-395, 1973.

    [58] S. B. Guthery, “Pattern regression,” Amer. Statist. Ass., pp. 945-947, 1974.

    [59] D. M. Hawkins, “Point estimation of parameters of piecewise regression models,” The Journal of the Royal Statistical Society Series C, pp. 51-57, 1976.

    [60] A. Agresti, Categorical data analysis, John Wiley & Sons, 1990.

    [61] A. J. Smola and B. Scholkopf, “A tutorial on support vector regression,” Statistics and Computing, vol. 14, no. 3, pp. 199-222, 2004.

    [62] R. Courant and D. Hilbert, Methods of mathematical physics, Interscience Publishers, 1953.

    [63] S. Shafer et al., “The new EasyLiving project at Microsoft research,” in Proc. DARPA/NIST Smart Spaces Workshop, pp. 127-130, 1998.

    [64] B. Johanson and A. Fox, “Extending tuplespaces for coordination in interactive workspaces,” Journal of Systems and Software, vol. 69, no. 3, pp. 243-266, Jan. 2004.

    [65] Philips Home Lab: http://www.philips.com/research/ami.

    [66] H. Mizoguchi, T. Sato and T. Ishikawa, “Robotic office room to support office work human behavior understanding function with networked machines,” in Proc. IEEE Conf. Robotics and Automation, pp. 2968-2975, 1996.

    [67] F. Doctor, H. Hagras and V. Callaghan, “A fuzzy embedded agent-based approach for realizing ambient intelligence in intelligent inhabited environments,” IEEE Trans. Systems, Man, and Cybernetics, Part A: Systems and Humans, vol. 35, no. 1, pp. 55-65, Jan. 2005.

    [68] Z. Cheng, Q. Han, S. Sun, M. Kansen, T. Hosokawa, T. Huang and A. He, “A proposal on a learner’s context-aware personalized education support method based on principles of behavior science,” in Proc. IEEE Conf. Advanced Information Networking and Application, pp. 341-345, 2006.

    [69] A. Butz and A. Kruger, “Applying the peephole metaphor in a mixed-reality room,” IEEE Trans. Computer, vol. 1, no. 26, pp. 56-63, Jan. 2006.

    [70] J. Rehg and T. Kanade, “Model-based tracking of self-occluding articulated objects,” in Proc. Int. Conf. Computer Vision, pp. 612-617, 1995.

    [71] S. X. Ju, M. J. Black, and Y. Yacoob, “Cardboard people: A parameterized model of articulated motion,” in Proc. IEEE Conf. Automatic Face and Gesture Recognition, pp. 38-44, 1996.

    [72] H. Sidenbladh, M. Black, and D. Fleet, “Stochastic tracking of 3d human figures using 2d image motion,” in European Conf. Computer Vision, pp. 702-718, 2000.

    [73] K. Grauman, G. Shakhnarovich, and T. Darrell, “Inferring 3d structure with a statistical image-based shape model,” in Proc. Int. Conf. Computer Vision, pp. 641-647, 2003.

    [74] I. Haritaoglu, D. Harwood, and L. S. Davis, “W4: Real-time surveillance of people and their activities,” IEEE Trans. Pattern Anal. Machine Intell., vol. 22, no. 8, pp. 809-830, Aug. 2000.

    [75] S. McKenna, S. Jabri, Z. Duric, A. Rosenfeld, and H. Wechsler, “Tracking groups of people,” Comput. Vis. Image Understanding, vol. 80, no. 1, pp. 42-56, 2000.

    [76] C. Stauffer and W. Grimson, “Adaptive background mixture models for real-time tracking,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 246-252, 1999.

    [77] A. Prati, I. Mikic, M. Trivedi, and R. Cucchiara, “Detecting moving shadows: algorithms and evaluation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 7, pp. 918-923, July 2003.

    [78] V. Vapnik, Statistical Learning Theory, New York: Wiley, 1998.

    [79] R. Courant and D. Hilbert, Methods of mathematical physics, Interscience Publishers, 1953.

    [80] A. Ganapathiraju, J. E. Hamaker, and J. Picone, “Applications of support vector machines to speech recognition,” IEEE Transactions on Signal Processing, vol. 52, no. 8, pp. 2348-2355, Aug. 2004.

    [81] M. I. Jordan, Ed., Learning in graphical models, MIT Press, 1999.

    [82] C. M. Bishop, Pattern recognition and machine learning, Springer, 2006.

    [83] L. Gupta and S. Ma, “Gesture-based interaction and communication: automated classification of hand gesture contours,” IEEE Trans. Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 31, no. 1, pp. 114-120, Feb. 2001.

    [84] G. D. Forney, “The Viterbi algorithm,” Proc. the IEEE, vol. 61, no. 3, pp. 268-278, Mar. 1973.

    Full text, on campus: publicly available from 2008-10-26
    Full text, off campus: publicly available from 2008-10-26