| Graduate student: | 劉晉德 Liu, Chin-De |
|---|---|
| Thesis title: | 以視訊影像之互動與事件偵測進行人類行為分析系統 Human Behavior Analysis with Interaction and Event Detection from Video Streams |
| Advisor: | 詹寶珠 Chung, Pau-Choo |
| Degree: | Doctor |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2008 |
| Academic year of graduation: | 96 (ROC calendar) |
| Language: | English |
| Pages: | 77 |
| Keywords (English): | interaction recognition, behavior recognition, activity recognition, HMM |
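As average human life expectancy rises, health care for the elderly has become an important social issue. How to use technology to improve the daily health care of the elderly is a pressing question. Because the demand for elderly care is growing, more long-term care centers are being established each year. Some of these centers have installed video surveillance systems to watch the residents' daily activities and safeguard their lives and health. Such a system requires a dedicated person to watch the monitors for long periods so that occasional accidents are not missed. Although this is a heavy burden on staff, the unpredictability of accidents makes it an unavoidable measure. Besides accidents, care centers also need to pay attention to unusual behavior among the elderly. Detecting such behavior requires long-term records combined with spatial and temporal information. A technique that can learn over long periods and discover accidents and unusual behavior would therefore greatly help automate accident alarms in surveillance systems.
This thesis proposes an IE-HMM (Interaction Embedded Hidden Markov Model) for behavior analysis. The behaviors covered include person-to-person interactions and individual behaviors. The IE-HMM consists of three main analysis modules: Switch Control (SC), Individual Duration HMM (IDHMM), and Interaction Coupled Duration HMM (ICDHMM). Each independent individual behavior or interaction is treated as a behavior unit, and the role of SC is to find every behavior unit. SC builds an interaction-state analysis for every pair of people; the interaction state decides whether an interaction exists based on the relative distance between the two people and how long it lasts. Among all interaction states, those with overlapping members are merged into a group interaction. The people in each interaction behavior unit found this way are passed to the ICDHMM for interaction recognition. Interaction recognition is based on the body interaction between people and the duration of the interaction. To consider both factors together, the ICDHMM is designed as a two-layer HMM: the first layer handles the interaction duration and the second layer handles the body interaction. To accommodate the correlation between people's body movements, the thesis uses a Coupled HMM to recognize their mutual body actions.
IE-HMM passes each individual behavior unit to a Hierarchical Context Hidden Markov Model (HC-HMM) for individual behavior recognition. Interpreting an individual behavior requires reference to time, space, and the content of the action. To combine these three factors, the HC-HMM is a hierarchical HMM: the first layer takes spatial context as input and forms the basis for recognition, the second layer takes the action as input to narrow the range of candidate behaviors, and finally the duration is used as input to decide the recognition result. With this design, IE-HMM can recognize behaviors that differ in space, action, and time, and can recognize multiple interactions and individual behaviors at the same time.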
Due to the lengthening of human life expectancy, daily health care for the elderly has become one of the most critical issues in our society. As such, how to use current technology to improve the well-being of the elderly in daily life has become increasingly important. Because of the growing need for elderly care, a number of nursing centers are being established, some of which have installed cameras in every bedroom and hallway to monitor the elderly and protect them from unexpected accidents. However, this approach requires a dedicated person watching all of the screens at all times, which is a heavy human burden and cannot rule out occasional human negligence. Furthermore, monitoring abnormal behaviors should take into account past behavior history and the context in which an event occurs. Thus an approach that can understand elderly people's daily behaviors from video sequences would provide great assistance in monitoring them.
This paper presents an IE-HMM (Interaction Embedded Hidden Markov Model) for understanding behaviors, including individual behaviors and interactions. IE-HMM is composed of Switch Control (SC), Individual Duration HMM (IDHMM), and Interaction Coupled Duration HMM (ICDHMM). To recognize multiple independent behaviors in a scene, each independent person or group of people is taken as an atomic behavior unit, of which there are two types: individual behavior units and interaction behavior units. To determine the atomic behavior units and their participants, SC dynamically creates a two-person interaction detection component and applies a behavior unit decision component. The individual behavior units are sent to IDHMM, which contains a number of Hierarchical Context Hidden Markov Models (HC-HMMs), each of which infers elderly behaviors through three contexts: spatial, activity, and temporal. Following this hierarchical architecture, HC-HMM builds three modules from the three components and reasons over them in primary and secondary relationships. The spatial contexts are defined by the spatial structure, so they are placed first as the primary inference contexts. The temporal duration is associated with elderly activities, so activities are placed after the spatial contexts and the temporal duration is placed after activities. Between the spatial context reasoning and the activity-based behavior reasoning, a modified duration HMM is applied to extract activities. With this design, human behaviors that differ in spatial context are distinguished in the first module, behaviors that differ in activities are determined in the second module, and the third module recognizes behaviors that differ in temporal duration. The interaction behavior units are sent to the ICDHMM, which contains a number of CDHMMs, each used for interaction recognition.
An ICDHMM is a two-layer HMM, where the top layer handles the interaction durations and the bottom layer handles the participants' physical correlation. As a result, IE-HMM can recognize individual behaviors and interactions while taking into account reasonable duration ranges and body poses. It can recognize multiple independent interactions and handle complex compositions of atomic behavior units.
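
The Switch Control step in the abstract (pairwise interaction detection from relative distance and duration, then merging pairs that share members into group units) can be illustrated with a small sketch. This is not the thesis implementation: the function names `interacting_pairs` and `behavior_units`, the thresholds `max_dist` and `min_frames`, and the use of 2-D ground-plane positions per frame are all assumptions made for illustration, and union-find is simply one convenient way to merge overlapping pairs.

```python
from itertools import combinations
from math import hypot


def interacting_pairs(tracks, max_dist=1.5, min_frames=3):
    """Return pairs of person ids that stay within max_dist of each other
    for at least min_frames consecutive frames.

    tracks maps a person id to a list of (x, y) positions, one per frame.
    Both thresholds are illustrative assumptions, not values from the thesis.
    """
    pairs = set()
    for a, b in combinations(sorted(tracks), 2):
        run = 0  # length of the current run of "close" frames
        for (ax, ay), (bx, by) in zip(tracks[a], tracks[b]):
            run = run + 1 if hypot(ax - bx, ay - by) <= max_dist else 0
            if run >= min_frames:
                pairs.add((a, b))
                break
    return pairs


def behavior_units(people, pairs):
    """Merge pairs with overlapping members into interaction units.

    Returns (interactions, individuals): a sorted list of groups with two or
    more members, and a sorted list of people who interact with no one.
    """
    parent = {p: p for p in people}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for p in people:
        groups.setdefault(find(p), []).append(p)

    units = [sorted(g) for g in groups.values()]
    interactions = sorted(u for u in units if len(u) > 1)
    individuals = sorted(u[0] for u in units if len(u) == 1)
    return interactions, individuals


if __name__ == "__main__":
    # Hypothetical tracks: A, B, C stand in a line 1 unit apart; D is far away.
    tracks = {
        "A": [(0.0, 0.0)] * 4,
        "B": [(1.0, 0.0)] * 4,
        "C": [(2.0, 0.0)] * 4,
        "D": [(10.0, 10.0)] * 4,
    }
    pairs = interacting_pairs(tracks)
    print(behavior_units(tracks.keys(), pairs))
```

In this hypothetical scene A and C are never close enough to form a pair directly, but both pair with B, so the three are merged into one interaction unit while D remains an individual unit. In the architecture the abstract describes, the interaction unit would then go to the ICDHMM and the individual unit to the IDHMM.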