| Graduate Student: | 田靜樺 (Tian, Jing-Hua) |
|---|---|
| Thesis Title: | 基於幸福九大因子之人體動作影像理解 (Human Action Understanding System based on HAPPINESS Factors with Skeleton Data from RGB-D Sensor) |
| Advisor: | 王駿發 (Wang, Jhing-Fa) |
| Degree: | Master |
| Department: | Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2016 |
| Academic Year of Graduation: | 104 (2015/16) |
| Language: | English |
| Pages: | 48 |
| Keywords: | human action understanding, human action recognition, human skeleton, Dynamic Time Warping, template selection |
Unlike previous work on happiness detection, this thesis presents a human action understanding system based on the nine HAPPINESS factors, using human action recognition to interpret happiness. Recognition and understanding of human behavior is an active and challenging research topic in computer vision. With the aid of a depth camera, such as the Microsoft Kinect v2, human pose estimation data are acquired and used to build the HAPPINESS Factors Action Dataset, which contains 27 actions defined by the nine HAPPINESS factors. The proposed system is composed of three modules: posture recognition, action recognition, and HAPPINESS factors classification. First, whole-body and arm-movement features are extracted from the skeleton data, and LIBSVM is adopted for posture recognition. For action recognition, Dynamic Time Warping (DTW) is applied to measure the similarity between two actions. Furthermore, to speed up action matching, a novel template selection method is proposed: within each action class it selects suitable, representative templates, whose individual features may come from different subjects. Finally, the system maps the results of posture and action recognition to the corresponding HAPPINESS factor. Experiments on the proposed HAPPINESS Factors Action Dataset show 97.46% accuracy for posture recognition and 90.38% for action recognition. The proposed method also achieves up to 96.3% action recognition accuracy on the Kinect Activity Recognition Dataset, confirming that the recognition accuracy stays above 90% across both datasets.
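To make the pipeline concrete, the sketch below illustrates the posture-recognition step with scikit-learn, whose SVC classifier wraps LIBSVM internally. The feature layout (30-dimensional skeleton feature vectors) and the five posture classes are assumptions for illustration only, not the thesis's actual design:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical skeleton-derived posture features: e.g., joint angles
# and normalized joint positions flattened into one vector per frame.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 30))    # 200 frames, 30 features (assumed)
y_train = rng.integers(0, 5, size=200)  # 5 posture classes (assumed)

# sklearn's SVC wraps LIBSVM; an RBF kernel is a common default choice.
posture_clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
posture_clf.fit(X_train, y_train)

X_test = rng.normal(size=(10, 30))
print(posture_clf.predict(X_test))      # predicted posture labels
```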
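For the action-recognition step, a minimal textbook DTW with a Euclidean frame-to-frame cost, plus a nearest-template classifier, looks like the sketch below; the thesis's exact local cost and any warping constraints are not reproduced here:

```python
import numpy as np

def dtw_distance(seq_a: np.ndarray, seq_b: np.ndarray) -> float:
    """Accumulated DTW cost between two skeleton feature sequences,
    each of shape (num_frames, num_features)."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # frame cost
            cost[i, j] = d + min(cost[i - 1, j],      # skip a query frame
                                 cost[i, j - 1],      # skip a template frame
                                 cost[i - 1, j - 1])  # match both frames
    return float(cost[n, m])

def classify_action(query, templates):
    """Label the query with the action of its nearest template under DTW.
    `templates` maps an action label to a list of template sequences."""
    return min(templates, key=lambda label: min(
        dtw_distance(query, t) for t in templates[label]))
```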
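The template-selection idea can be read as keeping, per action class, only the most representative sequences so that fewer DTW comparisons are needed at test time. One plausible medoid-style reading, reusing dtw_distance from the sketch above (the thesis's actual selection criterion may differ):

```python
import numpy as np

def select_templates(sequences, labels, k=2):
    """Keep the k sequences per action class with the lowest total DTW
    distance to the rest of their class (a medoid-style criterion; the
    name, signature, and rule here are illustrative assumptions)."""
    templates = {}
    for action in set(labels):
        members = [s for s, l in zip(sequences, labels) if l == action]
        scores = [sum(dtw_distance(a, b) for b in members) for a in members]
        keep = np.argsort(scores)[:k]   # lowest total distance = most central
        templates[action] = [members[i] for i in keep]
    return templates
```

Pruning templates this way trades a one-time quadratic selection cost within each class for a smaller template set, and therefore faster matching, at recognition time.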
Full text available on campus from 2021-07-31.