| Graduate Student: | 林居本 Lin, Chu-Pen |
|---|---|
| Thesis Title: | 平行隱藏式馬可夫模型之表情辨識系統 (Facial Expression Recognition System using Parallel Hidden Markov Models) |
| Advisor: | 郭淑美 Guo, Shu-Mei |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering |
| Year of Publication: | 2006 |
| Graduation Academic Year: | 94 (ROC calendar) |
| Language: | Chinese |
| Number of Pages: | 61 |
| Keywords (Chinese): | motion difference, action unit, fusion, Cohn-Kanade facial expression database, parallel hidden Markov models, human facial expression recognition |
| Keywords (English): | hidden Markov models, facial expression recognition, action unit, fusion, parallel hidden Markov models |
In the field of image processing, facial expression analysis and recognition has long been a challenging research topic. Because facial expression image sequences are temporal in nature, this study adopts hidden Markov models (HMMs), which model temporal behavior, to address the expression recognition problem. From image sequences in the Cohn-Kanade facial expression database, this thesis extracts action units (AUs) and frame-to-frame motion differences as features for expression recognition. Based on these two kinds of features, two independent groups of HMMs are built, and a dedicated fusion strategy is proposed to construct parallel hidden Markov models (PaHMMs), thereby solving the multi-class facial expression recognition problem. Experiments show that this method substantially improves the recognition rate for the three expressions that are harder to distinguish (anger, disgust, and fear), and achieves a recognition rate of 87.5% over the six prototypic expressions (happiness, anger, sadness, surprise, disgust, and fear).
Analysis of human facial expression is a challenging problem with many applications. In this thesis, a novel hidden Markov model (HMM) method is proposed for facial expression recognition in image sequences. Because facial expression recognition is a temporal problem, HMMs can capture the temporal characteristics of an image sequence. We extract action units (AUs) and motion differences from image sequences in the Cohn-Kanade face database as features for facial expression recognition. Based on the AU features and the motion-difference features, two groups of HMMs are established. Unlike traditional facial expression recognition approaches, an architecture of parallel hidden Markov models (PaHMMs) is used. PaHMMs preserve the temporal characteristics of a continuous data sequence, and their fusion strategy is used to solve the multi-class classification problem. Finally, a recognition rate of 87.5% is achieved on the six prototypic emotional expressions.
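The abstract describes training one HMM per expression class for each of two feature streams (AU features and motion differences) and combining the two groups at decision time. The sketch below is a minimal, hypothetical illustration of that pipeline, not the thesis's actual implementation: the use of the `hmmlearn` library, Gaussian emissions, the number of states, and the weighted-sum fusion rule are all assumptions standing in for the unspecified fusion strategy of the PaHMMs.

```python
"""Minimal sketch of a two-stream PaHMM-style classifier (assumptions only)."""
import numpy as np
from hmmlearn import hmm  # assumed third-party HMM library, not used in the thesis

EXPRESSIONS = ["happiness", "anger", "sadness", "surprise", "disgust", "fear"]

def train_hmm_group(sequences_per_class, n_states=4):
    """Train one Gaussian HMM per expression class on one feature stream.

    sequences_per_class: dict mapping class label -> list of (T_i, d) arrays.
    """
    models = {}
    for label, seqs in sequences_per_class.items():
        X = np.vstack(seqs)               # stack frames of all training sequences
        lengths = [len(s) for s in seqs]  # per-sequence frame counts
        model = hmm.GaussianHMM(n_components=n_states,
                                covariance_type="diag", n_iter=50)
        model.fit(X, lengths)             # Baum-Welch re-estimation
        models[label] = model
    return models

def classify_pahmm(au_models, motion_models, au_seq, motion_seq, w=0.5):
    """Score-level fusion (an assumed placeholder for the thesis's strategy):
    weighted sum of per-class log-likelihoods from the two HMM groups."""
    fused = {}
    for label in EXPRESSIONS:
        ll_au = au_models[label].score(au_seq)          # log P(AU sequence | class)
        ll_mo = motion_models[label].score(motion_seq)  # log P(motion sequence | class)
        fused[label] = w * ll_au + (1.0 - w) * ll_mo
    return max(fused, key=fused.get)
```

Here `au_seq` and `motion_seq` would be (frames x feature-dimension) arrays extracted from one test image sequence; the fusion weight `w` is an illustrative parameter, not a value reported in the thesis.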