| Student: | 蔡翔宇 Tsai, Hsiang-Yu |
|---|---|
| Thesis Title: | 預測並重建臉部遮蔽表情單元之情緒辨識 Facial Action Units Prediction and Reconstruction for Emotion Recognition under Partial Occlusion |
| Advisor: | 吳宗憲 Wu, Chung-Hsien |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of Publication: | 2013 |
| Graduating Academic Year: | 101 (2012-2013) |
| Language: | Chinese |
| Number of Pages: | 42 |
| Chinese Keywords: | 人臉重建 (face reconstruction), 遮蔽效應 (occlusive effect) |
| English Keywords: | Face Reconstruction, Occlusive Effect |
In recent years, applications of intelligent robots have attracted wide attention. To give computers the ability to interact with people in a human-like way, one important task is enabling them to understand human emotions. Emotions can be expressed in many ways, and the most obvious and direct of these is through changes in facial expression. Previous research on facial emotion recognition has mostly dealt with unobstructed faces, i.e., recognizing the emotional state conveyed by facial expressions when the face is not occluded. In daily life, however, facial regions may be covered by hand gestures or worn accessories while a person is speaking, which lowers the accuracy of facial emotion recognition. To improve the practical value and correctness of recognition, the occlusion issue should therefore be taken into account.
To overcome the effect of occlusion on facial emotion recognition, this thesis proposes an effective mechanism for detecting and reconstructing occluded facial regions. First, facial features are extracted with an Active Appearance Model (AAM), and each facial region is then examined to determine whether it is occluded. Next, an Error Weighted Cross-Correlation Model (EWCCM) automatically predicts the facial Action Unit (AU) of the occluded region by modeling the features of the non-occluded facial regions. The approach models each non-occluded facial region with a Gaussian Mixture Model (GMM), explores the cross-correlation between every pair of non-occluded facial regions, and integrates the resulting classifiers through a Bayesian classifier weighting scheme to raise AU prediction accuracy. After the AU of the occluded region has been predicted, a regression-model fusion technique combines the regression models built from the non-occluded regions to reconstruct the original feature-point coordinates (i.e., the exhibited action unit) of the occluded region. Finally, the features before and after reconstruction are each combined with the remaining facial features for facial emotion recognition, and their accuracies are compared.
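To make the prediction step concrete, the sketch below gives one plausible reading of the EWCCM idea in Python: each pair of non-occluded regions is modeled by per-AU GMMs over concatenated features (capturing their cross-correlation), and the pairwise log-likelihoods are fused with reliability weights in a Bayesian-style vote. The region dictionary, the `weights` table, and the overall API are illustrative assumptions, not the thesis's actual implementation.

```python
# A minimal sketch of EWCCM-style AU prediction under the assumptions
# stated above; region names, weights, and the API are hypothetical.
import numpy as np
from itertools import combinations
from sklearn.mixture import GaussianMixture

def train_pair_models(features, labels, au_values, n_mix=2):
    """Fit one GMM per (region pair, AU value) on concatenated features.

    features: dict mapping region name -> (N, d) array of AAM features
    labels:   (N,) array of AU labels for the occluded (target) region
    """
    models = {}
    for r1, r2 in combinations(sorted(features), 2):
        joint = np.hstack([features[r1], features[r2]])  # pairwise co-occurrence
        for au in au_values:
            subset = joint[labels == au]
            models[(r1, r2, au)] = GaussianMixture(n_components=n_mix).fit(subset)
    return models

def predict_au(models, sample, weights, au_values):
    """Error-weighted, Bayesian-style fusion of the pairwise GMM scores."""
    scores = {au: 0.0 for au in au_values}
    for (r1, r2, au), gmm in models.items():
        x = np.hstack([sample[r1], sample[r2]])[None, :]
        # each pair votes with its log-likelihood, scaled by a reliability
        # weight (e.g., estimated from held-out prediction error)
        scores[au] += weights.get((r1, r2), 1.0) * gmm.score(x)
    return max(scores, key=scores.get)
```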
For the experiments, 176 samples covering five mouth-related AUs were selected from the Extended Cohn-Kanade (CK+) facial expression database. The leave-one-out cross-validation method was used to evaluate the proposed system. Experimental results show that the occluded-region detection rates in the first stage all reach 97% or above, providing a stable basis for the subsequent experiments. The second stage infers the AU of the occluded region; three models were compared: the traditional GMM, the proposed GMM-based CCM, and the EWCCM. The EWCCM achieved the highest average AU prediction rate, 83.53%, confirming that the proposed model can effectively infer the AU of an occluded region. In the third stage, the regression-model fusion technique reconstructs the feature coordinates of the occluded region with 1.4 to 3.4 times less error than the conventional method. Finally, performing emotion recognition on the reconstructed facial features improves the recognition rate by 13.07% over leaving the occluded region untreated, with rates of 61.93% and 74.43%, respectively. The experiments confirm that the proposed occlusion detection and reconstruction mechanism effectively reduces the impact of occlusion on facial expression recognition and thereby raises the recognition rate.
In recent years, with the development of computer technology, intelligent robots have attracted considerable attention across many application fields. Creating an intelligent human-computer interface for harmonious interaction between robots and humans has therefore become an important issue. To give computers a better ability to interact with humans, emotion recognition is a critical topic. There are many ways to express emotion; facial expression is one of the cues most directly related to human emotion. Most previous facial expression recognition studies use databases of pure, unoccluded expressions to recognize the user's emotional state, without considering the effect of partial facial occlusion. In real communication, however, the human face is often occluded by accessories or hand gestures while emotions are being expressed, and this occlusion decreases the accuracy of facial expression recognition. To increase the system's value in real-life applications, the occlusive effect on facial expression should be considered.
To overcome the impact of the occlusive effect on facial expression recognition accuracy, this thesis presents an effective mechanism to detect occluded regions and then reconstruct them. First, an Active Appearance Model (AAM) is used to localize facial feature points, from which facial region information is derived for occluded-region detection. Next, we present an approach that automatically predicts the Action Unit (AU) under partial facial occlusion using the proposed Error Weighted Cross-Correlation Model (EWCCM), which aims to provide correct facial information in the occluded region for later facial expression recognition. In the EWCCM, a Gaussian Mixture Model (GMM)-based Cross-Correlation Model (CCM) is first proposed that not only models facial feature variations but also explores the co-occurrence probabilities among facial features for AU prediction. An error-weighted classifier (EWC) scheme, a Bayesian classifier weighting approach, is then adopted to integrate the GMM-based CCMs (forming the EWCCM) and enhance AU prediction accuracy. After facial AU prediction, a multiple regression fusion strategy effectively combines the regression models of the various facial regions to reconstruct the AU in the occluded region. Finally, the reconstructed facial feature points are combined with the non-occluded facial feature points for facial expression recognition.
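The reconstruction step lends itself to a simple sketch: one regressor per non-occluded region predicts the occluded region's feature-point coordinates, and the per-region predictions are fused by a weighted average. The linear form of the regressors and the normalized weights below are assumptions for illustration; the thesis's actual regression models and fusion rule may differ.

```python
# A minimal sketch of regression-model fusion for reconstructing occluded
# feature points; the linear regressors and weight table are assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_region_regressors(visible_feats, occluded_coords):
    """One regressor per visible region, each mapping that region's
    features to the occluded region's feature-point coordinates.

    visible_feats:   dict region -> (N, d) array
    occluded_coords: (N, k) array of ground-truth coordinates
    """
    return {r: LinearRegression().fit(X, occluded_coords)
            for r, X in visible_feats.items()}

def fuse_reconstruction(regressors, sample, weights):
    """Weighted average of the per-region coordinate predictions."""
    preds = {r: reg.predict(sample[r][None, :])[0]
             for r, reg in regressors.items()}
    w = np.array([weights.get(r, 1.0) for r in preds])
    w = w / w.sum()  # normalize so the fused estimate stays in range
    return sum(wi * preds[r] for wi, r in zip(w, preds))
```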
In the experiments, five AUs related to the lower facial region are considered, with 176 samples in total chosen from the CK+ facial expression database. For performance evaluation, the leave-one-out cross-validation method is used. Experimental results show that the occluded-region detection rate reaches about 97% in the first step, providing a stable basis for the subsequent experimental steps. In the second step, the AU in the detected occluded region is predicted. Three methods are compared: the traditional GMM, the proposed GMM-based CCM, and the EWCCM. The proposed EWCCM achieves the best average prediction accuracy, 83.53%, demonstrating that it provides a better ability to predict facial AUs under partial facial occlusion. For facial feature point reconstruction, the regression fusion strategy is adopted and yields 1.4 to 3.4 times less reconstruction error than the traditional regression model. Finally, for facial expression recognition, the proposed mechanism improves the recognition rate by 13.07% compared with leaving the occlusive effect untreated, with rates of 61.93% and 74.43%, respectively. Based on these analyses, the proposed mechanism is shown to be useful for dealing with the occlusive effect in facial expression recognition.
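As a usage note, the leave-one-out protocol mentioned above can be expressed in a few lines; `train` and `classify` below are hypothetical placeholders for whichever model (GMM, GMM-based CCM, or EWCCM) is being evaluated, not names from the thesis.

```python
# A minimal leave-one-out evaluation loop; `train` and `classify` are
# hypothetical callables standing in for the compared models.
import numpy as np
from sklearn.model_selection import LeaveOneOut

def loocv_accuracy(X, y, train, classify):
    hits = 0
    for tr, te in LeaveOneOut().split(X):
        model = train(X[tr], y[tr])                         # fit on all but one
        hits += int(classify(model, X[te][0]) == y[te][0])  # test the held-out
    return hits / len(y)
```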