| Student: | 范維康 Fan, Wei-Kang |
|---|---|
| Thesis title: | 應用多區域方向梯度直方圖之鑑別性笑臉線索分類於多人笑臉強度偵測 Multiple People Smile Intensity Estimation Using Multi-Region Histogram of Oriented Gradients with Discriminative Classification of Smiling Face Clues |
| Advisor: | 王駿發 Wang, Jhing-Fa |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2012 |
| Academic year of graduation: | 100 |
| Language: | English |
| Pages: | 48 |
| Chinese keywords: | smile detection, facial expression recognition, histogram of oriented gradients, real-time system, support vector machine, active shape model |
| Foreign keywords: | Smiling Face Clues, Smiling Face Intensity Estimation, Facial Expression Recognition, Multi-Region Histogram of Oriented Gradients, Active Shape Model |
With advances in technology, working with computers on all kinds of tasks has become commonplace. Working long hours in tense, stressful conditions, people often accumulate mental strain without noticing it, and many diseases of modern life arise from this poor mental state. Many instruments and products on the market are designed to monitor users' physical health, but few products attend to mental well-being. People naturally show smiles when they are happy, and this is precisely the property this thesis exploits: by detecting and recording the appearance of smiling faces, the system estimates the user's accumulated stress and, when it grows too high, reminds the user to relieve it in time, thereby caring for mental health.
Smiling-face detection is a subtask of facial expression recognition, itself an important area of image processing. According to how features are extracted, approaches divide mainly into geometric-feature and texture-feature methods. To detect and record the user's smiles without disturbing the user's work, the system strikes a balance between accuracy and required computing power. The proposed novel feature-extraction method combines the advantages of both geometric and texture features as the main basis for classification. First, an Active Shape Model (ASM) locates the regions bearing discriminative features; a histogram of oriented gradients is then extracted from and refined for each region, and the per-region histograms are combined into the Multi-Region Histogram of Oriented Gradients (MRHOG) used as the main feature. This feature uses gradient distribution together with position information to represent the shapes of the facial features and the lines of the muscles, the same cues humans rely on to judge expressions. The facial shapes and muscle lines that appear when smiling are analysed as smiling-face clues, and for each clue a corresponding Support Vector Machine (SVM) is trained to examine input face images for that clue. Finally, the detected discriminative smiling-face clues are collected and tallied to detect and estimate smiling-face intensity. Tested on the JAFFE and FERET databases, the proposed system achieves at least 80% accuracy even on the hardest case, a faint smile, and better results on expressions with clearer characteristics. Moreover, the proposed feature extraction requires neither a precise face model obtained after many iterations nor a large bank of filters convolved with the image, so its computational load and feature dimensionality are comparatively low, enabling real-time detection at a normal computer-use distance.
With the advancement of technology, it is common for people to work with personal computers. However, working long hours under pressure, people pile up stress in a tense state without being aware of it. Many products and devices have been developed to monitor physical health, but far fewer address mental health. It is natural for people to convey their happiness with smiling faces. Based on this human nature, the thesis proposes a system that cares for users' mental health: by detecting and recording the appearance of smiling faces, it can remind people to soothe a tense mental state. Smiling-face detection can be categorized under facial expression recognition (FER), a well-studied problem in image and video processing. According to the feature-extraction manner, FER systems can be divided into geometric-based, appearance-based, and hybrid approaches, each with its own advantages and disadvantages. To detect smiling faces without disturbing the user, the system makes a trade-off between accuracy and computational power. The thesis proposes a new hybrid feature called the multi-region histogram of oriented gradients (MRHOG), which adopts the active shape model (ASM) for region-extraction preprocessing and uses shape characteristics for smiling-face detection. The MRHOG represents both the orientation histogram and the spatial information of the facial-feature shapes, which are also the major clues humans use to recognize expressions. Based on the characteristics of the MRHOG, a discriminative classification of smiling-face clues is designed for smiling-face intensity estimation: support vector machines (SVMs) are trained to detect the smiling-face clues in the input image, and by integrating the detected discriminative clues, the smiling-face intensity is estimated.
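The MRHOG idea described above can be sketched as follows: compute an orientation histogram of gradients inside each facial region (the thesis locates these regions with an ASM; here the region boxes are simply given by hand), then concatenate the per-region histograms into one feature vector. This is a minimal illustration, not the thesis's exact implementation; the bin count, normalisation, and region coordinates are assumptions.

```python
import numpy as np

def region_hog(gray, top, left, h, w, n_bins=9):
    """Orientation histogram of gradients for one rectangular facial region.

    gray: 2-D array of pixel intensities; (top, left, h, w) is the region box.
    Returns an L2-normalised histogram of unsigned gradient orientations
    (0-180 degrees), weighted by gradient magnitude.
    """
    patch = gray[top:top + h, left:left + w].astype(float)
    gy, gx = np.gradient(patch)                    # central-difference gradients
    mag = np.hypot(gx, gy)                         # gradient magnitude
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 180), weights=mag)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def multi_region_hog(gray, regions, n_bins=9):
    """Concatenate per-region histograms into one MRHOG-style feature vector."""
    return np.concatenate([region_hog(gray, *r, n_bins=n_bins) for r in regions])

# Toy image and two made-up boxes standing in for ASM-located mouth/eye regions.
img = np.tile(np.arange(32), (32, 1))              # horizontal intensity ramp
feat = multi_region_hog(img, [(0, 0, 16, 16), (16, 16, 16, 16)])
```

With two regions and nine bins, the feature vector has 18 dimensions; this concatenation is what lets the feature carry spatial information (which region a gradient pattern came from) on top of the plain orientation histogram.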
The experiments were held on the JAFFE and FERET databases; even the worst case, a faint smile, achieves an accuracy rate of 80%, and the accuracy is higher for clearer smiling faces. Moreover, the method neither convolves the image with filter banks nor needs a large number of iterations to obtain a precise face model, which keeps the feature vector small and the computational cost low enough to achieve a real-time system.
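The clue-integration stage can be sketched as a weighted vote: each trained classifier reports whether its smiling-face clue is present, and the intensity estimate is the weighted share of clues detected. This is a minimal sketch under assumptions; the clue names, their weights, and the use of a simple weighted fraction are illustrative, not the thesis's actual values.

```python
def estimate_smile_intensity(clue_scores, weights):
    """Combine per-clue SVM decisions into one intensity value.

    clue_scores: dict clue_name -> classifier decision value (> 0 = detected).
    weights:     dict clue_name -> discriminative weight of that clue.
    Returns an intensity in [0, 1]: the weighted fraction of detected clues.
    """
    total = sum(weights.values())
    detected = sum(w for name, w in weights.items()
                   if clue_scores.get(name, 0.0) > 0.0)
    return detected / total if total else 0.0

# Hypothetical clues with illustrative weights.
weights = {"mouth_corners_up": 0.5, "teeth_visible": 0.3, "eye_narrowing": 0.2}

# A faint smile trips only the strongest clue; a broad smile trips all three.
faint = estimate_smile_intensity({"mouth_corners_up": 0.4}, weights)
broad = estimate_smile_intensity(
    {"mouth_corners_up": 1.2, "teeth_visible": 0.9, "eye_narrowing": 0.3},
    weights)
```

Weighting the clues captures the "discriminative" aspect: a clue that separates smiles from neutral faces well contributes more to the final intensity than a weak one.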