
Graduate Student: Hsu, Liu-YiCheng
Thesis Title: Understanding System of Efficient Multi-person and Multi-angle Face Recognition and Emotion Recognition in Real-Time
Advisor: Wang, Jhing-Fa
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2018
Academic Year of Graduation: 106
Language: English
Pages: 60
Keywords: Multi-person and Multi-angle Face Recognition, Multiple Facial Expression Recognition, Deep Convolutional Neural Network, Identity Tracking Mechanism

    In recent years, the development and application of robots has become a new trend, and personification is essential to making robots engaging. Practical applications of computer vision on robots, such as face recognition and emotion recognition, can improve the interactive experience between robot and user. This thesis uses a webcam to capture images as the input to the visual system. A high-performance face detection neural network extracts the facial RGB image and facial landmarks, and the landmarks are then used to align the face. The system is divided into two parts. The first part is multi-person, multi-angle face recognition: facial RGB color images are used for facial feature detection and identity recognition. By training a complete feature detection network, valid and distinct facial features can be detected, and a classifier is trained on those features. In a typical household or public space, the number of people can range from one to thousands; unlike previous approaches, this identity network adopts a different learning mechanism so that the classifier can be trained effectively and quickly. In addition, to reduce computation time and improve the multi-face recognition rate, this thesis proposes a face tracking mechanism that automatically accumulates the recognition results for each tracked target; once a stable result is obtained, the identification network is no longer run for that target, reducing the amount of computation. The second part is a multi-person emotion recognition system. To improve the accuracy of emotion recognition, this thesis proposes a hybrid emotion recognition network: taking facial landmarks and the facial image as input, the hybrid neural network is trained to convergence and outputs five common household emotions: neutral, happy, surprise, sad, and angry. Experimental results show that identity recognition reaches an accuracy of up to 90.61% and emotion recognition reaches 86.14%. In practical applications, the system can recognize the emotions and identities of up to thousands of people simultaneously.
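The identity tracking mechanism summarized above (accumulate per-target recognition results, then stop invoking the recognition network once an identity is confirmed) can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the class name `IdentityTracker`, the vote threshold, and the simple majority-vote scheme are all assumptions introduced here.

```python
from collections import Counter

class IdentityTracker:
    """Illustrative sketch: accumulate per-frame recognition votes for one
    tracked face; once an identity reaches the vote threshold, lock it in
    so the (expensive) recognition network need not run again."""

    def __init__(self, vote_threshold=5):
        self.vote_threshold = vote_threshold
        self.votes = Counter()          # identity label -> vote count
        self.locked_identity = None     # set once the result is confirmed

    @property
    def needs_recognition(self):
        # While no identity is confirmed, the recognition network must run.
        return self.locked_identity is None

    def add_result(self, identity):
        """Record one frame's recognition result; return the confirmed
        identity, or None if still undecided."""
        if self.locked_identity is not None:
            return self.locked_identity
        self.votes[identity] += 1
        if self.votes[identity] >= self.vote_threshold:
            self.locked_identity = identity
        return self.locked_identity
```

In use, the per-frame loop would call `add_result` only while `needs_recognition` is true, which matches the abstract's claim that the identification network is switched off after the target is confirmed.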

    Contents:
    Chinese Abstract
    Abstract
    Contents
    Table List
    Figure List
    1. Introduction
      1.1 Background
      1.2 Motivation
      1.3 Thesis Objective
      1.4 Thesis Organization
    2. Related Works
      2.1 The Survey of Deep Convolutional Neural Network
      2.2 The Survey of Face Recognition
      2.3 The Survey of Emotion Recognition
    3. Multi-person and Multi-angle Face Recognition Based on Deep Convolutional Neural Network
      3.1 System Overview
      3.2 Multi-Face Detection Based on Multi-task Convolutional Neural Network
      3.3 Face Alignment Based on Facial Landmarks
      3.4 Multi-person and Multi-angle Face Recognition Based on Face Image
      3.5 Identity Tracking Mechanism
    4. Multi-Face Emotion Recognition Based on Hybrid Facial Points and Deep Dilated Convolutional Neural Network
      4.1 System Overview
      4.2 Feature Extraction Based on Deep Dilated Convolutional Neural Network
      4.3 Facial Landmark Network
      4.4 Training Phase
    5. Experimental Results
      5.1 Emotion Recognition Result
      5.2 Identity Recognition Result
      5.3 Identity Tracking Mechanism Result
    6. Conclusion and Future Works
    7. References


    Full-text availability: on-campus access from 2023-08-31; off-campus access not authorized. The electronic thesis has not been released for public access; please consult the library catalog for the print copy.