Author: Lu, Yi-Wen (陸藝文)
Thesis title: Implementation of Facial Recognition System based on Convolutional Neural Networks (基於卷積神經網路實現人臉辨識系統)
Advisor: Liao, Teh-Lu (廖德祿)
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of publication: 2018
Academic year of graduation: 106 (ROC academic year)
Language: English
Number of pages: 74
Keywords: Convolutional Neural Networks, Face Recognition System, Deep Learning (卷積神經網路, 人臉辨識系統, 深度學習)
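  • In the past, face recognition systems were usually deployed in places requiring a higher security level, such as government departments and criminal investigation units. In recent years, however, face recognition technology has come into wide use in everyday life, for example in face unlock for smartphone screens and face-based online payment. For these applications, recognition accuracy is the key factor, so this thesis focuses on improving face recognition accuracy by using a convolutional neural network to learn a powerful feature vector and then using a classifier to perform classification. The proposed face recognition system has two stages. The first stage is face detection, implemented with Multi-Task Cascaded Convolutional Networks [1]. This method uses three convolutional neural networks to predict the regions of an image that may contain faces, and generates face bounding boxes from those regions. Non-maximum suppression then eliminates bounding boxes with a high overlap ratio. Finally, the Euclidean distance between the predicted and ground-truth bounding boxes is computed to perform bounding box regression. The face bounding boxes obtained from the first stage are resized and then standardized so that different features share the same scale, which makes training the convolutional neural network more efficient. The second stage is face recognition, which uses supervised learning: an Inception-V4 [2] model learns 128-dimensional feature vectors and is trained with Softmax loss as the objective function. Finally, a linear support vector classifier is trained to classify these feature vectors. On LFW, the widely accepted benchmark dataset for face recognition, this work achieves a recognition accuracy of 97.83%. At false accept rates of 0.1 and 0.001, the validation rates are 99.13% and 89.10%, respectively.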
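The validation rate at a fixed false accept rate reported above can be illustrated with a small sketch. The record does not state the distance metric or the exact evaluation protocol, so the code below assumes a common convention: each face pair is reduced to a single distance, a pair is accepted when its distance is at or below a threshold, and the threshold is the largest one whose false accept rate stays within the target. The function name `val_at_far` and the toy distances are hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple


def val_at_far(same_dists: List[float],
               diff_dists: List[float],
               far_target: float) -> Tuple[float, float]:
    """Return (validation rate, threshold) at the largest threshold whose
    false accept rate does not exceed far_target.

    same_dists: distances between pairs showing the same person.
    diff_dists: distances between pairs showing different people.
    Assumption: a pair is accepted when distance <= threshold.
    """
    best_val, best_thr = 0.0, 0.0
    # Every observed distance is a candidate threshold, tried in ascending order.
    for thr in sorted(set(same_dists) | set(diff_dists)):
        far = sum(d <= thr for d in diff_dists) / len(diff_dists)
        if far > far_target:
            break  # FAR never decreases as the threshold grows, so stop here
        val = sum(d <= thr for d in same_dists) / len(same_dists)
        best_val, best_thr = val, thr
    return best_val, best_thr


# Toy distances, for illustration only (not thesis data).
same_dists = [0.2, 0.4, 0.5, 0.9]
diff_dists = [0.6, 1.0, 1.2, 1.4, 1.1]

print(val_at_far(same_dists, diff_dists, far_target=0.0))
print(val_at_far(same_dists, diff_dists, far_target=0.2))
```

A stricter false accept target forces a smaller threshold, which rejects more genuine pairs. This is consistent with the lower validation rate the abstract reports at a false accept rate of 0.001 than at 0.1.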

    In the past, face recognition systems were typically deployed in places that require a higher security level, such as government agencies and military centers. In recent years, face recognition technology has become widely used in real-world applications. For example, some smartphone companies offer online payment and screen unlocking through face recognition. For these new applications, face recognition accuracy is the key factor. Therefore, this thesis focuses on improving the accuracy of face recognition by using convolutional neural networks to learn a powerful feature vector. The proposed system is divided into two stages. The first stage is face detection. This stage uses the Multi-Task Cascaded Convolutional Networks (MTCNN) [1] method to train the face detector. The algorithm uses convolutional neural networks to predict the regions of an image that might contain faces and then generates bounding boxes from these regions. Next, the non-maximum suppression algorithm is applied to eliminate highly overlapping boxes. Finally, the Euclidean distance between the predicted bounding box and the ground-truth bounding box is calculated to perform bounding box regression. The second stage is face recognition. We apply supervised learning and use the Inception-V4 [2] model to learn 128-dimensional feature vectors. A linear support vector classifier is then trained for the classification task. We obtain a 97.83% recognition rate on the face recognition benchmark Labeled Faces in the Wild (LFW). Additionally, we achieve a 99.13% validation rate (VAL) when the false accept rate is restricted to 0.1. However, the validation rate drops to 89.10% when the false accept rate is restricted to 0.001.
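The non-maximum suppression step mentioned in the abstract can be sketched as follows. This is a minimal, generic greedy version in plain Python, not the thesis's implementation: the `(x1, y1, x2, y2)` box layout, the function names, and the 0.5 overlap threshold are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: Sequence[Box],
                        scores: Sequence[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box by more than the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping candidates and one separate candidate (toy values).
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

In the toy example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap either and is kept.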

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    CHAPTER 1 INTRODUCTION
        1.1 Background
        1.2 Motivation
        1.3 Dissertation Organization
    CHAPTER 2 FUNDAMENTAL KNOWLEDGE
        2.1 Artificial Neural Networks
        2.2 Convolutional Neural Networks
            2.2.1 Convolution Layer
            2.2.2 Pooling Layer
        2.3 Activation Functions
        2.4 Backpropagation
        2.5 Gradient Descent
            2.5.1 Momentum
            2.5.2 Adagrad
            2.5.3 RMSProp
        2.6 Overfitting
            2.6.1 Early Stopping
            2.6.2 Regularization
            2.6.3 Dropout
    CHAPTER 3 METHODOLOGY
        3.1 System Architecture
        3.2 Face Detection Algorithm
            3.2.1 Convolutional Neural Networks Architectures
            3.2.2 Multi-task Training Strategy
            3.2.3 The Process of MTCNN algorithm
        3.3 Face Recognition Algorithm
            3.3.1 Deep Convolutional Neural Networks Architecture
            3.3.2 Loss Function
            3.3.3 Deep Feature Learning
            3.3.4 Linear Support Vector Classifier
    CHAPTER 4 EXPERIMENTS AND RESULTS
        4.1 Implementation Details
            4.1.1 Data Preprocessing
            4.1.2 Hardware
        4.2 Model Performance Evaluation
            4.2.1 LFW Dataset Evaluation
            4.2.2 Experiments and Results
            4.2.3 Performance Comparison
        4.3 Training Linear Support Vector Classifier
            4.3.1 Performance on Linear Support Vector Classifier
        4.4 Graphic User Interface
            4.4.1 Introduction to Qt Designer
        4.5 Implementation on Embedded System
            4.5.1 Introduction of Jetson-Tx1
            4.5.2 System on Jetson-Tx1
    CHAPTER 5 CONCLUSION AND FUTURE WORK
        5.1 Conclusion
        5.2 Future Work
    REFERENCE

    [1] Zhang, Kaipeng, et al. "Joint face detection and alignment using multitask cascaded convolutional networks." IEEE Signal Processing Letters 23.10 (2016):
    [2] Szegedy, Christian, et al. "Inception-v4, inception-resnet and the impact of residual connections on learning." AAAI. Vol. 4. 2017.
    [3] Bubeck, Dina Sanchez Uwe, and Dina Sanchez. "Biometric Authentication." Universidade Estadual de San Diego (2003).
    [4] De Luis-García, Rodrigo, et al. "Biometric identification systems." Signal Processing 83.12 (2003): 2539-2557.
    [5] Lawrence, Jeannette. Introduction to neural networks: design, theory, and applications. Nevada City, CA: California Scientific Software, 1994.
    [6] Graves, Alex, and Jürgen Schmidhuber. "Offline handwriting recognition with multidimensional recurrent neural networks." Advances in neural information processing systems. 2009.
    [7] Maas, Andrew L., Awni Y. Hannun, and Andrew Y. Ng. "Rectifier nonlinearities improve neural network acoustic models." Proc. ICML. Vol. 30. No. 1. 2013.
    [8] He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on imagenet classification." Proceedings of the IEEE international conference on computer vision. 2015.
    [9] Ruder, Sebastian. "An overview of gradient descent optimization algorithms." arXiv preprint arXiv:1609.04747 (2016).
    [10] Hinton, Geoffrey, Nitish Srivastava, and Kevin Swersky. "Neural networks for machine learning lecture 6a overview of mini-batch gradient descent." Cited on (2012): 14.
    [11] Christian, Brian, and Tom Griffiths. Algorithms to live by: The computer science of human decisions. Macmillan, 2016.
    [12] Mahsereci, Maren, et al. "Early stopping without a validation set." arXiv preprint arXiv:1703.09580 (2017).
    [13] Srivastava, Nitish, et al. "Dropout: A simple way to prevent neural networks from overfitting." The Journal of Machine Learning Research 15.1 (2014): 1929-1958.
    [14] Viola, Paul, and Michael J. Jones. "Robust real-time face detection." International journal of computer vision 57.2 (2004): 137-154.
    [15] Freund, Yoav, and Robert E. Schapire. "Experiments with a new boosting algorithm." ICML. Vol. 96. 1996.
    [16] Yang, Bin, et al. "Aggregate channel features for multi-view face detection." Biometrics (IJCB), 2014 IEEE International Joint Conference on. IEEE, 2014.
    [17] Pham, Minh-Tri, et al. "Fast polygonal integration and its application in extending haar-like features to improve object detection." Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010.
    [18] Zhu, Qiang, et al. "Fast human detection using a cascade of histograms of oriented gradients." Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. Vol. 2. IEEE, 2006.
    [19] Mathias, Markus, et al. "Face detection without bells and whistles." European Conference on Computer Vision. Springer, Cham, 2014.
    [20] Yan, Junjie, et al. "The fastest deformable part model for object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
    [21] Zhu, Xiangxin, and Deva Ramanan. "Face detection, pose estimation, and landmark localization in the wild." Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
    [22] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
    [23] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
    [24] Szegedy, Christian, et al. "Going deeper with convolutions." CVPR. 2015.
    [25] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
    [26] Baccouche, Moez, et al. "Sequential deep learning for human action recognition." International Workshop on Human Behavior Understanding. Springer, Berlin, Heidelberg, 2011.
    [27] Wang, Limin, Yu Qiao, and Xiaoou Tang. "Action recognition with trajectory-pooled deep-convolutional descriptors." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
    [28] Ji, Shuiwang, et al. "3D convolutional neural networks for human action recognition." IEEE transactions on pattern analysis and machine intelligence 35.1 (2013): 221-231.
    [29] Sun, Yi, Xiaogang Wang, and Xiaoou Tang. "Hybrid deep learning for face verification." Computer Vision (ICCV), 2013 IEEE International Conference on. IEEE, 2013.
    [30] Taigman, Yaniv, et al. "Deepface: Closing the gap to human-level performance in face verification." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
    [31] Sun, Yi, et al. "Deep learning face representation by joint identification-verification." Advances in neural information processing systems. 2014.
    [32] Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "Facenet: A unified embedding for face recognition and clustering." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
    [33] Wen, Yandong, Zhifeng Li, and Yu Qiao. "Latent factor guided convolutional neural networks for age-invariant face recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
    [34] Parkhi, Omkar M., Andrea Vedaldi, and Andrew Zisserman. "Deep face recognition." Proceedings of the British Machine Vision Conference 1.3 (2015): 6.
    [35] Fukunaga, Keinosuke, and Patrenahalli M. Narendra. "A branch and bound algorithm for computing k-nearest neighbors." IEEE transactions on computers 100.7 (1975): 750-753.
    [36] Suykens, Johan AK, and Joos Vandewalle. "Least squares support vector machine classifiers." Neural processing letters 9.3 (1999): 293-300.
    [37] Moore, Robert, and John DeNero. "L1 and L2 regularization for multiclass hinge loss models." Symposium on Machine Learning in Speech and Language Processing. 2011.
    [38] Yi, Dong, et al. "Learning face representation from scratch." arXiv preprint arXiv:1411.7923 (2014).
    [39] Huang, Gary B., and Erik Learned-Miller. "Labeled faces in the wild: Updates and new reporting procedures." Dept. Comput. Sci., Univ. Massachusetts Amherst, Amherst, MA, USA, Tech. Rep (2014): 14-003.
    [40] Chen, Dong, et al. "Bayesian face revisited: A joint formulation." European Conference on Computer Vision. Springer, Berlin, Heidelberg, 2012.
    [41] Berg, Thomas, and Peter N. Belhumeur. "Tom-Vs-Pete Classifiers and Identity-Preserving Alignment for Face Verification." BMVC. Vol. 2. 2012.
    [42] Chen, Dong, et al. "Blessing of dimensionality: High-dimensional feature and its efficient compression for face verification." Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on. IEEE, 2013.
    [43] Cao, Xudong, et al. "A practical transfer learning algorithm for face verification." Computer Vision (ICCV), 2013 IEEE International Conference on. IEEE, 2013.
    [44] Liu, Jingtuo, et al. "Targeting ultimate accuracy: Face recognition via deep embedding." arXiv preprint arXiv:1506.07310 (2015).

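    On campus: available from 2023-07-10
    Off campus: not available
    The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.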