
Graduate Student: Liao, Wei-Hsiang
Thesis Title: A Deep Learning Model Application for Recognizing Faces Wearing Sunglasses
Advisor: Wang, Tzone-I
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of Publication: 2023
Graduation Academic Year: 111
Language: Chinese
Number of Pages: 51
Keywords: Face Recognition, Identity Verification, Deep Learning, Neural Networks

    Face recognition technology is widely used today, for example in access control for residential buildings, gym entrances, and facial unlocking on smartphones. Current face recognition technology can quickly and accurately verify users' identities, enhancing the security of access control systems while protecting users' privacy and safety. In practical face recognition systems, traditional techniques can already identify users wearing ordinary eyeglasses effectively. However, they often fail when users wear sunglasses. To address this problem, this study proposes a new training method: a self-developed augmentation tool extends existing face datasets by masking the area around the eyes or synthesizing sunglasses onto the faces, and a deep learning model is trained on the augmented data. This approach achieves very high accuracy even when recognizing faces wearing sunglasses, without the need to capture or produce new face data of sunglass wearers, and without retraining the deep learning model to recognize previously unseen faces.
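The eye-region masking performed by the augmentation tool can be illustrated with a minimal sketch. The function below is an illustration under stated assumptions, not the thesis's actual tool: it assumes eye landmark coordinates are already available (in practice they would come from a landmark detector such as dlib or MTCNN) and simply blacks out a padded bounding box around them to emulate sunglasses-style occlusion.

```python
import numpy as np

def mask_eye_region(image, eye_points, pad=10):
    """Black out a padded bounding box around the given eye landmarks.

    image      -- H x W x 3 uint8 array
    eye_points -- iterable of (x, y) landmark coordinates covering both eyes
    pad        -- margin in pixels added around the landmark bounding box
    """
    pts = np.asarray(eye_points)
    x0 = max(int(pts[:, 0].min()) - pad, 0)
    x1 = min(int(pts[:, 0].max()) + pad, image.shape[1])
    y0 = max(int(pts[:, 1].min()) - pad, 0)
    y1 = min(int(pts[:, 1].max()) + pad, image.shape[0])
    out = image.copy()
    out[y0:y1, x0:x1] = 0  # occlude the eye region, emulating sunglasses
    return out
```

A real augmentation pipeline would instead composite a sunglasses image along the landmark geometry, but the masking variant above conveys the same idea: the network is denied eye-region features during training.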
    In this study, the VGGFace2 face dataset was augmented with the self-developed tool to produce face images whose eye regions are masked. These images were used to train a FaceNet deep learning model, so that it learns how the image changes caused by masking the features around the eyes affect face recognition. After training, the model was tested on the LFW (Labeled Faces in the Wild) dataset, onto which the augmentation tool had synthesized sunglasses at scale; the tests measured face recognition accuracy both with and without sunglasses.
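FaceNet learns an embedding space in which faces of the same identity lie close together, optimized with a triplet loss. As a rough illustration of that objective, a minimal numpy version might look like this; it is a sketch of the standard loss, not the training code used in the thesis:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """FaceNet-style triplet loss on embedding vectors.

    Pulls the anchor toward the positive (same identity, e.g. the same
    face with and without synthesized sunglasses) and pushes it away
    from the negative (a different identity) by at least `margin`.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # squared distance to same identity
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # squared distance to other identity
    return np.maximum(d_pos - d_neg + margin, 0.0)     # hinge: zero once well separated
```

Training on masked/sunglassed variants as positives is what lets the embedding become invariant to eye-region occlusion.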
    The experimental results show that the proposed method performs well on recognizing faces wearing sunglasses, with accuracy exceeding 98%; even at a false acceptance rate (FAR) of 0.1%, recall remains above 88%, an improvement of over 22% compared with traditional methods. Building on these results, the trained FaceNet model was deployed in a system called VisioVoice, which is designed to help visually impaired people identify the person in front of them in daily life and to announce that identity through voice prompts. By integrating this study's FaceNet model, VisioVoice can provide real-time identity information when a visually impaired user encounters someone, improving their quality of life and ability to interact socially.
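The recall-at-fixed-FAR metric reported above can be computed from pair similarity scores. The sketch below picks the decision threshold from the impostor (different-identity) score distribution so that at most the target fraction of impostor pairs is accepted, then measures how many genuine pairs clear it; the exact thresholding procedure used in the thesis may differ.

```python
import numpy as np

def recall_at_far(genuine, impostor, far=0.001):
    """Recall (true accept rate) at a fixed false acceptance rate.

    genuine  -- similarity scores of same-identity pairs
    impostor -- similarity scores of different-identity pairs
    far      -- target false acceptance rate, e.g. 0.001 for 0.1%
    """
    impostor = np.sort(np.asarray(impostor))[::-1]     # descending scores
    k = max(int(np.floor(far * len(impostor))), 1)     # FAR budget in pair counts
    threshold = impostor[k - 1]                        # k-th highest impostor score
    return float(np.mean(np.asarray(genuine) > threshold))
```

Reporting recall at FAR = 0.1% (as the thesis does) is stricter than raw accuracy, since the threshold must first suppress nearly all false accepts.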

    Table of Contents
    Abstract; Extended Abstract; Acknowledgements; Table of Contents; List of Tables; List of Figures
    Chapter 1: Introduction
        1.1 Research Background and Motivation
        1.2 Research Objectives
        1.3 Research Methods
        1.4 Research Contributions
    Chapter 2: Literature Review
        2.1 Face Recognition Datasets
        2.2 FaceNet
        2.3 Other Approaches to Recognizing Partially Occluded Faces
    Chapter 3: System Design and Architecture
        3.1 Data Preprocessing
        3.2 Face Dataset Augmentation Tool
        3.3 Model Architecture
        3.4 Practical System Application
    Chapter 4: Experimental Design and Results
        4.1 Data and Experimental Environment
        4.2 Evaluation Tools
        4.3 Experimental Results and Analysis
        4.4 Discussion
    Chapter 5: Conclusions and Future Work
        5.1 Conclusions
        5.2 Future Work
    References

