Graduate student: 石哲榮 (Shih, Che-June)
Thesis title: 深度學習於深度偽造影片檢測之應用 (Application of Deep Learning for DeepFake Videos Detection)
Advisor: 王明習 (Wang, Ming-Shi)
Degree: Master
Department: College of Engineering, Department of Engineering Science
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: Chinese
Number of pages: 54
Keywords: deep learning, convolutional neural network, deepfake, deepfake video detection
  • Faces of people in a video can now easily be replaced using digital forgery techniques, turning the video into a fake. Detecting whether a video has been forged is therefore an important research problem. This study examines how to detect whether a face in a video frame has been forged. The method is as follows. First, each video frame is checked for the presence of a face. If a face is present, the system evaluates whether that face has been altered; if no face is present, forgery detection can instead be performed on the full frame. Two different face detection models are used to detect faces in the frame, and their effect on the accuracy of the system is compared. The effect of different numbers of sampled frames per second on the accuracy and processing time of the system is also considered. So that the system can still judge forgery when a frame is overexposed or underlit, two convolutional neural networks are considered: one that detects forgery on the face region only and one that performs forgery detection on the entire frame. Combining them increases the accuracy and generality of the system. With these methods, the experimental results show that the system reaches an accuracy of 92.6%.
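
    The decision flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names (`sample_indices`, `score_frame`, `classify_video`), the callables standing in for the face detector and the two convolutional neural networks, the averaging of per-frame scores, and the 0.5 threshold are all hypothetical choices that the record does not specify.

```python
from statistics import mean
from typing import Any, Callable, List, Optional, Sequence


def sample_indices(total_frames: int, video_fps: float,
                   samples_per_second: float) -> List[int]:
    """Indices of the frames kept when sampling a fixed number of frames per second."""
    if total_frames <= 0 or video_fps <= 0 or samples_per_second <= 0:
        return []
    # Never sample more densely than the video itself.
    step = max(video_fps / samples_per_second, 1.0)
    indices: List[int] = []
    position = 0.0
    while int(position) < total_frames:
        indices.append(int(position))
        position += step
    return indices


def score_frame(frame: Any,
                detect_face: Callable[[Any], Optional[Any]],
                face_model: Callable[[Any], float],
                frame_model: Callable[[Any], float]) -> float:
    """Route a frame to the face-only model or to the full-frame model."""
    face = detect_face(frame)       # hypothetical: returns a face crop or None
    if face is not None:
        return face_model(face)     # forgery score for the face region
    return frame_model(frame)       # fallback: forgery score for the whole frame


def classify_video(frames: Sequence[Any], video_fps: float,
                   samples_per_second: float,
                   detect_face: Callable[[Any], Optional[Any]],
                   face_model: Callable[[Any], float],
                   frame_model: Callable[[Any], float],
                   threshold: float = 0.5) -> bool:
    """Return True if the video is judged fake (assumed rule: mean score >= threshold)."""
    indices = sample_indices(len(frames), video_fps, samples_per_second)
    if not indices:
        raise ValueError("no frames to classify")
    scores = [score_frame(frames[i], detect_face, face_model, frame_model)
              for i in indices]
    return mean(scores) >= threshold
```

    In practice the three callables would wrap a real face detector and two trained networks. Passing them in as arguments keeps the routing logic independent of any particular model, which also makes it straightforward to swap in a second face detector or a different sampling rate when comparing accuracy and processing time.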

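    Table of Contents
    Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Thesis Organization
    Chapter 2 Review of Related Material
      2.1 Deep Learning
        2.1.1 Artificial Neural Networks
        2.1.2 Convolutional Neural Networks
      2.2 Face Recognition
        2.2.1 Dlib
      2.3 Deepfake Generation
      2.4 Literature Review
    Chapter 3 Research Method
      3.1 Overall System Architecture
      3.2 Convolutional Neural Network
      3.3 Face Detection
    Chapter 4 Experimental Results and Discussion
      4.1 Experimental Environment
      4.2 Experimental Dataset
      4.3 Experimental Procedure and Results
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
    References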

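    Full-text availability: on campus, open from 2025-09-05; off campus, not available.
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.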