
Graduate student: Chang, Shun-Chieh (張舜傑)
Thesis title: A Deepfake Detection Algorithm With Improved Generalization Ability (一個改善泛化能力的深度偽造檢測演算法)
Advisor: Tai, Shen-Chuan (戴顯權)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 47
Keywords: deepfake, deepfake detection, deep learning, attention mechanism, data augmentation
  • With the rapid development of deep learning, deepfake technology has advanced as well, and large numbers of deepfake videos now circulate on the Internet and attract heated discussion. Some of these videos are put to malicious use, such as pornography, political disinformation, and hoaxes, which causes social problems and damages the reputations of the people depicted. Detecting deepfake videos has therefore become an important task.
    Many current deepfake detection models suffer from overfitting. This thesis therefore proposes a deepfake detection algorithm that improves the model's generalization ability. First, EfficientNet is used as the backbone of the model; an attention mechanism is then added so that the model focuses on important information. Finally, masked images are designed as a data augmentation and added to training. The proposed method is trained on FaceForensics++ and tested on the FaceForensics++, Celeb-DF, and DFDC test sets. The experimental results show that the proposed method generalizes better than the compared methods.
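The abstract names masked images as a data augmentation but this record does not say how the masks are constructed. As a rough illustration of the general idea only, the sketch below blanks out one random rectangle of an image so that a classifier cannot rely on any single region. The function name `mask_random_patch`, its parameters, and the choice of a single rectangular mask are all assumptions made for this example, not the procedure used in the thesis.

```python
import random


def mask_random_patch(image, patch_h, patch_w, rng, fill=0):
    """Return a copy of `image` with one random rectangle set to `fill`.

    `image` is a list of rows (a stand-in for a cropped face image).
    Hypothetical helper: a generic occlusion-style augmentation,
    not the masking scheme described in the thesis.
    """
    height, width = len(image), len(image[0])
    if not (0 < patch_h <= height and 0 < patch_w <= width):
        raise ValueError("patch must fit inside the image")

    # Pick the top-left corner so the whole patch stays inside the image.
    top = rng.randrange(height - patch_h + 1)
    left = rng.randrange(width - patch_w + 1)

    # Copy each row so the input image is left untouched.
    masked = [row[:] for row in image]
    for r in range(top, top + patch_h):
        for c in range(left, left + patch_w):
            masked[r][c] = fill
    return masked


if __name__ == "__main__":
    rng = random.Random(0)
    face = [[1] * 8 for _ in range(8)]  # toy 8x8 "image" of ones
    augmented = mask_random_patch(face, 3, 4, rng)
    for row in augmented:
        print("".join(str(v) for v in row))
```

In a training pipeline, a transform of this kind would typically be applied to a random subset of the training images only, with the test images left unmasked.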

    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Overview
    Chapter 2 Background and Related Works
      2.1 Deepfake Generation
        2.1.1 Autoencoder
        2.1.2 Generative Adversarial Network (GAN)
      2.2 Deepfake Detection
      2.3 Transfer Learning
      2.4 Data Augmentation
      2.5 EfficientNet
      2.6 Multi-task Cascaded Convolutional Networks
      2.7 Squeeze-and-Excitation Block
      2.8 Coordinate Attention Block
    Chapter 3 The Proposed Algorithm
      3.1 Data Preparation
        3.1.1 Crop Face
        3.1.2 Mask Image
      3.2 Proposed Network Architecture
        3.2.1 EfficientNet-B5
        3.2.2 Coordinate Attention Block
      3.3 Loss Function
        3.3.1 Cross Entropy
    Chapter 4 Experimental Results
      4.1 Experimental Dataset
      4.2 Parameter and Experimental Setting
      4.3 Experimental Results
      4.4 Ablation Experimental Results
        4.4.1 Generalization Ability Experimental Results
        4.4.2 Attention Mechanism Experimental Results
    Chapter 5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    References


    Full-text access: on campus, open immediately; off campus, open immediately