Graduate student: 陳奕婷 (Chen, Yi-Ting)
Thesis title: An Inductive Classifier with Fused Spatial-Spectral Features on Deepfake Detection Challenge (應用融合空間與頻譜特徵的歸納分類器在深度偽造檢測競賽上)
Advisor: 戴顯權 (Tai, Shen-Chuan)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 62
Keywords (Chinese): 深度偽造, 深度偽造檢測, 卷積神經網路, 離散傅立葉轉換
Keywords (English): Deepfake, Deepfake detection, convolutional neural networks, discrete Fourier transform
    With the rapid development of machine learning, applications in image processing keep multiplying. Among them, "Deepfake" techniques have advanced to the point where face swaps and edits leave almost no visible trace. Free, open-source production software is readily available: FakeApp [1], for example, is a Deepfake video production program built on the TensorFlow [2] open-source software library, and it lets people with no relevant background or programming skills make face-swapped videos on their own. As a result, large numbers of Deepfake videos made for many different purposes circulate on the Internet and often draw widespread attention and discussion on social platforms. Some of these videos contain pornography, rumors, hoaxes, or political interference, which can cause social problems and damage the reputation of others. Detecting whether a video is a maliciously produced Deepfake has therefore become an important task.
    This thesis proposes a dual-domain inductive classifier that combines the spatial domain and the frequency domain of the image. In the spatial domain, it uses a Gradually Down Scaled Convolutional Neural Network. In the frequency domain, it applies the discrete Fourier transform to capture spectral features, which helps analyze the differences between the spectra of real and fake faces. Compared with other Deepfake detection methods, the proposed method needs only a small amount of training data. The experimental results show that, among the compared methods, the proposed method achieves the highest accuracy and the lowest loss.
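    The idea of pairing spatial features with DFT-based spectral features can be illustrated with a minimal sketch. This record does not give the thesis's actual network or fusion scheme, so everything below is an assumption made for illustration: the function names `log_magnitude_spectrum` and `fuse_features`, the 64x64 grayscale face crop, the 128-dimensional stand-in for CNN features, and fusion by simple concatenation.

```python
import numpy as np


def log_magnitude_spectrum(gray_face: np.ndarray) -> np.ndarray:
    """Centered log-magnitude spectrum of a 2-D grayscale image (hypothetical helper)."""
    # 2-D discrete Fourier transform, with the zero-frequency term moved to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(gray_face))
    # log1p compresses the large dynamic range of the magnitudes.
    return np.log1p(np.abs(spectrum))


def fuse_features(spatial_features: np.ndarray, gray_face: np.ndarray) -> np.ndarray:
    """Concatenate spatial features with flattened spectral features.

    Concatenation is an assumed fusion strategy, not necessarily the thesis's.
    """
    spectral_features = log_magnitude_spectrum(gray_face).ravel()
    return np.concatenate([spatial_features.ravel(), spectral_features])


# Synthetic stand-ins: a random "face crop" and random "CNN features".
rng = np.random.default_rng(0)
face = rng.random((64, 64))
cnn_features = rng.random(128)

spec = log_magnitude_spectrum(face)
fused = fuse_features(cnn_features, face)
```

    In such a setup, the fused vector would be passed to a binary real/fake classifier; real inputs would be detected face crops rather than random arrays.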

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Background and Related Works
      2.1 Deepfake
      2.2 Multi-task Cascaded Convolutional Networks
      2.3 MesoNet
    Chapter 3 The Proposed Algorithm
      3.1 Data Preparation
      3.2 Proposed Network Architecture
        3.2.1 Complete network architecture
        3.2.2 Spectral domain feature extraction
        3.2.3 Gradually Down Scaled CNN
        3.2.4 Classifier
      3.3 Loss Function
        3.3.1 Shannon Entropy
        3.3.2 Binary Cross-Entropy
    Chapter 4 Experimental Results
      4.1 Experimental Dataset
      4.2 Parameter and Experimental Setting
      4.3 Experimental Results of Spectral Domain
      4.4 Experimental Results
    Chapter 5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    Appendix
      Convolutional Layer
      Stride
      Max Pooling Layer
      Fully Connected Layer
      Softmax
      Batch Normalization
      Activation Functions
      Complete Pseudo Code
    References

    [1] Fakeapp. https://www.fakeapp.org/.
    [2] M. Abadi et al. Tensorflow: A system for large-scale machine learning. Proceedings of the USENIX Conference on Operating Systems Design and Implementation, 16:265–283, Nov. 2016. Savannah, GA
    [3] https://www.youtube.com/watch?v=iHv6Q9ychnA
    [4] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. NIPS Autodiff workshop, 2017.
    [5] F. Chollet et al. Keras. https://keras.io, 2015.
    [6] E. Jones, T. Oliphant, P. Peterson, et al. SciPy: Open source scientific tools for Python, 2001.
    [7] H. T. Sencar and N. Memon, editors. Digital Image Forensics. Springer New York, 2013.
    [8] C. Peacock, A. Goode, A. Brett. Automatic forensic face recognition from digital images. Sci. Justice, 44 (1) (2004), pp. 29-34
    [9] J. A. Redi, W. Taktak, and J.-L. Dugelay, “Digital image forensics: A booklet for beginners,” Multimedia Tools Applicat., vol. 51, no. 1, pp. 133–162, 2011.
    [10] T. Julliand, V. Nozick, and H. Talbot. Image noise and digital image forensics. In Y.-Q. Shi, J. H. Kim, F. Pérez-González, and I. Echizen, editors, Digital-Forensics and Watermarking: 14th International Workshop (IWDW 2015), volume 9569, pages 3–17, Tokyo, Japan, October 2015.
    [11] M. Barni, L. Bondi, N. Bonettini, P. Bestagini, A. Costanzo, M. Maggini, B. Tondi, and S. Tubaro. Aligned and nonaligned double jpeg detection using convolutional neural networks. Journal of Visual Communication and Image Representation, 49:153–163, 2017.
    [12] Xin Yang, Yuezun Li, and Siwei Lyu. Exposing deep fakes using inconsistent head poses. In IEEE International Conference on Acoustics, Speech, and Signal Processing, Brighton, United Kingdom, 2019.
    [13] Yuezun Li and Siwei Lyu. Exposing deepfake videos by detecting face warping artifacts. arXiv preprint arXiv:1811.00656, 2018.
    [14] Falko Matern, Christian Riess, and Marc Stamminger. Exploiting visual artifacts to expose deepfakes and face manipulations. In IEEE Winter Applications of Computer Vision Workshops, pages 83–92. IEEE, 2019.
    [15] Shruti Agarwal, Hany Farid, Yuming Gu, Mingming He, Koki Nagano, and Hao Li. Protecting world leaders against deep fakes. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019.
    [16] A. Tewari et al. Mofa: Model-based deep convolutional face autoencoder for unsupervised monocular reconstruction. Proceedings of the IEEE International Conference on Computer Vision Workshops, pages 1274–1283, Oct. 2017. Venice, Italy.
    [17] D. P. Kingma and M. Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
    [18] Ali Razavi, Aaron van den Oord, and Oriol Vinyals. Generating diverse high-fidelity images with VQ-VAE-2. In Advances in Neural Information Processing Systems, pages 14837–14847, 2019.
    [19] I. Goodfellow et al. Generative adversarial nets. Advances in Neural Information Processing Systems, pages 2672–2680, Dec. 2014. Montreal, Canada.
    [20] G. Antipov, M. Baccouche, and J.-L. Dugelay. Face aging with conditional generative adversarial networks. arXiv:1702.01983, Feb. 2017.
    [21] T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of gans for improved quality, stability, and variation. arXiv:1710.10196, Oct. 2017.
    [22] David Güera and Edward J. Delp. Deepfake video detection using recurrent neural networks. In AVSS, 2018.
    [23] E. Sabir, J. Cheng, A. Jaiswal, W. AbdAlmageed, I. Masi, and P. Natarajan, “Recurrent Convolutional Strategies for Face Manipulation Detection in Videos,” in Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.
    [24] Yuezun Li, Ming-Ching Chang, Siwei Lyu et al. In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking. Proceedings of IEEE International Workshop on Information Forensics and Security (WIFS), 2018.
    [25] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao. Joint face detection and alignment using multi-task cascaded convolutional networks. arXiv preprint arXiv:1604.02878, 2016.
    [26] D. Afchar, V. Nozick, J. Yamagishi, and I. Echizen. Mesonet: a compact facial video forgery detection network. In 2018 IEEE International Workshop on Information Forensics and Security (WIFS), pages 1–7. IEEE, 2018.
    [27] Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., Ortega-Garcia, J.: Deepfakes and beyond: A survey of face manipulation and fake detection. arXiv preprint arXiv:2001.00179 (2020)
    [28] J. Thies et al. Face2Face: Real-time face capture and reenactment of RGB videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2387–2395, June 2016. Las Vegas, NV.
    [29] https://www.youtube.com/watch?v=cQ54GDm1eL0&t=40s
    [30] https://www.youtube.com/watch?v=dkoi7sZvWiU
    [31] Faceapp. https://www.faceapp.com/.
    [32] https://dailyview.tw/Popular/Detail/8664
    [33] https://www.youtube.com/watch?v=iLoq02XE1Jo
    [34] T. Karras, T. Aila, S. Laine, and J. Lehtinen, “Progressive Growing of GANs for Improved Quality, Stability, and Variation,” in Proc. International Conference on Learning Representations, 2018.
    [35] A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. ICML, 2013, vol. 30, no. 1.
    [36] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, et al. Going deeper with convolutions. CVPR, 2015.
    [37] F. Yu and V. Koltun. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122, 2015.
    [38] W. Shi, F. Jiang, and D. Zhao. Single image superresolution with dilated convolution based multi-scale information learning inception module. arXiv preprint arXiv:1707.07128, 2017.
    [39] Brian Dolhansky, Russ Howes, Ben Pflaum, Nicole Baram, and Cristian Canton Ferrer. The deepfake detection challenge (DFDC) preview dataset. arXiv:1910.08854, 2019.
    [40] Yuezun Li, Xin Yang, Pu Sun, Honggang Qi, and Siwei Lyu. Celeb-DF: A Large-scale Challenging Dataset for DeepFake Forensics. arXiv preprint arXiv:1909.12962v4, 2020.
    [41] Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to Detect Manipulated Facial Images. arXiv preprint arXiv:1901.08971v3, 2019.
    [42] Shohel Rana and Andrew H. Sung. DeepfakeStack: A Deep Ensemble-based Learning Technique for Deepfake Detection. In 7th IEEE International Conference on Cyber Security and Cloud Computing (CSCloud), 2020. And in 6th IEEE International Conference on Edge Computing and Scalable Cloud (EdgeCom), 2020.
    [43] Ehsan Nowroozi, Ali Dehghantanha, Reza M. Parizi, and Kim-Kwang Raymond Choo. A survey of machine learning techniques in adversarial image forensics. Computers & Security, 102092, January 2021.
    [44] Akash Kumar and Arnav Bhavsar. Detecting Deepfakes with Metric Learning. arXiv preprint arXiv:2003.08645, 2020.

    Full text availability: on campus, open from 2024-04-21
    Off campus, open from 2024-04-21