Graduate Student: Yang, Chia-Hsien (楊佳憲)
Thesis Title: Transferability of adversarial examples improved with Dropout and Sign Modification
Advisor: Chen, Yean-Ru (陳盈如)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2021
Academic Year of Graduation: 109 (2020-2021)
Language: English
Number of Pages: 74
Keywords: Deep learning, Transferability, Adversarial Attack, Black-box attack
    Before deep learning, image recognition had long failed to reach human-level accuracy. The core difficulty is that no complete set of rules can be written down that applies to every photograph: the same object varies in lighting and color depending on when it was photographed, so color alone cannot serve as a recognition feature, and other candidate features such as shape or contour are equally unreliable as fixed criteria (an animal's body, for instance, takes countless forms that we cannot enumerate). Since humans cannot spell out complete recognition logic, the idea of machine learning gained attention: let the "machine" itself "learn" how to accomplish a given task. Deep learning, a branch of machine learning, has drawn particular attention in recent years as its theory has matured and its applications have proven effective in many fields.

    The idea behind machine learning is to build a "model" and prepare "training data" from which the model learns the given task. In deep learning, the model is an artificial neural network composed of layers of artificial neurons; different tasks call for different numbers of layers and different numbers of neurons per layer.
    The "deep" in deep learning refers to building a neural network with very many layers, on the order of hundreds or thousands. A deeper network can carry out more complex tasks, but it also becomes harder to train, demanding enormous computation and often failing to train successfully.
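
    To make the layered structure concrete, here is a minimal PyTorch sketch that stacks fully connected layers of artificial neurons into a small classifier; the input size, layer count, and widths are arbitrary illustrative choices, not an architecture used in this thesis.

    import torch.nn as nn

    # A small feed-forward classifier: each nn.Linear is one layer of
    # artificial neurons, and stacking more of them makes the network "deeper".
    model = nn.Sequential(
        nn.Flatten(),          # e.g., a 1x28x28 image becomes a 784-vector
        nn.Linear(784, 256),   # first hidden layer: 256 neurons
        nn.ReLU(),
        nn.Linear(256, 128),   # second hidden layer: 128 neurons
        nn.ReLU(),
        nn.Linear(128, 10),    # output layer: one score per class
    )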

    In image recognition, deep learning has already surpassed well-trained humans. Image recognition is in essence a data classification problem, which is why deep learning has been widely adopted in other fields that involve classifying data. However, the criteria a deep network uses to classify its input ultimately differ from a human's, and in certain special cases this calls the reliability and safety of deep learning into question.
    In deep learning research these special cases are called "adversarial examples". Taking photographs as the data type: given a trained neural network and a photograph it classifies correctly, an adversarial example is a new photograph produced by slightly perturbing the pixel values of the original, which the network then misclassifies completely. The point of stressing "slight perturbation" is that, to human perception, the two photographs should receive the same label, yet the perturbation drastically alters the network's perception and leads to a wrong classification. The perturbation is constructed in a specific way; in general, a trained network does not misclassify merely because its input is slightly perturbed, and the actual construction is introduced in the main text. Since neural networks are vulnerable to adversarial examples, real deployments will always face situations that are hard to anticipate in advance; applying deep learning where human safety is at stake, such as in autonomous driving, could therefore cause harm.
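
    As a hedged illustration of how such a perturbation can be constructed (the thesis's own construction follows in the main text), the classic Fast Gradient Sign Method (FGSM), which the I-FGSM discussed later applies iteratively, moves every pixel by at most ϵ in the direction that increases the classification loss:

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, label, eps):
        # One-step FGSM: shift each pixel by eps along the sign of the loss
        # gradient; the change is imperceptibly small yet highly misleading.
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), label)
        loss.backward()
        x_adv = x + eps * x.grad.sign()    # stays inside the L-inf ball of radius eps
        return x_adv.clamp(0, 1).detach()  # keep pixel values in the valid range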

    No foolproof way yet exists to make neural networks resist adversarial examples, so research on them proceeds from two angles: attack and defense. Studying attacks deepens our understanding of the weaknesses of neural networks, and the attacks so designed then drive the development of defenses. This is not a one-shot process but a cycle: each round produces a stronger attack together with a countermeasure against it, and repeating the cycle works toward the ultimate goal of neutralizing adversarial examples. This attack-and-defense loop is itself an adversarial process, which is exactly why adversarial examples bear the name "adversarial".

    Attack scenarios for adversarial examples divide into white-box and black-box. In the white-box setting the target neural network is known, so attacks succeed easily, but the setting is unrealistic: most commercial services built on deep learning do not disclose their technical details, and it is hard to predict what data a deployed network will receive, so considering only the white-box setting cannot guarantee that a network is truly free of vulnerabilities. In the black-box setting the target network is unknown, including its training dataset. To generate an adversarial example, the attacker chooses a neural network similar to the target as a surrogate model and produces an example that the surrogate misclassifies. The property that an adversarial example generated on a surrogate model can also fool the real target is called "transferability". There are two ways to raise the black-box attack success rate: find a surrogate model very similar to the target, which is rarely feasible, or analyze the weaknesses of neural networks and design a dedicated "black-box attack".
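
    A black-box transfer attack can thus be sketched as follows; surrogate, target, x, and label are hypothetical stand-ins, and fgsm is the sketch above (in practice a stronger generator such as I-FGSM would be used).

    # Craft the adversarial example on the known surrogate model (a white-box
    # step), then feed it to the unseen target; the attack "transfers" if
    # the target misclassifies it.
    x_adv = fgsm(surrogate, x, label, eps=8 / 255)
    with torch.no_grad():
        pred = target(x_adv).argmax(dim=1)   # single query to the black-box target
    transferred = (pred != label)            # True where the attack carried over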

    In this work we design two black-box attacks. The first method inserts extra dropout layers into the chosen surrogate model. Dropout was originally devised to improve neural network training; our analysis reveals that black-box attacks and neural network training are in fact dual problems, from which we infer that techniques originally meant to improve training can also improve the transferability of adversarial examples. The second method uses the sign function to modify the size of the perturbation produced by the existing method I-FGSM. An adversarial example is obtained by perturbing an original photograph, and which pixels are perturbed, and by how much, strongly affects the attack success rate. I-FGSM can approach 100% success in white-box attacks, but it is far less effective in the black-box setting, reaching only 30% to 50% in our experiments. Our further analysis shows that this is because I-FGSM perturbs the original photograph by less than the allowed budget, even though the direction of its perturbation is already effective for black-box attacks. We therefore enlarge I-FGSM's perturbation of the original photograph without violating the given perturbation budget.
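
    A minimal sketch of the two methods combined, assuming the surrogate already contains the extra dropout layers of Method 1; the exact DSM algorithm is given in Chapter 3, so this reconstruction is illustrative rather than the thesis code.

    import torch
    import torch.nn.functional as F

    def dsm_attack(surrogate, x, label, eps, steps=10):
        # Method 1: run the surrogate in training mode so the inserted
        # dropout layers stay active while the perturbation is generated.
        surrogate.train()
        alpha = eps / steps
        x_adv = x.clone().detach()
        for _ in range(steps):                         # plain I-FGSM inner loop
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(surrogate(x_adv), label)
            loss.backward()
            x_adv = (x_adv + alpha * x_adv.grad.sign()).detach()
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # respect the budget
        # Method 2: sign modification - expand every perturbed pixel to the
        # full +/- eps budget without violating the L-inf constraint.
        x_adv = x + eps * (x_adv - x).sign()
        return x_adv.clamp(0, 1)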

    In addition, our two methods can be combined into an even stronger attack: in our experiments, the combination improves the attack success rate over existing methods by 5% to 30% on average. Finally, our methods can be further integrated with existing methods, yielding success rates higher than either achieves alone.

    Deep neural networks can be misled by adversarial examples, which are generated by adding human-imperceptible perturbations to clean examples. Note that adversarial examples can be used in attacks in which the target is known to the attacker (so-called white-box attacks) and can be transferred to attack targets that have never been seen (so-called black-box attacks). In this work, we aim to improve black-box attack success rates, and we achieve this goal with two methods of generating adversarial examples with high transferability. The first method applies the dropout technique, based on the observation that mitigating overfitting in adversarial example generation benefits transferability. The second method expands the perturbation generated by the Iterative Fast Gradient Sign Method (I-FGSM) to its maximum valid size by sign function modification, since we discover that these perturbations are very small, which causes poor transferability. Based on the two methods, we propose an algorithm named Dropout and Sign Modification (DSM). The attack success rates of standalone DSM outperform the existing black-box attacks by 5% to 30% on average.
    Moreover, we further integrate our methods with previous works, such as the Translation-Invariant method (TI) and the Scale-Invariant Nesterov Iterative Fast Gradient Sign Method (SI-NI), and the best case achieves a 10% improvement in the attack success rate.

    Table of Contents:
    摘要 (Chinese Abstract)
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Nomenclature
    Chapter 1. Introduction
      1.1 Threat model
      1.2 Deep learning
        1.2.1 Training neural networks
        1.2.2 Convolutional neural network (CNN)
      1.3 Motivation
      1.4 Contribution
      1.5 Thesis Structure
    Chapter 2. Related Work
      2.1 Adversarial example
      2.2 White-Box Attacks
        2.2.1 Existing methods
        2.2.2 Attack success rates of existing methods
      2.3 Black-Box Attacks
        2.3.1 Transferability
        2.3.2 Existing methods
    Chapter 3. Methodology
      3.1 Improving Transferability by Dropout (Method 1)
        3.1.1 Motivation
        3.1.2 Insert Dropout Layers
      3.2 Expanding the Adversarial Perturbation (Method 2)
        3.2.1 Motivation
        3.2.2 Sign Modification Method
      3.3 Proposed Algorithm of Dropout and Sign Modification (DSM)
    Chapter 4. Experimental Results
      4.1 Experimental Settings
        4.1.1 Neural network model
        4.1.2 Dataset
        4.1.3 Baselines
        4.1.4 Hyper-parameters
      4.2 Dropout Probability p
      4.3 L∞ norm bound ϵ
      4.4 Attacking Models
      4.5 Attacking by Ensemble Method
      4.6 Images of adversarial examples
    Chapter 5. Conclusion
    References


    Availability: Full text embargoed until 2026-07-29 (both on and off campus). The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.