
Graduate student: Shaik Reshma Parveen (潘凡雯)
Thesis title: Vulnerability assessment of captchas using a low-cost neural network (使用低成本神經網絡對驗證碼進行漏洞評估)
Advisor: Yang, Chu-Sing (楊竹星)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: English
Number of pages: 96
Keywords (English): CAPTCHA recognition, machine learning, deep neural network, non-segmentation
  • CAPTCHA is a security test used to differentiate between humans and bots, and the most widely used scheme in web services is the text-based CAPTCHA. With advances in AI, designing secure CAPTCHAs has become increasingly important for protecting networks from threats and attacks. It is therefore necessary to examine how vulnerable CAPTCHAs are and to understand which factors make them more secure, so that more robust CAPTCHAs can be developed. Recently, many machine learning techniques have achieved good results in CAPTCHA recognition. However, most attacks are constrained by the limited amount of labeled CAPTCHA data, because manually collecting and labeling samples to train a model is time-consuming and expensive. Although some researchers have cracked CAPTCHAs, they adopted dense networks that consume a lot of memory. In this research, we therefore propose a simple customized deep neural network that reduces complexity, lowers the cost of attack, and still achieves high accuracy. The model is trained on CAPTCHA images from an open-source Python library; we preprocess the images and then use the simple model for recognition. It is a non-segmentation model: the whole CAPTCHA is trained at once, without the need to segment the image into individual characters. This approach achieves high accuracy on datasets of CAPTCHAs with various lengths and categories while effectively reducing memory consumption. For one of the datasets, numerical CAPTCHAs, the recognition rate is 99.51%, 99.20%, and 98.18% for CAPTCHAs of length 4, 5, and 6, respectively. The research also analyzes the factors that influence the vulnerability of text-based CAPTCHAs, with a view to making CAPTCHAs more robust.
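    The non-segmentation idea in the abstract can be illustrated with a small label-encoding sketch. This record does not state how the thesis encodes its labels, so everything below is an assumption for illustration: the names `encode_label`, `decode_scores`, and `captcha_accuracy` are hypothetical, and the layout (one one-hot block per character position, concatenated into a single target vector) is one common way to let a network predict a whole fixed-length CAPTCHA in one pass.

```python
from typing import List, Sequence

# Assumed character set for the numerical datasets (not taken from the thesis).
DIGITS = "0123456789"


def encode_label(text: str, charset: str = DIGITS) -> List[int]:
    """Encode a whole CAPTCHA string as one multi-label target vector.

    The vector has len(text) blocks of len(charset) entries each, and
    exactly one entry per block is set to 1.
    """
    vec = [0] * (len(text) * len(charset))
    for pos, ch in enumerate(text):
        # charset.index raises ValueError for characters outside the set.
        vec[pos * len(charset) + charset.index(ch)] = 1
    return vec


def decode_scores(scores: Sequence[float], length: int,
                  charset: str = DIGITS) -> str:
    """Turn a flat score vector back into text.

    For each character position, pick the highest-scoring entry of its
    block. No image segmentation is involved at any point.
    """
    n = len(charset)
    chars = []
    for pos in range(length):
        block = scores[pos * n:(pos + 1) * n]
        best = max(range(n), key=lambda i: block[i])
        chars.append(charset[best])
    return "".join(chars)


def captcha_accuracy(predicted: Sequence[str],
                     expected: Sequence[str]) -> float:
    """Fraction of CAPTCHAs in which every character is correct."""
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)
```

    Under these assumptions, a 4-digit CAPTCHA maps to a 40-entry target, and a prediction counts as correct only if all four positions match, which is a stricter measure than per-character accuracy.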

    Abstract
    Acknowledgements
    Catalog
    Figure Catalog
    Table Catalog
    1. INTRODUCTION
    2. BACKGROUND AND RELATED WORK
        2.1 Types of captchas
            2.1.1 Text-based CAPTCHA
            2.1.2 Image-based CAPTCHA
            2.1.3 Video-based CAPTCHA
            2.1.4 Audio-based CAPTCHA
            2.1.5 Puzzle-based CAPTCHA
        2.2 Applications of CAPTCHAs
        2.3 Segmentation based model VS segmentation free model
        2.4 Convolutional Neural Network (CNN)
        2.5 Related Work
        2.6 Goal of the Project
    3. METHODOLOGY
        3.1 Proposed model
        3.2 Captcha Dataset
            3.2.1 Splitting of Dataset
        3.3 Data Preprocessing
            3.3.1 Color Space conversion
            3.3.2 Image resizing
            3.3.3 Noise reduction filtering
        3.4 Proposed Network Architecture
            3.4.1 Convolutional Layers
            3.4.2 Rectified Linear Unit (ReLU)
            3.4.3 Pooling Layer
                3.4.3.1 Max Pooling
            3.4.4 Fully Connected Layer
            3.4.5 Dropout Layer
            3.4.6 Softmax
            3.4.7 Adam Optimizer
            3.4.8 Binary Cross Entropy
        3.5 Environment Setup
    4. RESULTS AND DISCUSSION
        4.1. Numerical dataset
            4.1.1. Numerical dataset - Length 4
            4.1.2. Numerical dataset - Length 5
            4.1.3. Numerical dataset - Length 6
        4.2. Alphanumerical dataset
            4.2.1. Alphanumerical dataset - Length 4
            4.2.2. Alphanumerical dataset - Length 5
            4.2.3. Alphanumerical dataset - Length 6
        4.3. Alphabets combination dataset
            4.3.1. Alphabets combination dataset - Length 4
            4.3.2. Alphabets combination dataset - Length 5
            4.3.3. Alphabets combination dataset - Length 6
        4.4. Capital alphabets dataset
            4.4.1. Capital alphabets dataset - Length 4
            4.4.2. Capital alphabets dataset - Length 5
            4.4.3. Capital alphabets dataset - Length 6
        4.5. Small alphabets dataset
            4.5.1. Small alphabets dataset - Length 4
            4.5.2. Small alphabets dataset - Length 5
            4.5.3. Small alphabets dataset - Length 6
        4.6. Capital alphabets & numbers combination dataset
            4.6.1. Capital alphabets & numbers combination dataset - Length 4
            4.6.2. Capital alphabets & numbers combination dataset - Length 5
            4.6.3. Capital alphabets & numbers combination dataset - Length 6
        4.7. Small alphabets & numbers combination dataset
            4.7.1. Small alphabets & numbers combination dataset - Length 4
            4.7.2. Small alphabets & numbers combination dataset - Length 5
            4.7.3. Small alphabets & numbers combination dataset - Length 6
        4.8. Performance measures
        4.9. Performance comparison
            4.9.1. Comparison with different methods reference to previous results
            4.9.2. Performance comparison with same dataset
    5. CONCLUSION AND FUTURE WORK
    REFERENCES


    Full-text availability:
    On campus: open immediately
    Off campus: open immediately