Graduate student: | 林文彬 Lin, Wen-Pin |
---|---|
Thesis title: | Implementation of Network Early Warning Mechanism in Machine Learning and Deep Learning for the Construction of an Intrusion Detection System |
Advisor: | 陳牧言 Chen, Mu-Yen |
Degree: | Master |
Department: | College of Engineering - Department of Engineering Science (on the job class) |
Year of publication: | 2023 |
Graduation academic year: | 111 |
Language: | Chinese |
Number of pages: | 66 |
Keywords: | Deep Learning, Machine Learning, Intrusion Detection System, 1D Convolutional Neural Network, Information Gain |
As information technology and networks proliferate, cybersecurity problems have become increasingly serious, and intrusion detection systems (IDS) have become an essential tool for protecting network security. This thesis proposes an IDS architecture based on machine learning and deep learning that helps an existing application-layer firewall detect malicious activity on the network.
The thesis applies mature machine learning and deep learning techniques to a company's existing resources and implements an intrusion detection system that helps IT staff confirm malicious activity on the firewall in real time. In the first phase, the dataset is built from abnormal and normal traffic logs collected from the Palo Alto application-layer firewall in the company's production environment. Information gain is used to reduce the feature dimensionality, and four algorithms, Support Vector Machine (SVM), Random Forest (RF), ExtraTreesClassifier (ETC), and a 1D Convolutional Neural Network (1D CNN), are trained and evaluated; the trained models are saved in advance. In the second phase, a real-time recognition program uses the API service provided by Palo Alto to read firewall logs as they arrive, loads the trained models, and identifies malicious activity on the network. The push-notification function of an established instant messaging application then delivers early warnings so that the IT department can respond promptly. Experiments show that the implemented method detects threat-category logs on the firewall and sends early-warning messages to Line Notify in real time. They also show that information gain-based feature selection chooses effectively among a large number of uncertain features, reducing the feature dimensionality and allowing the models to generalize better.
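The information gain step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: it assumes discrete (categorical) log fields, and the field names, records, and the zero-gain cutoff below are hypothetical, made up for illustration. Information gain is computed here as the label entropy minus the label entropy conditioned on the feature, IG(Y; X) = H(Y) - H(Y | X).

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - sum_v P(X = v) * H(Y | X = v) for a discrete feature X."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature name, information gain) pairs, highest gain first."""
    names = rows[0].keys()
    gains = {name: information_gain([r[name] for r in rows], labels) for name in names}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log records; these field names and values are illustrative only
# and are not taken from the thesis dataset.
rows = [
    {"app": "ssl", "action": "allow", "zone": "trust"},
    {"app": "ssl", "action": "allow", "zone": "untrust"},
    {"app": "unknown-tcp", "action": "deny", "zone": "trust"},
    {"app": "unknown-tcp", "action": "deny", "zone": "untrust"},
]
labels = ["normal", "normal", "threat", "threat"]

ranking = rank_features(rows, labels)
# Assumed cutoff: keep only features that carry some information about the label.
selected = [name for name, gain in ranking if gain > 0.0]
```

In this toy data the `zone` field is split evenly across both classes, so it carries no information about the label and is dropped, while `app` and `action` are kept. A real pipeline would likely pick a cutoff or a top-k count empirically, and would need to discretize continuous fields first.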