
Graduate student: 高熙翔
Kao, Hsi-Hsiang
Thesis title: 應用簡易貝氏分類與基因演算法之封裝法於網路入侵偵測之研究
An NB and GA based wrapper approach for feature selection in Intrusion Detection problems
Advisor: 吳植森
Degree: Master
Department: College of Management - Institute of Information Management
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 63
Keywords: Artificial neural network, Feature selection, Support vector machine, Intrusion detection system, Genetic algorithm
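  • Wherever there are people, there are management problems. Likewise, wherever computer networks exist, there are computer security concerns. With the rapid growth of the Internet, incidents of hackers breaking into computers and of virus attacks keep multiplying. Traditional defenses, such as firewalls and anti-virus software, can no longer effectively stop new types of attacks or trigger immediate emergency countermeasures.
    Many past studies on detecting network attacks have shown that most intrusion detection systems can accurately detect clearly defined attacks. Unfortunately, defending against new attack methods remains an unsolved problem. On this premise, this study aims to shorten the detection time for known attacks and to improve detection accuracy.
    This study uses the network traffic dataset provided by MIT Lincoln Laboratory as the experimental sample and divides the research process into two stages. In the first stage, a wrapper method based on a Naive Bayes classifier combined with a genetic algorithm performs feature selection. In the second stage, two classification techniques, artificial neural networks and support vector machines, are used to build intrusion detection models. The experimental results show that the method used in this study can effectively shorten the time required for intrusion detection.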

    With the rapid growth of the Internet, attacks by malicious hackers and viruses have become increasingly common. Traditional defenses, such as firewalls and anti-virus software, can no longer effectively prevent new attacks or respond to them immediately.
    Many studies on attack detection have shown that most intrusion detection systems can accurately detect explicit, well-defined attacks. However, guarding against novel intrusions remains an unresolved computer security issue. On this premise, this study aims to shorten the detection time and improve the detection accuracy for known attacks.
    The data used in this study are BSM audit data from the DARPA 1998 Intrusion Detection Evaluation Program at MIT Lincoln Laboratory. We use a wrapper approach that combines Naive Bayes with a genetic algorithm for feature selection to identify the critical attributes, and then construct two classification models, one with an ANN and one with an SVM. The experiments show a significant improvement in detection time but not in detection accuracy. Future work will focus on a more complete solution.
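
    The wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the Gaussian form of Naive Bayes, the synthetic toy data (two informative features plus four noise features, standing in for the BSM audit data), the per-feature penalty in the fitness function, and all GA settings (population size, generations, mutation rate, tournament size) are assumptions chosen only for illustration. Each chromosome is a bit mask over features, and its fitness is the validation accuracy of a Naive Bayes classifier trained on the masked columns.

```python
import math
import random

def fit_gaussian_nb(X, y):
    """Estimate per-class log-prior, feature means, and variances."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(zip(*rows), means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model

def predict_nb(model, x):
    """Return the class with the highest log-posterior for one sample."""
    def log_post(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=lambda label: log_post(model[label]))

def project(X, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]

def fitness(mask, train, valid, penalty=0.01):
    """Wrapper fitness: NB validation accuracy minus an assumed per-feature cost."""
    if not any(mask):
        return 0.0
    (Xtr, ytr), (Xva, yva) = train, valid
    model = fit_gaussian_nb(project(Xtr, mask), ytr)
    hits = sum(predict_nb(model, x) == t
               for x, t in zip(project(Xva, mask), yva))
    return hits / len(yva) - penalty * sum(mask)

def ga_select(train, valid, n_features, rng,
              pop_size=20, generations=15, p_mut=0.1):
    """Evolve feature masks with tournament selection, one-point crossover,
    bit-flip mutation, and elitism; return the best mask found."""
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:  # each distinct mask is evaluated only once
            cache[key] = fitness(mask, train, valid)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the current best forward unchanged
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return best

def make_toy_data(rng, n=200, n_noise=4):
    """Synthetic data: features 0 and 1 depend on the label, the rest are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [rng.gauss(3.0 * label, 1.0), rng.gauss(-3.0 * label, 1.0)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y

rng = random.Random(0)
X, y = make_toy_data(rng)
train, valid = (X[:140], y[:140]), (X[140:], y[140:])
best_mask = ga_select(train, valid, n_features=6, rng=rng)
print("selected feature mask:", best_mask)
```

    In a two-stage setup like the one the abstract describes, the selected mask would then be applied to the data before training the second-stage classifiers, so that they work with fewer input attributes.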

    ABSTRACT (CHINESE)
    ABSTRACT
    ACKNOWLEDGEMENTS
    LIST OF TABLES
    LIST OF FIGURES
    CHAPTER 1 INTRODUCTION
      1.1 Background and motivation
      1.2 Research objective
      1.3 Research process
      1.4 Scope and limitations
    CHAPTER 2 LITERATURE REVIEW
      2.1 Intrusion detection system (IDS)
        2.1.1 Feature analysis methods of intrusion detection systems
        2.1.2 Categories of intrusion detection method
      2.2 Intrusion detection process
        2.2.1 Data collection
        2.2.2 Data analysis
      2.3 Feature selection
        2.3.1 Definition of feature selection
        2.3.2 Feature selection procedures
        2.3.3 Categories of feature selection
      2.4 Genetic algorithm
      2.5 Artificial neural network (ANN)
        2.5.1 Back-propagation network algorithm
        2.5.2 Error evaluation in back-propagation network
      2.6 Support vector machine
      2.6 Summary
    CHAPTER 3 METHODOLOGY
      3.1 Research framework
      3.2 Select critical features
      3.3 Classification model construction
        3.3.1 Model with artificial neural network
        3.3.2 Model with support vector machine
    CHAPTER 4 EXPERIMENTS
      4.1 Data source and analytical tools
        4.1.1 Data source
        4.1.2 Dataset description
        4.1.3 Software
      4.2 Experimentation
        4.2.1 Feature selection with classifiers
        4.2.2 Naive Bayes validation
      4.3 Model validation
        4.3.1 Validation with neural network
        4.3.2 Validation with support vector machine
        4.3.3 Experimental results
        4.3.4 Synthetic comparison
        4.3.5 Effect on sample size and accuracy rate
    CHAPTER 5 CONCLUSION AND EXTENSIONS
      5.1 Research conclusions
      5.2 Future extensions
    REFERENCES
    APPENDIX A


    Full text available on campus: 2011-06-21
    Full text available off campus: 2011-06-21