| Graduate student: | 周暉堡 Chou, Hui-pao | 
|---|---|
| Thesis title: | 運用分群技術在識別新型態的網路異常入侵偵測 A clustering-based method for detecting network intrusions with new types | 
| Advisor: | 翁慈宗 Wong, Tzu-tsung | 
| Degree: | Master | 
| Department: | College of Management - Institute of Information Management | 
| Year of publication: | 2007 | 
| Academic year of graduation: | 95 | 
| Language: | Chinese | 
| Pages: | 74 | 
| Keywords (Chinese): | 異常值偵測 (outlier detection), 網路入侵偵測 (network intrusion detection), 分群 (clustering) | 
| Keywords (English): | outlier detection, intrusion detections, clustering | 
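With the rapid spread of the Internet, computers and networks have become closely tied to our daily lives, and network security has therefore drawn growing attention. The types of network intrusion, however, change constantly, so responding to new types of intrusion is an issue that deserves attention. Traditionally, intrusions are detected with classification techniques such as decision trees, Bayesian classifiers, and SVMs. These methods learn from data on past intrusion attacks so that they can correctly identify the intrusion classes that have already occurred; for new types of intrusion, no past data are available for learning, so general classification techniques cannot solve this kind of problem. This study first applies clustering to separate out anomalous data. Because clustering is unsupervised learning, it uses the characteristics of the data to place records with similar properties in the same group without knowing their class labels, so when data from a new type of attack appear, they are uncovered because their properties differ. The empirical results show that the proposed method identifies both new and known types of anomalous data quite well, but its false alarm rate is somewhat high.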
With the rapid spread of the Internet, computers and networks have become closely tied to daily life, so network security has drawn increasing attention. However, the types of network intrusions change constantly, and detecting new types of intrusions is an important issue. Traditionally, intrusions are detected by classification methods such as decision trees, Bayesian classifiers, and SVMs. These methods are trained on network data to identify intrusions that have occurred before, so they cannot detect intrusions that never appeared in the training data. This study proposes a clustering-based method that first distinguishes intrusion data from normal data. Clustering is unsupervised and groups data with similar characteristics into the same cluster. A new type of intrusion generally has markedly different data characteristics, so it can be detected when it cannot be assigned to any known cluster. According to our experimental results, the proposed method identifies new types of intrusions significantly better than CBUID (Jiang et al., 2006), but its false alarm rate is slightly higher than that of CBUID.
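The idea that a record is flagged when it cannot be assigned to any known cluster can be illustrated with a small sketch. This is not the algorithm developed in the thesis: it assumes plain k-means as the clustering step, Euclidean distance, and a hypothetical `slack` factor applied to each cluster's radius as the assignment threshold. The two-dimensional points are made-up toy data, not network records.

```python
import math


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    dists = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=dists.__getitem__)
    return idx, dists[idx]


def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Seed the next centroid with the point farthest from the current ones.
        centroids.append(max(points, key=lambda p: nearest(p, centroids)[1]))
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)[0]].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def cluster_radii(points, centroids):
    """Largest member-to-centroid distance for each cluster."""
    radii = [0.0] * len(centroids)
    for p in points:
        idx, d = nearest(p, centroids)
        radii[idx] = max(radii[idx], d)
    return radii


def is_new_type(point, centroids, radii, slack=1.5):
    """Flag a point that lies outside the (assumed) reach of its nearest cluster."""
    idx, d = nearest(point, centroids)
    return d > slack * radii[idx]


# Toy data: two groups of previously seen records (hypothetical values).
known = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.3)]
centroids = kmeans(known, k=2)
radii = cluster_radii(known, centroids)

print(is_new_type((0.1, 0.2), centroids, radii))  # False: fits a known cluster
print(is_new_type((2.5, 9.0), centroids, radii))  # True: far from every cluster
```

A smaller `slack` flags more records as new, which would catch more unseen attacks at the cost of more false alarms; this mirrors the trade-off reported in the abstract, although the thesis's actual threshold rule may differ.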
Aggarwal, C. and Yu, P. (2001). Outlier detection for high dimensional data, Proceedings of the ACM SIGMOD International Conference on Management of Data, 30(2), 37-46, Santa Barbara, California, USA.
Agrawal, R., Gehrke, J., Gunopulos, D., and Raghavan, P. (1998). Automatic subspace clustering of high dimensional data for data mining applications, Proceedings of the ACM SIGMOD International Conference on Management of Data, 94-105, Seattle, Washington, USA.
Barnett, V. and Lewis, T. (1994). Outliers in Statistical Data, 3rd edition, John Wiley & Sons.
Borah, B. and Bhattacharyya, D. K. (2004). An improved sampling-based DBSCAN for Large Spatial Databases, Proceedings of International Conference on Intelligent Sensing and Information Processing, 92-96, Chennai, India.
Cherednichenko, S. (2005). Outlier Detection in Clustering, University of Joensuu Department of Computer Science, Master Thesis.
Computer Emergency Response Team/Coordination Center, http://www.cert.org/stats/cert_stats.html#incidents
Daszykowski, M., Walczak, B., and Massart, D. L. (2001). Looking for natural patterns in data part 1. density-based approach, Chemometrics and Intelligent Laboratory Systems, 56, 83-92.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B (Methodological), 39(1), 1-38.
Everitt, B. S. (1993). Cluster analysis, John Wiley & Sons, New York.
Guha, S., Rastogi, R., and Shim, K. (1998). CURE: an efficient clustering algorithm for large databases, Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, 73-84, Seattle, Washington, USA.
Halkidi, M., Batistakis, Y., and Vazirgiannis, M. (2001). Clustering algorithms and validity measures, Proceedings of the Thirteenth International Conference on Scientific and Statistical Database Management, 3-22, Edinburgh, Scotland.
Hautamäki, V., Kärkkäinen, I., and Fränti, P. (2004). Outlier Detection Using k-Nearest Neighbor Graph, Proceedings of the International Conference on Pattern Recognition, 3, 430-433, Cambridge, UK.
Hawkins, D.M. (1980). Identification of Outliers. Chapman and Hall.
Jiang, S., Song, X., Wang, H., Han, J.-J., and Li, Q.-H. (2006). A clustering-based method for unsupervised intrusion detections, Pattern Recognition Letters, 27, 802-810.
Jin, W., Tung, A., and Han, J. (2001). Mining top-n local outliers in large databases, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 293-298, Santa Barbara, California, USA.
Kantardzic, M. (2003). Data Mining: Concepts, Models, Methods, and Algorithms, Wiley-Interscience.
Kaufman, L. and Rousseeuw, P. J. (1990). Finding groups in data: an introduction to cluster analysis, John Wiley & Sons.
Knorr, E., Ng, R., and Zamar, R. (2001). Robust space transformation for distance-based operations, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 126-135, Santa Barbara, California, USA.
Knorr, E. and Ng, R. (1998). Algorithms for mining distance-based outliers in large datasets, Proceedings of the 24th VLDB Conference, 392-403, New York, USA.
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, 281-297, Berkeley, California, USA.
Novikov, D., Yampolskiy, R.V., and Reznik, L. (2006). Anomaly detection based intrusion detection, Proceedings of the Third International Conference on Information Technology: New Generations (ITNG’06), 420-425, Las Vegas, Nevada, USA.
Paquet, E. (2004). Exploring anthropometric data through cluster analysis, Published in Digital Human Modeling for Design and Engineering, Seattle, Washington, USA.
Ramaswamy, S., Rastogi, R., and Shim, K. (2000). Efficient algorithms for mining outliers from large data sets, Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 29(2), 427-438, Dallas, Texas, USA.
Tung, A., Hou, J., and Han, J. (2001). Spatial clustering in the presence of obstacles, Proceedings of the 17th International Conference on Data Engineering, 359-367, Heidelberg, Germany.
Wang, W., Yang, J., and Muntz, R. (1997). STING: a statistical information grid approach to spatial data mining, Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB), 186-195, Athens, Greece.
Williams, G., Baxter, R., He, H., Hawkins, S., and Gu, L. (2002). A comparative study of RNN for outlier detection in data mining, Proceedings of the 2nd IEEE International Conference on Data Mining, 709-712, Maebashi TERRSA, Maebashi City, Japan.
Xu, X., Ester, M., Kriegel, H.-P., and Sander, J. (1998). A distribution-based clustering algorithm for mining in large spatial databases, Proceedings of the 14th International Conference on Data Engineering, 324-331, Orlando, Florida, USA.
Yamanishi, K. and Takeuchi, J. (2001). Discovering outlier filtering rules from unlabeled data: combining a supervised learner with an unsupervised learner, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 389-394, Santa Barbara, California, USA.