
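Graduate student: Huang, Cheng-Pin (黃程斌)
Thesis title: Clustering-based Techniques for Anomaly Detection in Intrusion Detection System: a Comparative Study (入侵偵測系統中基於群集演算法之異常偵測技術評比)
Advisor: Tseng, Vincent Shin-Mu (曾新穆)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Number of pages: 87
Keywords (translated from Chinese): clustering algorithms, cluster labeling techniques, data mining, intrusion detection system, anomaly detection
Keywords (English): labeling techniques, data mining, intrusion detection, anomaly detection, clustering techniques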
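     Computer attack techniques are advancing rapidly. Misuse detection, as deployed in intrusion detection systems in recent years, is no longer adequate against constantly evolving attack techniques and must be complemented by anomaly detection in order to detect unknown attack patterns. Previous studies have applied clustering algorithms to anomaly detection, but they overlooked the fact that the resulting clusters must still be labeled as normal or anomalous. This study is a comparative evaluation of clustering-based anomaly detection techniques in intrusion detection systems. It analyzes and discusses in depth the key stages in the design of an anomaly detection technique: data preprocessing, combinations of different numbers of features, dissimilarity measurement, clustering algorithms, and cluster labeling techniques. The aim is to provide useful reference information for anomaly detection.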

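     To validate the factors compared in this study, we conducted a series of experiments evaluating how different combinations of feature counts, dissimilarity measures, clustering algorithms, and cluster labeling techniques affect anomaly detection in terms of cluster quality, detection rate, and false alarm rate. Based on observation and analysis of the experimental results, this study provides useful reference information on applying clustering algorithms to anomaly detection in intrusion detection systems.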

     As computer attack techniques advance and diversify, the misuse and anomaly detection methods employed in intrusion detection systems, which are limited by nature, can no longer keep up with the latest intrusions. Research to date has focused mainly on evaluating clustering-based techniques for anomaly detection, while cluster labeling techniques have received less study.

     In this research, we compare different clustering-based techniques for anomaly detection in intrusion detection systems, covering four design factors: attribute combination, dissimilarity measurement, clustering technique, and labeling technique. A series of experiments was performed to evaluate the impact of each design factor on cluster quality, detection rate, and false alarm rate. The outcome of this comparative study is expected to offer guidelines for the design of anomaly detection systems.
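     As a rough illustration of the two evaluation measures named in the abstract, the Python sketch below labels clusters by relative size and then computes the detection rate and false alarm rate. The size rule, the 0.2 threshold, the toy data, and the function names are assumptions made for illustration only. They are not taken from the thesis, whose own labeling methods may be defined differently.

```python
from collections import Counter


def label_by_size(cluster_ids, min_fraction=0.2):
    """Label each point by the relative size of its cluster.

    Assumed rule (not from the thesis): a cluster holding fewer than
    `min_fraction` of all points is treated as anomalous.
    """
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    return [
        "anomaly" if counts[c] / total < min_fraction else "normal"
        for c in cluster_ids
    ]


def detection_and_false_alarm_rate(predicted, actual):
    """Return (detection rate, false alarm rate).

    detection rate   = attacks flagged as anomaly / all attacks
    false alarm rate = normal records flagged as anomaly / all normal records
    """
    attacks = sum(1 for a in actual if a == "attack")
    normals = len(actual) - attacks
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if p == "anomaly" and a == "attack"
    )
    false_alarms = sum(
        1 for p, a in zip(predicted, actual)
        if p == "anomaly" and a == "normal"
    )
    detection_rate = hits / attacks if attacks else 0.0
    false_alarm_rate = false_alarms / normals if normals else 0.0
    return detection_rate, false_alarm_rate


# Toy data: one large cluster (id 0) and two singleton clusters (ids 1, 2).
cluster_ids = [0] * 8 + [1] + [2]
actual = ["normal"] * 7 + ["attack", "attack", "normal"]

predicted = label_by_size(cluster_ids)
print(detection_and_false_alarm_rate(predicted, actual))
```

     In this toy data one attack hides inside the large cluster and one normal record sits alone in a small cluster, so a purely size-based rule both misses an attack and raises a false alarm. This is the kind of trade-off a comparison of labeling techniques would examine.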

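    Table of Contents
    English Abstract
    Chinese Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Background
      1.2 Intrusion Detection Systems
      1.3 Research Motivation
      1.4 Research Objectives
      1.5 Thesis Organization
    Chapter 2 Related Work
      2.1 Cluster Analysis
        2.1.1 Data Types and Their Similarity Measures
          2.1.1.1 Interval-Scaled Variables
          2.1.1.2 Binary Variables
          2.1.1.3 Nominal Variables
          2.1.1.4 Ordinal Variables
          2.1.1.5 Ratio-Scaled Variables
          2.1.1.6 Variables of Mixed Types
        2.1.2 Clustering Methods
          2.1.2.1 Hierarchical Clustering Methods
          2.1.2.2 Partitioning Clustering Methods
          2.1.2.3 Density-Based Clustering Methods
          2.1.2.4 Grid-Based Clustering Methods
          2.1.2.5 Model-Based Clustering Methods
          2.1.2.6 Outlier Analysis
        2.1.3 Validation Techniques
      2.2 Related Work on Anomaly Detection Methods
        2.2.1 Statistics-Based Methods
        2.2.2 Distance-Based Methods
        2.2.3 Rule-Based Methods
        2.2.4 Feature-Based Methods
        2.2.5 Model-Based Methods
    Chapter 3 Clustering and Labeling Methods
      3.1 Clustering Algorithms
        3.1.1 SMART-CAST
        3.1.2 Clustering Algorithm Based on Randomized Search (CLARANS)
      3.2 Cluster Labeling Methods
        3.2.1 LabelBySize (LBS)
        3.2.2 LabelByNormalSize (LBNS)
        3.2.3 LabelByDistance (LBD)
        3.2.4 LabelByDist2Largest (LBD2L)
        3.2.5 LabelByDist2Farthest (LBD2F)
        3.2.6 LabelBySizeAndDistance (LBSAD)
    Chapter 4 Experimental Results and Analysis
      4.1 Analysis of the Experimental Data
      4.2 Data Preprocessing
        4.2.1 Data Transformation
        4.2.2 Dissimilarity Measurement
      4.3 Analysis of Experimental Results
        4.3.1 Comparison of Clustering Quality and Anomaly Detection Performance across Clustering Algorithms
        4.3.2 Comparison of CLARANS Clustering Results under Different Numbers of Features
        4.3.3 Comparison of Labeling Methods with CLARANS under Different Numbers of Features
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Research Directions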

    Full text availability: on campus, public from 2006-09-05; off campus, public from 2006-09-05