
Author: Tsai, Yu-Chueh (蔡有爵)
Thesis title: Reducing TCAM Memory Consumption of Programmable Switches for Decision Tree-based Network Intrusion Detection Scheme
Advisor: Chang, Yeim-Kuan (張燕光)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: English
Number of pages: 72
Keywords: Software-Defined Networking, P4, TNA, Programmable Switch, Distributed Denial of Service (DDoS) Attacks, Machine Learning, Decision Tree, Range Encoding, PPC
    To counter DDoS attacks effectively, this thesis studies decision tree classification for packet processing. A decision tree is a popular and efficient machine learning algorithm that classifies the features extracted from a packet quickly enough to flag potential DDoS traffic. Compared with traditional methods for detecting attack traffic, decision tree classification consumes less memory and improves detection and processing efficiency, which makes it better suited to real-time use.
    We train the model on the UNSW-NB15 dataset, which serves as our source of DDoS attack samples. The dataset contains several types of DDoS attacks, and this diversity and representativeness improve the model's ability to generalize. Training the decision tree to recognize the features of different DDoS attacks makes detection more accurate and lowers the false positive rate.
    Finally, we implement the trained model on a programmable switch, which allows flexible control and protection of packet flows at the network level. Programmable switches offer highly customizable packet-processing pipelines, so network administrators can configure them for different requirements while improving network performance and security.
    In this thesis, we write the P4 programs for Intel's Tofino Native Architecture (TNA) and use P4 Studio SDE to compile them and simulate their behavior. We apply two methods for converting the model to the Tofino switch, and we propose a new storage architecture that reduces the memory consumption of existing methods. Our experimental results show that the proposed method uses less memory than the methods proposed in other papers.
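    The abstract's concern with TCAM memory can be illustrated with a minimal Python sketch. It is not the thesis's encoding scheme: it only shows the standard prefix expansion of an integer range, the baseline that range-encoding schemes try to improve on. A decision tree node compares a feature with a threshold, so each root-to-leaf path constrains features to ranges, and a ternary table can match such a range only as a set of prefixes. The function name `range_to_prefixes` and the 4-bit field width are assumptions chosen for illustration.

```python
def range_to_prefixes(lo: int, hi: int, width: int) -> list[str]:
    """Split the inclusive range [lo, hi] of a width-bit field into ternary prefixes.

    Hypothetical helper for illustration only; '*' marks a wildcard bit.
    """
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo.
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it fits in the remaining part of the range.
        while size > hi - lo + 1:
            size >>= 1
        bits = size.bit_length() - 1  # number of wildcard bits in this prefix
        fixed = format(lo >> bits, "b").zfill(width - bits) if width > bits else ""
        prefixes.append(fixed + "*" * bits)
        lo += size
    return prefixes


# A threshold test such as "feature <= 5" on a 4-bit field needs two entries.
print(range_to_prefixes(0, 5, 4))   # ['00**', '010*']
# The range [1, 14] is a worst case for 4 bits and needs six entries.
print(range_to_prefixes(1, 14, 4))  # ['0001', '001*', '01**', '10**', '110*', '1110']
```

    In general a single range on a w-bit field can expand into as many as 2w - 2 prefixes, and a rule that constrains several features multiplies these counts, which is why the number of ternary entries becomes the limiting resource.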

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    1 Introduction
        1.1 Introduction
        1.2 Organization of the Thesis
    2 Background
        2.1 Software-Defined Networking (SDN)
        2.2 Programming Protocol-independent Packet Processors (P4)
            2.2.1 Tofino Native Architecture (TNA)
            2.2.2 P4 Studio SDE
        2.3 Distributed Denial of Service (DDoS)
        2.4 DDoS Dataset
            2.4.1 UNSW-NB15 [7]
            2.4.2 KDD99
        2.5 Feature Importance
        2.6 Machine Learning
            2.6.1 Decision Tree
            2.6.2 Random Forest
        2.7 Range Encoding
            2.7.1 Buddy Code
            2.7.2 Binary Reflected Gray Code (BRGC)
            2.7.3 Compress continuous codes into ternary strings
            2.7.4 Parallel Packet Classification (PPC)
        2.8 Binary Range Search [35]
    3 Related Work
        3.1 SwitchTree [28]
        3.2 pForest [29]
        3.3 IIsy [30]
        3.4 Planter [31], [32]
    4 Proposed Scheme
        4.1 Overview
        4.2 Features Selection
        4.3 Dataset Preprocessing
        4.4 Training Model
        4.5 Convert Model to Match-Action rules
            4.5.1 Direct Mapping
            4.5.2 Encoding Method
        4.6 Storage structure for mapping ranges to codes
            4.6.1 Matching method in Match-Action table
            4.6.2 Endpoint array
        4.7 Decision Tree model on P4 switch
            4.7.1 Forwarding
            4.7.2 Check Direction
            4.7.3 Hashing Index
            4.7.4 Features Extraction
            4.7.5 Classify
            4.7.6 Processing Attack
    5 Experimental Results
        5.1 Experimental Environment
        5.2 Experimental Results
            5.2.1 Accuracy of Decision Tree and Random Forest
            5.2.3 Number of Forwarding Rules
            5.2.4 Memory Consumption of Forwarding Rules
    6 Conclusion
    References

    [1] “Open-Tofino/PUBLIC_Tofino-Native-Arch.pdf at master · barefootnetworks/Open-Tofino.” https://github.com/barefootnetworks/OpenTofino/blob/master/PUBLIC_Tofino-Native-Arch.pdf (accessed Aug. 13, 2022).
    [2] Open Networking Foundation. (2021). Software-Defined Networking (SDN). Retrieved from https://www.opennetworking.org/sdn-definition/
    [3] McKeown, N., Anderson, T., Balakrishnan, H., Parulkar, G., Peterson, L., Rexford, J., . . . Turner, J. (2008). OpenFlow: Enabling Innovation in Campus Networks. ACM SIGCOMM Computer Communication Review, 38(2), 69-74. doi: 10.1145/1355734.1355746
    [4] P. Bosshart et al., “P4: Programming protocol-independent packet processors,” Computer Communication Review, vol. 44, no. 3, pp. 87–95, 2014, doi: 10.1145/2656877.2656890.
    [5] “Barefoot Networks may have built the world’s fastest networking switch chip | Computerworld.” https://www.computerworld.com/article/3083761/barefoot-networks-may-have-built-the-worlds-fastest-networking-switch-chip.html (accessed Sep. 10, 2022).
    [6] “Drive Programmable Networking Innovation with Intel® P4 Studio.” https://www.intel.com/content/www/us/en/products/network-io/programmable-ethernet-switch/p4-suite/p4-studio.html (accessed Aug. 10, 2022).
    [7] “UNSW-NB15.” Available: https://research.unsw.edu.au/projects/unsw-nb15-dataset
    [8] N. Moustafa and J. Slay, “UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set),” 2015 Military Communications and Information Systems Conference, MilCIS 2015 - Proceedings, Dec. 2015, doi: 10.1109/MILCIS.2015.7348942.
    [9] N. Moustafa and J. Slay, “The evaluation of Network Anomaly Detection Systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set,” Information Security Journal: A Global Perspective, vol. 25, no. 1–3, pp. 18–31, Apr. 2016, doi: 10.1080/19393555.2015.1125974.
    [10] N. Moustafa, J. Slay, and G. Creech, “Novel Geometric Area Analysis Technique for Anomaly Detection Using Trapezoidal Area Estimation on Large-Scale Networks,” IEEE Trans Big Data, vol. 5, no. 4, pp. 481–494, Jun. 2017, doi: 10.1109/TBDATA.2017.2715166.
    [11] N. Moustafa, G. Creech, and J. Slay, “Big Data Analytics for Intrusion Detection System: Statistical Decision-Making Using Finite Dirichlet Mixture Models,” pp. 127–156, 2017, doi: 10.1007/978-3-319-59439-2_5.
    [12] “PerfectStorm | Keysight.” https://www.keysight.com/tw/zh/products/network-test/network-test-hardware/perfectstorm.html (accessed Aug. 03, 2022).
    [13] “openargus - Home.” https://openargus.org/ (accessed Aug. 03, 2022).
    [14] “The Zeek Network Security Monitor.” https://zeek.org/ (accessed Aug. 03, 2022).
    [15] “Kddcup1999,” April 2015. [Online]. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
    [16] “Nslkdd,” April 2015. [Online]. Available: https://web.archive.org/web/20150205070216/http://nsl.cs.unb.ca/NSL-KDD/
    [17] J. R. Quinlan. “Induction of decision trees.” Machine Learning, vol. 1, no. 1, pp. 81-106, 1986.
    [18] J. R. Quinlan. “C4.5: Programs for Machine Learning.” Morgan Kaufmann Publishers, 1993.
    [19] L. Breiman, J. Friedman, R. Olshen, and C. Stone. “Classification and Regression Trees.” Chapman & Hall/CRC, 1984.
    [20] L. Breiman. “Random Forests.” Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
    [21] J. H. Friedman. “Greedy Function Approximation: A Gradient Boosting Machine.” The Annals of Statistics, vol. 29, no. 5, pp. 1189-1232, 2001.
    [22] F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, no. 85, pp. 2825–2830, 2011, [Online]. Available: http://jmlr.org/papers/v12/pedregosa11a.html
    [23] Liaw, A., & Wiener, M. (2002). “Classification and regression by randomForest.” R News, 2(3), 18-22.
    [24] Y. K. Chang, C. C. Su, Y. C. Lin, and S. Y. Hsieh, “Efficient gray-code-based range encoding schemes for packet classification in TCAM,” IEEE/ACM Transactions on Networking, vol. 21, no. 4, pp. 1201–1214, 2013, doi: 10.1109/TNET.2012.2220566.
    [25] B. Schieber, D. Geist, and A. Zaks, “Computing the minimum DNF representation of Boolean functions defined by intervals,” Discrete Appl Math (1979), vol. 149, no. 1–3, pp. 154–173, Aug. 2005, doi: 10.1016/J.DAM.2004.08.009.
    [26] A. Bremler-Barr and D. Hendler, “Space-efficient TCAM-based classification using Gray coding,” IEEE Transactions on Computers, vol. 61, no. 1, pp. 18–30, 2012, doi: 10.1109/TC.2010.267.
    [27] J. van Lunteren and T. Engbersen, “Fast and scalable packet classification,” IEEE Journal on Selected Areas in Communications, vol. 21, no. 4, pp. 560–571, May 2003, doi: 10.1109/JSAC.2003.810527.
    [28] J. H. Lee and K. Singh, “SwitchTree: in-network computing and traffic analyses with Random Forests,” Neural Computing and Applications 2020, pp. 1–12, Nov. 2020, doi: 10.1007/S00521-020-05440-2.
    [29] C. Busse-Grawitz, R. Meier, A. Dietmüller, T. Bühler, and L. Vanbever, “pForest: In-Network Inference with Random Forests,” Sep. 2019, doi: 10.48550/arxiv.1909.05680.
    [30] Z. Xiong and N. Zilberman, “Do Switches Dream of Machine Learning? Toward In-Network Classification,” Proceedings of the 18th ACM Workshop on Hot Topics in Networks, 2019, doi: 10.1145/3365609.
    [31] C. Zheng and N. Zilberman, “Planter: Seeding Trees Within Switches,” Proceedings of the SIGCOMM ’21 Poster and Demo Sessions, pp. 12–14, Aug. 2021, doi: 10.1145/3472716.
    [32] C. Zheng et al., “Automating In-Network Machine Learning,” May 2022, doi: 10.48550/arxiv.2205.08824.
    [33] B. Schieber, D. Geist, and A. Zaks, “Computing the minimum DNF representation of Boolean functions defined by intervals,” Discrete Appl Math (1979), vol. 149, no. 1–3, pp. 154–173, Aug. 2005, doi: 10.1016/J.DAM.2004.08.009.
    [34] “Edgecore Networks.” https://www.edge-core.com/tw/index.php (accessed Aug. 14, 2022).
    [35] M. A. Ruiz-Sanchez, E. W. Biersack, and W. Dabbous, “Survey and taxonomy of IP address lookup algorithms,” IEEE Network, vol. 15, no. 2, pp. 8–23, Mar.–Apr. 2001, doi: 10.1109/65.912716.
    [36] S. G. Macías, L. P. Gaspary, and J. F. Botero, “ORACLE: An Architecture for Collaboration of Data and Control Planes to Detect DDoS Attacks,” 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM), Bordeaux, France, 2021, pp. 962–967.

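    Full-text availability: on campus, public from 2028-08-25; off campus, public from 2028-08-25.
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.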