
Student: Ou, Zhi-Xing (歐智行)
Thesis Title: Improving Hierarchical Tree-Based Packet Classification by Reinforcement Learning
Advisor: Chang, Yeim-Kuan (張燕光)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2023
Academic Year of Graduation: 111
Language: English
Number of Pages: 68
Keywords: Reinforcement Learning, Packet Classification

    In recent years, with advances in hardware technology, artificial intelligence (AI) has developed rapidly, and researchers have applied AI techniques across many fields. One well-known and important technique is Reinforcement Learning (RL), a deep learning method in which an agent learns on its own in an environment that provides rewards, with the goal of maximizing long-term reward.
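    The learning loop in that definition can be made concrete with a toy example that is independent of the thesis: an epsilon-greedy agent on a two-armed bandit estimates the value of each action purely from sampled rewards and gradually prefers the better one. All names and numbers below are illustrative assumptions, not anything from the thesis.

```python
import random

def train(true_means, steps=5000, eps=0.1, rng=random):
    """Epsilon-greedy value estimation on a toy bandit (illustrative only)."""
    q = [0.0] * len(true_means)   # estimated long-term reward of each action
    n = [0] * len(true_means)     # times each action was tried
    for _ in range(steps):
        # Explore with probability eps, otherwise pick the best-looking action.
        if rng.random() < eps:
            a = rng.randrange(len(q))
        else:
            a = max(range(len(q)), key=q.__getitem__)
        r = rng.gauss(true_means[a], 1.0)   # reward signal from the environment
        n[a] += 1
        q[a] += (r - q[a]) / n[a]           # incremental mean update
    return q

q = train([0.2, 1.0], rng=random.Random(1))  # the agent learns arm 1 pays more
```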
    In this thesis, we propose a model based on the RL framework. Given the rule set used for packet classification, the model is trained to produce a decision tree tailored to that rule set, one that exploits the rule set's characteristics to achieve excellent performance in search time, memory usage, and rule update time. We enlarge the action space so that the tree is more flexible: at each internal node, the policy can choose either a grouping or a cutting operation. We also refine the reward function so that it reflects the rule update speed of each trained tree and can serve as a training metric.
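    As a rough, hypothetical sketch of such an enlarged action space (not the thesis implementation): each internal node either cuts one dimension at the midpoint of its span or groups its rules by a separability criterion. A random policy stands in for the trained RL agent, and the grouping test and rule encoding are made up for illustration.

```python
import random

FULL = (0, 2**32 - 1)  # a fully wildcarded field

def expand(rules, depth=0, leaf_size=4, max_depth=8, rng=random):
    """Build a classification tree; each rule is a tuple of (lo, hi) ranges."""
    if len(rules) <= leaf_size or depth >= max_depth:
        return {"leaf": rules}
    action = rng.choice(["cut", "group"])  # stand-in for the learned policy
    if action == "group":
        # Hypothetical grouping criterion: separate rules that wildcard
        # dimension 0 from those that constrain it.
        wide = [r for r in rules if r[0] == FULL]
        narrow = [r for r in rules if r[0] != FULL]
        if wide and narrow:
            return {"group": [expand(wide, depth + 1, leaf_size, max_depth, rng),
                              expand(narrow, depth + 1, leaf_size, max_depth, rng)]}
        action = "cut"  # grouping separates nothing here; cut instead
    # Cut: split a random dimension at the midpoint of its span; a rule is
    # replicated into every child whose range it overlaps.
    d = rng.randrange(len(rules[0]))
    lo = min(r[d][0] for r in rules)
    hi = max(r[d][1] for r in rules)
    mid = (lo + hi) // 2
    left = [r for r in rules if r[d][0] <= mid]
    right = [r for r in rules if r[d][1] > mid]
    if len(left) == len(rules) and len(right) == len(rules):
        return {"leaf": rules}  # every rule spans the cut; no progress
    return {"cut": (d, mid),
            "children": [expand(left, depth + 1, leaf_size, max_depth, rng),
                         expand(right, depth + 1, leaf_size, max_depth, rng)]}
```

    A search would walk from the root, following the child whose range covers the packet's header values (and all branches of a group node), then linearly scan the rules in the reached leaves.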
    In addition to estimating search time by memory access counts, we experimentally measure the actual clock-cycle time required for a search, and we find a strong positive correlation, between 0.88 and 0.95, between clock-cycle time and the maximum memory access count. We compare the proposed method with NeuroCuts and several other approaches, including HiCuts and HyperCuts, and analyze how different types of rule sets affect the results. The experiments show that, relative to NeuroCuts, our method reduces search time by 9% on average and by up to 26.5%, reduces rule update time by 22% on average and by up to 102.5%, and uses 4% fewer bits per rule on average, with a maximum reduction of 39.9%.
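    The 0.88 to 0.95 figure is a Pearson correlation coefficient between per-tree clock-cycle time and maximum memory access count. A minimal sketch of that computation follows; the measurements are made-up illustrative values, not the thesis data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up illustrative measurements (NOT the thesis data): one entry per tree.
max_accesses = [12, 15, 9, 20, 17, 11]         # max memory accesses on a search path
cycle_times  = [340, 420, 260, 560, 480, 310]  # measured clock cycles
r = pearson(max_accesses, cycle_times)         # strongly positive for these data
```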

    Table of Contents

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Introduction
        1.2 Organization of the Thesis
    Chapter 2 Background
        2.1 Packet Classification
        2.2 Decision Tree
            2.2.1 Node Cutting
            2.2.2 Rule Grouping
        2.3 Reinforcement Learning
        2.4 ClassBench
    Chapter 3 Related Work
        3.1 Decision Tree Schemes
            3.1.1 Hierarchical Intelligent Cuttings (HiCuts)
            3.1.2 HyperCuts
        3.2 Grouping-Based Schemes
            3.2.1 Efficuts
            3.2.2 CutSplit
        3.3 Reinforcement Learning Schemes
            3.3.1 Neurotrie
            3.3.2 NeuroCuts
            3.3.3 Others
    Chapter 4 Proposed Scheme
        4.1 Overview
        4.2 Build Decision Tree with Reinforcement Learning
            4.2.1 Why Reinforcement Learning?
            4.2.2 Branching Decision Process of Decision Tree
            4.2.3 Design of Action Space
            4.2.4 Design of Reward Function
            4.2.5 Reward Function Example
        4.3 Internal Node Grouping
            4.3.1 Grouping Algorithm
            4.3.2 Optimization with Efficut
            4.3.3 Model Comparison with NeuroCuts
        4.4 Action Distribution
        4.5 Model Training
        4.6 Cycle Time Evaluation
    Chapter 5 Experimental Results
        5.1 Experimental Environment
        5.2 Rule Set Analysis
        5.3 Search Time
        5.4 Memory Consumption
        5.5 Update Time
        5.6 Tree Diversity
        5.7 Tree Architecture
        5.8 Cycle Time
        5.9 Parallel Processing Search Time Simulation
    Chapter 6 Future Work
        6.1 Elastic Root Node Grouping
        6.2 Reward Function Coefficient Normalization
        6.3 Optimization by Recursive Endpoint-Cutting Scheme
        6.4 Tree Update
    Chapter 7 Conclusion
    References


    Full-text availability: on campus 2028-08-23; off campus 2028-08-23.
    The electronic thesis has not yet been authorized for public release; for the printed copy, please consult the library catalog.