| 研究生: |
謝宗諭 Hsieh, Tsung-Yu |
|---|---|
| 論文名稱: |
基於一維範圍到二維座標轉換與直接二維座標採樣的高效封包分類學習索引結構 A Learned Index Structure for Efficient Packet Classification based on 1D Range to 2D Coordinate Conversion and Direct 2D Coordinate Sampling |
| 指導教授: |
張燕光
Chang, Yeim-Kuan |
| 學位類別: |
碩士 Master |
| 系所名稱: |
電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| 論文出版年: | 2025 |
| 畢業學年度: | 113 |
| 語文別: | 英文 |
| 論文頁數: | 68 |
| 中文關鍵詞: | 封包分類 、遞迴模型索引 、k-means分群 、z曲線 |
| 外文關鍵詞: | Packet Classification, Recursive Model Index (RMI), K-Means Clustering, Z-Curve |
| 相關次數: | 點閱:9 下載:0 |
| 分享至: |
| 查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
隨著網路流量持續增長,路由器必須快速且以極少的記憶體來分類封包。然而,當規則集包含大量的 IP 範圍條件且需要頻繁更新時,傳統的樹狀或雜湊(hash)方法便顯得力不從心。本論文旨在設計一種用於封包分類的多維學習索引。首先,為避免學習範圍規則特徵所使用的大量重新取樣,我們將每個一維的範圍規則映射到二維空間中的一個點,這使得範圍匹配可以被視為一般的點查詢。對於二維轉換的點我們有兩種處理方式,第一種是利用 K-means 演算法對這些點進行聚類,並為每個聚類選擇一個參考點。第二種是透過 Z-order (Morton) 曲線將二維轉換轉換為一維鍵值,並訓練一個輕量級的遞迴模型索引(Recursive Model Index, RMI)。這種「範圍轉二維」加上「聚類 或 Z 曲線」的雙重策略,不僅減少了訓練樣本的數量,還能讓相近的規則在鍵值空間中保持接近,從而加快搜尋速度。由於每個聚類的邊界都已被儲存,因此只需透過追加記錄而非重新訓練整個模型,即可套用小規模的規則更新。我們在十二個 ClassBench 規則集(每個包含 10 萬條規則,其中 5 個 ACL、5 個 FW、2 個 IPC)上評估了此方案。與Nuevomatch學習索引基準線相比,我們的方法平均查詢吞吐量提升了約 10%,同時將訓練資料量減少了 21%。這些結果表明,所提出的學習索引是一種實用且可擴展的解決方案,適用於高速、頻繁更新的封包分類器。
As Internet traffic continues to grow, routers must classify packets quickly and with minimal memory usage. However, when rule sets contain a large number of IP range conditions and require frequent updates, traditional tree-based or hash-based methods become inefficient. This thesis proposes the design of a multidimensional learned index for packet classification.
To avoid the large amount of resampling required for learning range rules, we map each one-dimensional range rule to a point in two-dimensional space, which allows range matching to be treated as a general point lookup. For the transformed 2D points, we adopt two approaches. The first uses the K-means algorithm to cluster the points and selects a reference point for each cluster. The second maps the 2D points into one-dimensional keys using a Z-order (Morton) curve and then trains a lightweight Recursive Model Index (RMI). This dual strategy—“range-to-2D mapping” combined with “clustering or Z-ordering”—not only reduces the number of training samples but also preserves the proximity of similar rules in the key space, thereby accelerating lookup speed.
Because the boundaries of each cluster are stored, small-scale rule updates can be applied by simply appending records instead of retraining the entire model. We evaluate our scheme on twelve ClassBench rule sets (each containing 100,000 rules: 5 ACL, 5 FW, and 2 IPC). Compared with the baseline learned index NuevoMatch, our method improves average lookup throughput by about 10% while reducing the amount of training data by 21%. These results demonstrate that the proposed learned index is a practical and scalable solution for high-speed, frequently updated packet classifiers.
[1] Alon Rashelbach, Ori Rottenstreich, and Mark Silberstein. 2020. A Computational Approach to Packet Classification. In Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication (SIGCOMM '20). Association for Computing Machinery, New York, NY, USA, 542–556.Gupta, P., & McKeown, N. (1999, August). Packet classification using hierarchical intelligent cuttings. In Hot Interconnects VII (Vol. 40).
[2] Singh, S., Baboescu, F., Varghese, G., & Wang, J. (2003, August). Packet classification using multidimensional cutting. In Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications (pp. 213-224).
[3] Vamanan, B., Voskuilen, G., & Vijaykumar, T. N. (2010). EffiCuts: Optimizing packet classification for memory and throughput. ACM SIGCOMM Computer Communication Review, 40(4), 207-218.
[4] X. Zhang, G. Xie, X. Wang, P. Zhang, Y. Li and K. Salamatian, "Fast Online Packet Classification With Convolutional Neural Network," in IEEE/ACM Transactions on Networking, vol. 29, no. 6, pp. 2765-2778, Dec. 2021, doi: 10.1109/TNET.2021.3100114
[5] Srinivasan, V., Suri, S., & Varghese, G. (1999, August). Packet classification using tuple space search. In Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication (pp. 135-146).
[6] Li, W., Li, X., Li, H., & Xie, G. (2018, April). Cutsplit: A decision-tree combining cutting and splitting for scalable packet classification. In IEEE INFOCOM 2018-IEEE Conference on Computer Communications (pp. 2645-2653). IEEE.
[7] Lin, Y. H., Shih, W. C., & Chang, Y. K. (2022). Efficient hierarchical hash tree for OpenFlow packet classification with fast updates on GPUs. Journal of Parallel and Distributed Computing, 167, 136-147.
[8] Liang, E., Zhu, H., Jin, X., & Stoica, I. (2019). Neural packet classification. In Proceedings of the ACM Special Interest Group on Data Communication (pp. 256-269).
[9] Li, W., Yang, T., Rottenstreich, O., Li, X., Xie, G., Li, H., ... & Lin, H. (2020). Tuple space assisted packet classification with high performance on both search and update. IEEE Journal on Selected Areas in Communications, 38(7), 1555-1569.
[10] Pfaff, B., Pettit, J., Koponen, T., Jackson, E., Zhou, A., Rajahalme, J., ... & Casado, M. (2015). The design and implementation of open {vSwitch}. In 12th USENIX symposium on networked systems design and implementation (NSDI 15) (pp. 117-130).
[11] Yingchareonthawornchai, S., Daly, J., Liu, A. X., & Torng, E. (2018). A sorted-partitioning approach to fast and scalable dynamic packet classification. IEEE/ACM Transactions on Networking, 26(4), 1907-1920.
[12] Li, W., Yang, T., Chang, Y. K., Li, T., & Li, H. (2019, September). TabTree: A TSS-assisted bitselecting tree scheme for packet classification with balanced rule mapping. In 2019 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS) (pp. 1-8). IEEE.
[13] " OpenFlow 1.0," [Online]. Available: https://opennetworking.org/wp-content/uploads/2013/04/openflow-spec-v1.0.0.pdf. [Accessed 19 July 2023].
[14] Taylor, D. E., & Turner, J. S. (2007). Classbench: A packet classification benchmark. IEEE/ACM transactions on networking, 15(3), 499-511.
[15] Haibin Lu and Sartaj Sahni, "O(log n) dynamic router-tables for prefixes and ranges," in IEEE Transactions on Computers, vol. 53, no. 10, pp. 1217-1230, Oct. 2004, doi: 10.1109/TC.2004.81. keywords: {Complexity theory;Tree data structures;Table lookup;Communication system routing;Internet;Packet switching;Index Terms- Packet routing;dynamic router-tables;longest-prefix matching;most-specific-range matching;conflict-free ranges.}.
[16] Jamil, H., & Weng, N. (2020, May). Multibit tries packet classification with deep reinforcement learning. In 2020 IEEE 21st International Conference on High Performance Switching and Routing (HPSR) (pp. 1-6). IEEE.
[17] Etessami, K., & Yannakakis, M. (2015). Recursive Markov decision processes and recursive stochastic games. Journal of the ACM (JACM), 62(2), 1-69.
[18] Li, Pengfei, et al. "LISA: A learned index structure for spatial data." Proceedings of the 2020 ACM SIGMOD international conference on management of data. 2020.
[19] Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018.The Case for Learned Index Structures. In ACM SIGMOD.
[20] Yifan Dai, Yien Xu, Aishwarya Ganesan, Ramnatthan Alagappan, Brian Kroth, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. From Wisckey to Bourbon: A learned index for log-structured merge trees. In USENIX OSDI, 2020.
[21] Alon Rashelbach, Ori Rottenstreich, Mark Silberstein:Scaling Open vSwitch with a Computational Cache. NSDI 2022: 1359-1374
[22] Pagiamtzis, K., & Sheikholeslami, A. (2006). Content-addressable memory (CAM) circuits and architectures: A tutorial and survey. IEEE journal of solid-state circuits, 41(3), 712-727
[23] Yu, F., Katz, R. H., & Lakshman, T. V. (2005). Efficient multimatch packet classification and lookup with TCAM. IEEE Micro, 25(1), 50-59.
[24] Cheng, Y. C., & Wang, P. C. (2015). Scalable multi-match packet classification using TCAM and SRAM. IEEE Transactions on Computers, 65(7), 2257-2269.
[25] Chang, Y. K., Su, C. C., Lin, Y. C., & Hsieh, S. Y. (2012).Efficient gray-code-based range encoding schemes for packet classification in TCAM. IEEE/ACM Transactions on Networking, 21(4), 1201-1214
[26] Jianzhong Qi, Guanli Liu, Christian S. Jensen, and Lars Kulik. 2020. Effectively learning spatial indices. Proc. VLDB Endow. 13, 12 (August 2020), 2341–2354.
[27] Sati S, Jones P, Kim HS, Zhou LA, Rapp-Reyes E, Leung TH. HiCuT: An efficient and low input method to identify protein-directed chromatin interactions. PLoS Genet. 2022 Mar 23;18(3):e1010121. doi: 10.1371/journal.pgen.1010121. PMID: 35320278; PMCID: PMC8979432.
[28] P. Gupta and N. McKeown, “Packet Classification on Multiple Fields,” in ACM Sigcomm, August 1999.
[29] Daly, J., & Torng, E. (2017, July). Tuplemerge: Building online packet classifiers by omitting bits. In 2017 26th International Conference on Computer Communication and Networks (ICCCN) (pp. 1-10). IEEE.
[30] Gao, Jianyang, and Cheng Long. “High-Dimensional Approximate Nearest Neighbor Search: With Reliable and Efficient Distance Comparison Operations.” Proceedings of the ACM on Management of Data, vol. 1, no. 2, June 2023, pp. 1–27. Crossref, https://doi.org/10.1145/3589282.
[31] Chen, H., Yang, Y., Xu, M., Zhang, Y., & Liu, C. (2022, July). Neurotrie: Deep Reinforcement Learning-based Fast Software IPv6 Lookup. In 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS) (pp. 917-927). IEEE.
[32] K. P. Sinaga and M. -S. Yang, "Unsupervised K-Means Clustering Algorithm," in IEEEAccess,vol.8,pp.80716-80727,2020,doi:0.1109/ACCESS.2020.2988796.
[33] Meng Li, Huayi Chai, Siqiang Luo, Haipeng Dai, Rong Gu, Jiaqi Zheng, and Guihai Chen. 2025. VEGA: An Active-tuning Learned Index with Group-Wise Learning Granularity. Proc. ACM Manag. Data 3, 1, Article 86 (February 2025), 26 pages. https://doi.org/10.1145/3709736
[34] Davitkova, A., Milchevski, E., & Michel, S. (2020). The ML-Index: A Multidimensional, Learned Index for Point, Range, and Nearest-Neighbor Queries. International Conference on Extending Database Technology.
[35] Wamane, S., & Agarkar, B.S. (2025). Comprehensive Study of Implementation Strategies and Simulation Tools for Packet Classification. 2025 Global Conference in Emerging Technology (GINOTECH), 1-8.
校內:2030-08-25公開