| 研究生: |
邱品勳 Chiu, Pin-Hsun |
|---|---|
| 論文名稱: |
基於 FPGA 的快速類 TCAM Prefix設計與多類型記憶體應用於多維封包分類之實現 Efficient TCAM like Prefix Design and the Implementation using Various Types of Memory on FPGA for Multi-Dimensional Packet Classification |
| 指導教授: |
張燕光
Chang, Yeim-Kuan |
| 學位類別: |
碩士 Master |
| 系所名稱: |
電機資訊學院 - 人工智慧科技碩士學位學程 Graduate Program of Artificial Intelligence |
| 論文出版年: | 2025 |
| 畢業學年度: | 113 |
| 語文別: | 英文 |
| 論文頁數: | 62 |
| 中文關鍵詞: | 封包分類 、TCAM 、FPGA |
| 外文關鍵詞: | Packet classification, TCAM, FPGA |
| 相關次數: | 點閱:51 下載:0 |
| 分享至: |
| 查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
隨著現代網路朝向高吞吐量與複雜服務持續演進,封包分類在實現路由、防火牆及服務品質(QoS)等功能中扮演關鍵角色。然而,傳統硬體面對大規模多欄位規則集合時,在記憶體使用與效能上皆面臨重大挑戰。為此,本論文提出一種以 FPGA 為基礎的架構,結合類 TCAM 前綴比對機制與具記憶體意識的規則分組策略,實現高吞吐量、可擴展且具記憶體效率的封包分類系統。
本研究將規則集依據前綴特性分為多個子集合,並根據子集合的大小與存取模式,選擇最適當的記憶體資源(如 Ultra RAM、Block RAM 或 Distributed RAM)進行儲存。為提升記憶體利用率,我們採用 (n+1) 位元前綴格式,並搭配索引技巧壓縮規則資料,達成顯著的空間節省,同時保有快速查找能力。此外,系統亦實作多階段的管線化比對架構與優先編碼機制,以加速分類流程並處理多重比對的情況。
本架構於 Xilinx Virtex UltraScale+ VU9P FPGA 平台上進行實作與評估。實驗結果顯示,該設計可有效支援超過十萬筆規則,同時相較傳統方法節省超過 50% 的記憶體使用量。在記憶體效率與系統擴展性方面皆優於現有先進方法,展現其於現代 SDN 交換器與高速網路設備中的實用價值。
As modern networks evolve toward higher throughput and more complex services, packet classification becomes a critical component in enabling functions such as routing, firewalls, and QoS. However, supporting large-scale multi-field rule sets imposes significant memory and performance challenges on traditional hardware. This thesis proposes an FPGA-based architecture that combines a TCAM-like prefix matching mechanism with efficient memory-aware rule grouping to enable high-throughput, scalable, and memory-efficient packet classification.
We introduce a subset-based rule partitioning strategy that divides the classification rules according to prefix characteristics and stores them using the most suitable memory type—Ultra RAM, Block RAM, or Distributed RAM—based on each subset’s size and access pattern. To maximize memory utilization, we adopt an (n+1)-bit prefix format and leverage indexing techniques to compress rule storage, significantly reducing memory usage while preserving fast lookup capability. A multi-stage pipelined matching architecture is also implemented to accelerate the classification process, with priority encoding used to resolve multiple matches.
The proposed design is synthesized and evaluated on the Xilinx Virtex UltraScale+ VU9P platform. Experimental results show that our architecture can efficiently support over 100,000 rules while reducing memory consumption by more than 50% compared to conventional methods. Moreover, our design outperforms state-of-the-art solutions in both memory efficiency and scalability, demonstrating its practical value in modern SDN switches and high-speed network devices.
[1] D. E. Taylor, “Survey and Taxonomy of Packet Classification Techniques,” ACM Computing Surveys (CSUR), vol. 37, no. 3, pp. 238–275, Sep. 2005, doi: 10.1145/1108956.1108958.
[2] P. Gupta and N. McKeown, “Classifying Packets with Hierarchical Intelligent Cuttings,” IEEE Micro, vol. 20, no. 1, pp. 34–41, Jan. 2000, doi: 10.1109/40.820051.
[3] S. Singh, F. Baboescu, G. Varghese, and J. Wang, “Packet Classification Using Multidimensional Cutting,” 2003.
[4] P. Gupta and N. McKeown, “Packet Classification on Multiple Fields,” in ACM Sigcomm, August 1999.
[5] V. Srinivasan, S. Suri, and G. Varghese, “Packet Classification Using Tuple Space Search,” ACM SIGCOMM Computer Communication Review, vol. 29, no. 4, pp. 135–146, Aug. 1999, doi: 10.1145/316194.316216.
[6] F. Yu, R. H. Katz, and T. v. Lakshman, “Efficient Multimatch Packet Classification and Lookup with TCAM,” IEEE Micro, vol. 25, no. 1, pp. 50–59, Jan. 2005, doi: 10.1109/MM.2005.8.
[7] K. Lakshminarayanan, A. Rangarajan, and S. Venkatachary, “Algorithms for Advanced Packet Classification with Ternary CAMs.” in Proc. ACMSIGCOMM, 2005, pp. 193-204.
[8] S. Yi, B. K. Kim, J. Oh, J. Jang, G. Kesidis, and C. R. Das, “Memory-efficient Content Filtering Hardware for High-speed Intrusion Detection Systems,” Proceedings of the ACM Symposium on Applied Computing, pp. 264–269, 2007, doi: 10.1145/1244002.1244068.
[9] A. Majumdar et al., “A Massively Parallel, Energy Efficient Programmable Accelerator for Learning and Classification,” ACM Transactions on Architecture and Code Optimization (TACO), vol. 9, no. 1, Mar. 2012, doi: 10.1145/2133382.2133388.
[10] Y. K. Chang and H. C. Chen, “Fast Packet Classification Using Recursive Endpoint-cutting and Bucket Compression on FPGA,” Computer Journal, vol. 62, no. 2, pp. 198–204, Feb. 2019, doi: 10.1093/COMJNL/BXY052.
[11] W. Li, X. Li, H. Li, and G. Xie, “CutSplit: A Decision-Tree Combining Cutting and Splitting for Scalable Packet Classification,” Proceedings - IEEE INFOCOM, vol. 2018-April, pp. 2645–2653, Oct. 2018, doi: 10.1109/INFOCOM.2018.8485947.
[12] Y. Qi, L. Xu, B. Yang, Y. Xue, and J. Li, “Packet Classification Algorithms: From Theory to Practice,” Proceedings - IEEE INFOCOM, pp. 648–656, 2009, doi: 10.1109/INFCOM.2009.5061972.
[13] W. Jiang and V. K. Prasanna, “Field-split Parallel Architecture for High Performance Multi-match Packet Classification Using FPGAs,” Annual ACM Symposium on Parallelism in Algorithms and Architectures, pp. 188–196, 2009, doi: 10.1145/1583991.1584044.
[14] T. Ganegedara and V. K. Prasanna, “StrideBV: Single Chip 400G+ Packet Classification,” 2012 IEEE 13th International Conference on High Performance Switching and Routing, HPSR 2012, pp. 1–6, 2012, doi: 10.1109/HPSR.2012.6260820.
[15] D. E. Taylor and J. S. Turner, “Classbench: A Packet Classification Benchmark.” IEEE/ACM Trans. Networking, vol. 15, no. 3, pp. 499-511, Jun. 2007.
[16] B. Pfaff and et al, “The Design and Implementation of Open vSwitch,” in USENIX NSDI, 2015.
[17] J. Daly and E. Torng, “TupleMerge: Building Online Packet Classifiers by Omitting Bits,” in IEEE ICCCN, 2017.
[18] J. Daly and et al, “Tuplemerge: Fast Software Packet Processing for Online Packet Classification,” IEEE/ACM Transactions on Networking, 2019.
[19] Y. K. Chang and C. S. Hsueh, “Range-Enhanced Packet Classification Design on FPGA,” IEEE Trans Emerg Top Comput, vol. 4, no. 2, pp. 214–224, Apr. 2016, doi: 10.1109/TETC.2015.2449666.
[20] Xilinx, “7 Series FPGAs Overview” Available: http://www.xilinx.com/support/documentation/data_sheets/ds180_7Series_Overview.pdf.
[21] Xilinx, “UltraScale Architecture and Product Data Sheet: Overview” Available: https://www.xilinx.com/support/documentation/data_sheets/ds890-ultrascale-overview.pdf
[22] T. Banerjee-Mishra, S. Sahni, and G. S. Seetharaman, “PC-TRIO: A power efficient TCAM architecture for packet classifiers,” IEEE Trans. Computers, vol. 64, no. 4, pp. 1104–1118, 2015.
[23] F. Zane, G. Narlikar, and A. Basu, “CoolCAMs: Power-efficient TCAMs for forwarding engines,” in Proc. 22nd Annu. Joint Conf. IEEE Comput. Commun., pp. 42–52, 2003.
[24] K. Lakshminarayan, A. Rangarajan, and S. Venkatachary, “Algorithms for advanced packet classification with ternary CAMs,” in Proc. Conf. Appl., Technol., Archit., Protocols Comput. Commun., pp. 193–204, 2005.
[25] Chang, Yeim-Kuan “An Improved Stride-based Encoding Scheme for Packet Classification with Update” in NCKU CIAL thesis,2015
[26] Y. Xin, W. Li, G. Tang, T. Yang, X. Hu and Y. Wang, "FPGA-Based Updatable Packet Classification Using TSS-Combined Bit-Selecting Tree," in IEEE/ACM Transactions on Networking, 2022
校內:2027-12-31公開