簡易檢索 / 詳目顯示

研究生: 郭芳辰
Kuo, Fang-Chen
論文名稱: 在網路處理器上加速封包分類速度的技術與快取設計
Cache Design and Techniques for Improving Packet Classification Performance on Network Processor
指導教授: 張燕光
Chang, Yeim-Kuan
學位類別: 博士
Doctor
系所名稱: 電機資訊學院 - 資訊工程學系
Department of Computer Science and Information Engineering
論文出版年: 2013
畢業學年度: 101
語文別: 英文
論文頁數: 105
中文關鍵詞: 封包分類快取延遲處理網路處理器
外文關鍵詞: Packet Classification, Cache, Delay Processing, Network Processor
相關次數: 點閱:78下載:2
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 為了能夠針對不同封包作差異性的處理,路由器藉由執行一個稱之為”封包分類”的動作,搜尋既定的規則表來尋找最符合的規則,將收到的封包分成不同的種類。
    網路處理器是專門用來處理網路封包的處理器種類,在這篇論文中,我們首先將會檢視現有封包分類演算法,討論這些演算法的特色,並著重於實作在網路處理器上的可行性與效能分析。
    在上述過程中,我們發覺大部分的封包分類演算法皆無法達到我們使用的網路處理器 IXP2400 和 IXP2800 的最高處理速度。有鑑於此,我們提出兩種不同概念的增強機制來加速封包分類的動作。
    第一種快取機制嘗試減少快取失誤時的代價。我們在每個快取項目上多增加了數個稱作線索 (Hint) 的小欄位,當快取失誤時,透過線索欄位,我們能夠利用原本完全無法使用的快取項目,還是有機會能夠減少需要搜尋的動作量 (對記憶體的存取次數)。
    第二種機制嘗試減少瞬間的大量封包流量對系統效能的影響。在上述情況中,縱使有快取的機制在運作,但正是由於短時間內相同類型的封包太多,在快取項目尚未更新的情況下,所有網路處理器的執行緒皆嘗試更新同一快取項目,而造成重複且多餘的記憶體存取動作。我們提出透過延遲處理同一類型的封包來避免不同執行緒的同一快取項目更新動作來提升封包處理速度。

    Packet classification is the major operation for routers to classify incoming packets to different flows. In this thesis, we implement some notable hierarchical or decision-tree-based packet classification algorithms such as extended grid of tries (EGT), hierarchical intelligent cuttings (HiCuts), HyperCuts, and hierarchical binary search (HBS) on an IXP2400 and IXP2800 network processor. By using all of the available processing microengines (MEs), we find that none of these existing packet classification algorithms achieve the line speed of OC-48 (OC-192) provided by IXP2400 (IXP2800). To improve the search speed of these packet classification algorithms, we propose the use of software cache designs to take advantage of the temporal locality of the packets because IXP network processors have no built-in caches for fast path processing in MEs. Two different approaches are proposed.
    Firstly, we propose hint-based cache designs to reduce the search duration of the packet classification data structure when cache misses occur. Both the header and prefix caches are studied. Although the proposed cache schemes are designed for all the dimension-by-dimension packet classification schemes, they are, nonetheless, the most suitable for HBS. Our performance simulations show that the HBS enhanced with the proposed cache schemes performs the best in terms of classification speed and number of memory accesses when the memory requirement is in the same range as those of HiCuts and HyperCuts. Based on the experiments with all the high and low locality packet traces, five MEs are sufficient for the proposed rule cache with hints to achieve the line speed of OC-48 provided by IXP2400.
    The second approach Delay Processing mechanism is proposed to solve the performance degradation during bursty traffic. The approach DP mechanism utilizes the property of multi-thread network processors. Pending table delays the subsequent threads from doing the same computations which are being processed by the former thread. By applying DP mechanism, any packet processing tasks with high locality characteristic, can avoid the duplicate computations and hence achieve a higher packet processing rate. The experimental results show that DP mechanism can further improve the achievable throughput for the network processor we used.

    Chapter 1 Introduction 11 Chapter 2 Related Work 15 2.1 Classical Packet Classification 15 2.2 Packet Classification on Network Processor 16 2.2.1 Decomposition approaches 17 2.2.2 Decision Tree approaches 18 2.3 Cache for Packet Classification 20 Chapter 3 Baseline Packet Classification - HBS 22 3.1 Binary Prefix Search 22 3.2 Binary Range Search 23 3.3 Hierarchical Binary Search for Packet Classification 24 Chapter 4 IXP2XXX Network Processor Platform 28 4.1 IXP2XXX Hardware Component 28 4.1.1 IXP2400 and IXP2800 Specific Property 29 4.2 Programming Techniques 29 4.2.1 NP Independent Techniques 29 4.2.2 NP Memory Dependent Techniques 30 4.2.3 NP instruction dependent techniques 31 Chapter 5 Implementation Issues for HBS 33 5.1 Detail Search Function of HBS 33 5.2 Modified Version of HBS Searching 33 5.2.1 The Basic Design of HBS (HBS-V1) 34 5.2.2 The Second Design of HBS (HBS-V2) 35 5.2.3 The Third Design of HBS (HBS-V3) 35 5.2.4 The Fourth Design of HBS (HBS-V4) 38 5.2.5 The Fifth Design of HBS (HBS-V5) 40 5.3 Version for Evaluation 40 5.4 Resource Allocation 40 5.5 Data Structure Design of HBS 42 Chapter 6 Proposed Hint-Cache Schemes 44 6.1 Header Cache (H-Cache) 44 6.2 HeaderCache with hints (H-cache-hint) 44 6.3 Rule Cache 46 6.4 Rule Cache with Hints (R-cache-hint and MR-cache-hint) 50 6.5 Data Structure for Proposed Hint-Caches 50 Chapter 7 Proposed Delay Processing Mechanism 52 7.1 Motivation 52 7.2 More Details 56 7.3 Delay Processing Design 58 7.3.1 Delay Processing: First Design 60 7.3.2 Delay Processing: Second Design 62 7.3.3 Delay Processing: Third Design 65 7.4 Pending Table Design 67 7.4.1 Pending Table: First Design 68 7.4.2 Pending Table: Second Design 68 7.4.3 Pending Table: Third Design 68 7.5 Mutual Exclusion Accessing for Pending Table 70 Chapter 8 Performance Evaluation 73 8.1 Performance Evaluation Setting 73 8.2 Baseline Scheme Evaluation 75 8.3 Packet Classification Schemes without Cache 78 8.4 Packet Classification Schemes with Hint-Cache 79 8.5 Packet Classification Schemes with DP Mechanism 86 8.6 Evaluation Setting on IXP2800 88 8.7 Packet Classification Schemes on IXP2800 91 8.8 Packet Classification Schemes with the Proposed Mechanism on IXP2800 93 8.9 Performance for Out-of-order processing on IXP2800 95 Chapter 9 Conclusion and Future Work 98 References 101 Publication Lists 105

    [1] F. Baboescu, S. Singh, and G. Varghese, "Packet Classification for Core Routers: Is there an alternative to CAMs?" Proc. IEEE INFOCOM 2003, vol. 1, pp. 53–63, 2003.
    [2] F. Chang, W.-c. Feng, W.-chi Feng, K. Li, "Efficient Packet Classification of Digest Caches", Proc. the Third Workshop on Network Processors & Applications (NP3), 2004.
    [3] F. Chang, K. Li, and, W.-C. Feng "Approximate Caches for Packet Classification," Proc. IEEE INFOCOM 2004, vol. 4, pp. 2196–2207, 7–11 Mar 2004.
    [4] Y.-K. Chang, "Fast Binary and Multiway Prefix Searches for Packet Forwarding," Computer Networks, vol. 51, issue 3, pp. 588–605, February 2007.
    [5] Y.-K. Chang, "Efficient Multidimensional Packet Classification with Fast Updates," IEEE Transactions on Computers, vol. 58, no. 4, pp. 463–479, APRIL 2009.
    [6] Y.-K. Chang and Y.-C. Lin, "Dynamic Segment Trees for Ranges and Prefixes," IEEE Transactions on Computers, vol. 56, no. 6, pp. 769–784, June 2007.
    [7] Yeim-Kuan Chang, Yung-Chieh Lin, and Fang-Chen Kuo, “Scalable Packet Classification on Network Processor”, Technical Report.
    [8] Q. Dong, S. Banerjee, J. Wang, and D. Agrawal, "Wire Speed Packet Classification without TCAMs: a Few More Registers (and a Bit of Logic) Are Enough," ACM SIGMETRICS Performance Evaluation Review, vol. 35, issue 1, pp. 253–264, June 2007.
    [9] A. Feldmann and S. Muthukrishnan, "Tradeoffs for Packet Classification," Proc. IEEE INFOCOM 2000, vol. 3, pp. 1193–1202, 26–30 March 2000.
    [10] S. Giordano, G. Procissi, F. Rossi, and F. Vitucci, "Design of a Multi-Dimensional Packet Classifier for Network Processors," Proc. IEEE ICC 2006, vol. 2, pp. 503–508, 2006.
    [11] K. Gopalan and T. Chiueh, "Improving route lookup performance using network processor cache", Proc. the 2002 ACM/IEEE conference on Supercomputing, pages 1–10, 2002.
    [12] P. Gupta and N. McKeown, "Packet Classification on Multiple Fields," ACM SIGCOMM Computer Communication Review, vol. 29, issue 4, pp. 147–160, Oct. 1999.
    [13] P. Gupta and N. McKeown, "Classifying Packets with Hierarchical Intelligent Cuttings," IEEE Micro, vol. 20, no. 1, pp. 34-41, Jan –Feb. 2000.
    [14] P. Gupta and N. Mckeown, "Algorithms for Packet Classification," IEEE Network, vol. 15, no. 2, pp. 24–32, 2001.
    [15] Zhuo Huang, Gang Liu, Jih-Kwon Peir, "Greedy Prefix Cache for IP Routing Lookups," Proc. the 2009 10th International Symposium on Pervasive Systems, Algorithms, and Networks, pages 92–97, 2009.
    [16] Intel Corporation, "Intel® IXP2400 Network Processor Hardware Reference Manual," Nov. 2003.
    [17] Intel Corporation, “Intel® IXP2400/IXP2800 Network Processors Development Tools User's Guide”, Mar. 2004.
    [18] Intel Corporation, "Intel® IXP2400/IXP2800 Network Processors Microengine C Language Support Reference Manual," Nov. 2003.
    [19] E. J. Johnson and A. R. Kunze, “IXP2400/2800 Programming: The Complete Microengine Coding Guide”, Intel Press, 2003.
    [20] T. V. Lakshman, and D. Stiliadis, "High-Speed Policy-based Packet Forwarding using Efficient Multi-dimensional Range Matching," ACM SIGCOMM Computer Communication Review, vol. 28, no. 4, pp. 203–214, October 1998.
    [21] B. Lampson, V. Srinivasan, and G. Varghese, "IP Lookups using Multiway and Multicolumn Search," IEEE/ACM Transactions on Networking, vol. 7, no. 3, pp. 324–334, June 1999.
    [22] G. Liao, H. Yu, and L. Bhuyan, "A New IP Lookup Cache for High Performance IP Routers", Proc. the 47th Design Automation Conference (DAC '10), pp.338–343, 2010.
    [23] H. Lim, N. Lee, G. Jin, J. Lee, Y. Choi, and C. Yim, "Boundary Cutting for Packet Classification," IEEE/ACM Transactions on Networking, to be published.
    [24] D. Liu, Z. Chen, B. Hua, N. Yu, X. Tang, "High-performance Packet Classification Algorithm for Multithreaded IXP Network Processor," ACM Transactions on Embedded Computing Systems, vol. 7, no. 2, February 2008.
    [25] Wei-En Lo, “Fast Binary Search Packet Classification based on Decision Trees”, Master Thesis, CSIE, NCKU, 2008.
    [26] Network Flow Processor NFP-3240 and NFP-6XXX, http://www.netronome.com.
    [27] W. Pak and Y. Choi, "High-Performance Packet Classification for Network-Device Platforms," IEEE Communications Letters, to be published.
    [28] Yaxuan Qi, Bo Xu, Fei He, Baohua Yang, Jianming Yu, and Jun Li, “Towards high-performance flow-level packet processing on multi-core network processors”, Proc. of ANCS 2007, pp. 17-26, 2007.
    [29] Y. Qi, L. Xu, B. Yang, Y. Xue and J., Li, "Packet Classification Algorithms: From Theory to Practice," Proc. IEEE INFOCOM, pp.648–656, 2009.
    [30] RadiSys Corporation, "ENP-2611 Hardware Reference”, August 2003.
    [31] RadiSys Corporation, "ENP Software Development Kit Programmer's Guide”, Apr. 2004.
    [32] S. Singh, F. Baboescu, G. Varghese, and J. Wang, "Packet Classification Using Multidimensional Cutting," Proc. ACM SIGCOMM, pp. 25–29, Aug. 2003.
    [33] D. Srinivasan and W.-C. Feng, "Performance Analysis of Multi-dimensional Packet Classification on Programmable Network Processors," Computer Communications, vol. 28, issue 15, pp. 1752–1760, 15 Sep. 2005.
    [34] V. Srinivasan, S. Suri, and G. Varghese, "Packet Classification using Tuple Space Search," Proc. ACM SIGCOMM, pp. 135–146, 1999.
    [35] V. Srinivasan, G. Varghese, S. Suri, and M. Waldvogel, "Fast and Scalable Layer Four Switching," ACM SIGCOMM Computer Communication Review, vol. 28, no. 4, pp.191–202, Oct. 1998.
    [36] C.-F. Su, "High-Speed Packet Classification Using Segment Tree," Proc. IEEE GLOBECOM, vol. 1, pp. 582–586, 2000.
    [37] Z. X. Tan, C. Lin, H. Yin et al., “Optimization and benchmark of cryptographic algorithms on network processors,” IEEE Micro, vol. 24, no. 5, pp. 55-69, Sep-Oct, 2004.
    [38] D. E. Taylor, "Survey and Taxonomy of Packet Classification Techniques," ACM Computing Surveys, vol. 37, no. 3, pp. 238–275, Sep. 2005.
    [39] D. E. Taylor and J. S. Turner, "ClassBench: A Packet Classification Benchmark," IEEE/ACM Transactions on Networking, vol. 15, no. 3, pp. 499–511, June 2007.
    [40] Nian-Feng Tzeng, “Routing Table Partitioning for Speedy Packet Lookups in Scalable Routers”, IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 5, pp. 481–494, May 2006.
    [41] B. Vamanan, G. Voskuilen, and T. N. Vijaykumar, “EffiCuts: Optimizing packet classification for memory and throughput,” Proc. ACM SIGCOMM, 2010, pp. 207–218.
    [42] B. Xu, D. Jiang, and J. Li, "HSM: a Fast Packet Classification Algorithm," Proc. IEEE AINA 2005, vol. 1, pp. 987–992, 2005.
    [43] J. Xu, M. Singhal, and J. Degroat, "A Novel Cache Architecture to Support Layer-Four Packet Classification at Memory Access Speeds," Proc. IEEE INFOCOM, pp.1445–1454, 2000.
    [44] B. Yang, J. Fong, W. Jiang, Y. Xue, J. Li, "Practical Multi-tuple Packet Classification using Dynamic Discrete Bit Selection," IEEE Transactions on Computers, to be published.
    [45] Baohua Yang, Yaxuan Qi, Fei He, Yibo Xue, and Jun Li, "Discrete Bit Selection: Towards a Bit-Level Heuristic Framework for Multi-Dimensional Packet Classification," IEEE INFOCOM Workshops 2009, vol., no., pp.1,2, 19-25 April 2009.

    下載圖示 校內:2018-08-27公開
    校外:2018-08-27公開
    QR CODE