簡易檢索 / 詳目顯示

研究生: 余世淇
Yu, Shi-Qu
論文名稱: 使用GPU之改良式Wu-Manber字串比對演算法
Improved Wu-Manber Pattern Matching Algorithm on GPU
指導教授: 張燕光
Chang, Yeim-Kuan
學位類別: 碩士
Master
系所名稱: 電機資訊學院 - 資訊工程學系
Department of Computer Science and Information Engineering
論文出版年: 2013
畢業學年度: 101
語文別: 中文
論文頁數: 62
中文關鍵詞: 網路入侵偵測樣式比對位移基礎樣式演算法Wu-Manber演算法平行環境
外文關鍵詞: Network intrusion detection, shift-based algorithm, Wu-Manber algorithm,, parallel computing environment
相關次數: 點閱:105下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 近年來,由於網際網路的速度日漸具增,伴隨而來的是大量的網路攻擊或病毒在網際網路間傳播。網路偵測入侵系統是一套用來監控我們使用的網路是否含有有害的病毒或者是包含惡意攻擊的封包。為了能夠即時的找出這些有害的攻擊,我們需要一個有效率的樣是比對演算法去掃描出這些攻擊。我們藉由事先定義好的病毒碼或惡意攻擊的內容建立網路入侵系統去比對網路封包內容,查看其是否包含對我們系統有害的資料。以位移作為基礎的比對演算法,像是KMP, Boyer-Moore以及Wu-Manber演算法是網路入侵偵測系統的一種比對演算法。其特色為不需要建立大量的資料結構就能比對出有害的資料,但當我們的病毒碼過多時則我們需要耗費較多的時間去完整比對這些資料。
    在這篇論文中,我們將以Wu-Manber多重字串比對演算法來作為我們的基礎,修改原始Wu-Manber演算法的資料結構以平行運算的方式來執行.在新的架構中,我們以大量的執行緒來處理輸入字串(input string),去除了原始Wu-Manber中的位移的機制來節省比對的時間與期占用的記憶體空間。我們可以達到3.5GB/s的輸出,而新的架構比原始Wu-Manber的資料結構減少20%的記憶體使用量。此外,在某些情況下,新的架構的處理時間可以比以自動機(automata-based)類的演算法來的少。

    Since the traffic of Internet grows rapidly in the recent years, many viruses and malicious attacks have spread continuously over the Internet. Network intrusion detection system is needed to monitor network or system activities for malicious activities or pernicious payload of Internet packet. In order to quickly find out these malicious attacks, we need one efficient pattern matching algorithm to scan them. The pattern matching algorithms match the payloads of Internet packets against the pre-defined virus signatures and malicious attack contents (patterns). Shift based algorithms like KMP, Boyer-Moore or Wu-Manber pattern matching algorithm can perform the match process quickly with only a small amount of memory. However, one drawback is that the time taken to perform match process is too long when the pre-defined pattern set is very large.
    In this thesis, we will propose a new scheme that is based on Wu-Manber algorithm and modify its data structure to fit parallel computing environment. In our proposed scheme, numerous threads are used to process the input string by simplifying the original shift table in Wu-Manber structure such that the processing time and the memory usage can be reduced. Our proposed scheme can achieve a throughput of 27 Gbps and reduce 20% of the memory used for Wu-Manber algorithm that runs on CPU environment. In addition, our performance is better than PFAC which is an implementation of DFA in GPU environment.

    Table of Contents Abstract iv Table of Contents vi List of Tables viii List of Figures ix Chapter 1 Introduction 1 Chapter 2 Related Work 3 2.1 Automation-Based Algorithms 3 2.1.1 Knuth-Morris-Pratt Algorithm 3 2.1.2 Aho-Corasick Algorithm 6 2.2 Shift-Based Algorithms 9 2.2.1 Boyer-Moore Algortithm 9 2.2.2 Wu-Manber Algorithm 12 2.3 Algorithm of other Mechanism 14 2.3.1 Bloom Filter Based Algorithms 14 2.3.2 TCAM-Based Algorithms 16 Chapter 3 CUDA 19 3.1 CUDA programming model 19 3.2 CUDA hardware model 24 Chapter 4 Proposed scheme 27 4.1 Motivation 27 4.2 Proposed scheme 28 4.3 Optimization 33 4.3.1 Binary search on prefix in Full-match procedure 33 4.3.2 Remove read characters 35 4.3.3 Filter more patterns using multiple shift tables 36 4.4 CUDA implementation 38 4.4.1 CUDA Pseudo Code 38 Chapter 5 Experimental results 42 5.1 Simulation Environment 42 5.2 Pattern Set and Input Trace 44 5.3 Dividing Pattern Set by Binary Search into Group 45 5.4 Simulation Results 47 Chapter 6 Conclusion 60 References 61

    [1] CUDA http://www.nvidia.com/object/cuda_home_new.html
    [2] Defcon : http://www.defcon.org/
    [3] Fermi : http://www.nvidia.com/object/fermi-architecture.html
    [4] Snort : http://www.snort.org
    [5] Tesla 2050 :http://www.nvidia.com.tw/object/product_tesla_C2050_C2070_tw.html
    [6] V. Aho and M. J. Corasick, “Efficient string matching: An aid to bibliography search,” Communications of the ACM, vol. 18, no.6, pp.333-340, 1975.
    [7] B. H. Bloom, “Space/Time Trade-offs in Hash Coding with Allowable Errors.” Communications of the ACM, vol.13, no.7, pp.422-426, July 1970.
    [8] R. S. Boyer and J. S. Moore, “A fast string searching algorithm,” Communications of the ACM, vol. 20, no 10, pp.762-772, Oct. 1977.
    [9] L. Fan, P. Cao, J. Almeida, and A. Z. Broder, “Summary Cache: A scalable Wide-Area Web Cache Sharing Protocol,” IEEE/ACM Transactions on Networking, vol.8, no.3, pp.281-293, June 2000.
    [10] R. Nigel Horspool, “Practical fast searching in strings.” Software–Practice & Experience 10–6 (1980) pp.501-506.
    [11] Andrew Hume and Daniel Sunday, “Fast string searching.” Software–Practice &Experience 21-11 (1991) pp.1221-1248.
    [12] E. Knuth, J. H. M. Jr., and V. R. Pratt, “Fast pattern matching in strings,” SIAM Journal on Computing, vol. 6, no. 2, pp. 323–350, June 1977.
    [13] P. C. Lin, Y. D. Lin, and Y. C. Lai, “A Hybrid Algorithm of Backward Hashing and Automaton Tracking for Virus Scanning,” IEEE Transactions on Computers, vol. 60, no. 4, April 2011.
    [14] C. R. Meiners, J. Patel, E. Norige, E. Torng and A. X. Liu, “Fast Regular Expression Matching Using Small TCAMs For Network Intrusion Detection and Prevention Systems,” in Proceedings of the 19th USENIX conference on Security, 2010.
    [15] Sun Wu and Udi Manber, “A fast algorithm For Multi-Pattern Searching,” Technical Report TR 94-17, University of Arizona at Tuscon, May 1994
    [16] D. H. Yang, X. Ke and C. Yong, “An Improved Wu-Manber Multiple Patterns Matching Algorithm” IEEE International Performance Computing and Communications Conference, pp. 675-680, 2006.
    [17] F. Yu, R. H. Katz, and T. V. Lakshman, “Gigabit Rate Packet Pattern-Matching Using TCAM,” in Proceeding of 12th IEEE International Conference on Network Protocols (ICNP’04), pp. 174-183, 2004.
    [18] F. Zane, G. Narlikar and A. Basu, “CoolCAMs: Power-Efficient TCAMs for Forwarding Engines.” IEEE International Conference on Computer Communications, vol.1, pp.42-45, March 2003.
    [19] C.H.Lin, S.Y.Tsai, C.H.Liu, S.C.Chan and J.M.Shyu. “Accelerating String Matching Using Multi-threaded Algorithm on GPU.” Global Telecommunications Conference (GLOBECOM 2010), 2010 IEEE, pp.1-5

    無法下載圖示 校內:2018-09-06公開
    校外:不公開
    電子論文尚未授權公開,紙本請查館藏目錄
    QR CODE