| 研究生: |
游家瑋 You, Jia-Wei |
|---|---|
| 論文名稱: |
減少完整比對成本之Wu-Manber字串比對演算法 Reducing the Overhead of Full Matching in Wu-Manber based Pattern Matching Algorithm |
| 指導教授: |
張燕光
Chang, Yeim-Kuan |
| 學位類別: |
碩士 Master |
| 系所名稱: |
電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| 論文出版年: | 2013 |
| 畢業學年度: | 101 |
| 語文別: | 英文 |
| 論文頁數: | 59 |
| 中文關鍵詞: | 網路入侵偵測系統 、字串比對 、Wu-Manber |
| 外文關鍵詞: | Network Intrusion detection system, Pattern Matching, Wu-Manber |
| 相關次數: | 點閱:104 下載:1 |
| 分享至: |
| 查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
隨著電腦與網路發展的進步,網路已經成為人們生活中的一部分。隨著網路的蓬勃發展,大量的病毒以及有害的攻擊在網路上不斷的傳播。網路入侵偵測系統被用來做檢查封包的標頭與酬載以保護電腦不受有害的攻擊。字串比對在網路入侵偵測系統中是一個很重要的組成部分。一些有名的字串比對演算法,像是:AC、NFA、DFA都有很高的搜尋效率,但是記憶體使用量很大。但是Wu-Manber演算法所需要的記憶體較少,所以是這篇論文的焦點。Wu-Manber是兩段式的字串比對演算法。第一階段是過濾階段,使用前處理所得到的資訊來跳過一些字元來減少不必要的比對。第二階段是完整比對階段。把輸入資料跟病毒碼做檢查,查看是否有有害的病毒碼在其中。Wu-Manber在平均的情況下有很高的效率而且記憶體使用量少,可以儲存在晶片上的記憶體。
在這篇論文,我們為Wu-Manber演算法提出一個方案為減少完整比對成本(FOR),使用額外的位移値來改善,在做完完整比對之後,搜尋視窗只能位移一個字元的缺點。當位移値增加可以減少不必要的比較,並且提高搜尋處理的成效。我們的方法可以跟Backward Hashing演算法合併。在實驗的結果中,使用現實的封包紀錄,與Wu-Manber演算法相比我們可以減少50%的完整比較次數與66%的處理時間。我們也使用OpenMP來實作多執行緒的環境。相比於單執行緒,多執行緒減少了60%處理時間。
With the advance of computer and network development, Internet becomes an integral part of people’s lives. As the evolution of Internet goes faster and faster, more and more malicious attacks and viruses spread over the Internet every day. Network intrusion detection system (NIDS) is used to inspect the packet header and payload, and protect the computers by preventing the malicious attacks. Pattern matching is an important component of NIDS. Some of famous pattern matching algorithms, such as AC, NFA and DFA which have high performance of searching process but the memory usage is large. However, the Wu-Manber algorithm needs much less memory and thus is the focus of this thesis. Wu-Manber algorithm is a two-phase pattern matching algorithm. The first phase is the filter stage that uses the pre-computed information to skip some characters in the input text for reducing the unnecessary full matches. The second phase is the full match stage. It compares the input text with the patterns to inspect whether there is a true match or not. Wu-Manber has high performance in average case and the memory usage is small which can be stored in on-chip memory.
In this thesis, we propose a scheme called Full-matching Overhead Reduction (FOR) for Wu-Manber algorithm which uses the extra shift values to improve the drawback that where only one character is shifted in the searching window after the full match is performed. When the shift values are increased, we can avoid more unnecessary full matches and improve the performance of searching process. Our approach can be combined with Backward Hashing algorithm. In the experimental results, we can reduce 50% of full matches and reduce 66% of processing time compared with Wu-Manber algorithm with real life packet traces. We also use the OpenMP to implement our program in multi-thread environment. It reduces about 60% of processing time compared with using single-thread.
[1] V. Aho and M. J. Corasick, “Efficient string matching: An aid to bibliography search” Communications of the ACM, vol. 18, no.6, pp.333-340, 1975.
[2] M. Becchi, M. Franklin and P. Crowley, “A workload for evaluating deep packet inspection architectures” IEEE International Symposium on Workload Characterization, pp.79-89, 2008.
[3] B. H. Bloom, “Space/Time Trade-offs in Hash Coding with Allowable Errors” Communications of the ACM, vol.13, no.7, pp.422-426, July 1970.
[4] R. S. Boyer and J. S. Moore, “A fast string searching algorithm” Communications of the ACM, vol. 20, no 10, pp.762-772, Oct. 1977.
[5] Bro: The Bro Network Security Monitor, www.bro-ids.org.
[6] Clam Anti-Virus signature database, www.clamav.net.
[7] L. Fan, P. Cao, J. Almeida, and A. Z. Broder, “Summary Cache: A scalable Wide-Area Web Cache Sharing Protocol” IEEE/ACM Transactions on Networking, vol.8, no.3, pp.281-293, June 2000.
[8] R. Nigel Horspool, “Practical fast searching in strings.” Software–Practice & Experience 10–6 (1980) pp.501-506.
[9] N. F. Huang and W. Y. Tsai, “SHOCK: A Worst-Case Ensured Sub-Linear Time Pattern Matching Algorithm for Inline Anti-Virus Scanning” IEEE International Conference on Communications (ICC), pp.1-5, 2010
[10] Andrew Hume & Daniel Sunday, “Fast string searching.” Software–Practice &Experience 21-11 (1991) pp.1221-1248.
[11] E. Knuth, J. H. M. Jr., and V. R. Pratt, “Fast pattern matching in strings,” SIAM Journal on Computing, vol. 6, no. 2, pp. 323–350, June 1977.
[12] T. H. Lee and N. L. Huang, “A Pattern-Matching Scheme With High Throughput Performance and Low Memory Requirement” IEEE/ACM Transactions on Networking (Volume: PP , Issue: 99 ), 2012
[13] P. C. Lin, Y. D. Lin, and Y. C. Lai, “A Hybrid Algorithm of Backward Hashing and Automaton Tracking for Virus Scanning,” IEEE Transactions on Computers, vol. 60, no. 4, April 2011.
[14] C. R. Meiners, J. Patel, E. Norige, E. Torng and A. X. Liu, “Fast Regular Expression Matching Using Small TCAMs For Network Intrusion Detection and Prevention Systems,” in Proceedings of the 19th USENIX conference on Security, 2010.
[15] OpenMP Application Program Interface Version 3.1 July 2011, openmp.org.
[16] Snort: The Open Source Network Intrusion Detection System, www.snort.org.
[17] The Shmoo Group: the Capture the Flag Data. http://cctf.shmoo.com/.
[18] Sun Wu, Udi Manber, “A fast algorithm For Multi-Pattern Searching,” Technical Report TR 94-17, University of Arizona at Tuscon, May 1994.
[19] D. H. Yang, X. Ke and C. Yong, “An Improved Wu-Manber Multiple Patterns Matching Algorithm” IEEE International Performance Computing and Communications Conference, pp. 675-680, 2006.
[20] F. Zane, G. Narlikar and A. Basu, “CoolCAMs: Power-Efficient TCAMs for Forwarding Engines.” IEEE International Conference on Computer Communications, vol.1, pp.42-45, March 2003.